Copying text from PDF

In previous versions of DEVONthink, when I selected a file and hit Command-C, the content of the file was copied to the clipboard. Now (in DT Pro) it copies the filename to the clipboard. I am moving the contents to a searchable MySQL database and there is no longer an easy way to do this. Is there any way to get the previous functionality back? I don’t want to export each file to a text file. Thanks,

David

EDIT: Looks like it’s a Panther vs. Tiger thing. It still works perfectly in Tiger. What happened in Panther?

David:

Apple made changes to Cocoa text and added PDFKit functionality in Tiger.

So DT Pro will behave differently under Panther (10.3.9) and Tiger (10.4.2). Some of the new functionality in DT Pro 1.0.2 needs hooks into OS X 10.4.2.

Hi again,

Actually, it looks like version 1.02 kills that functionality. I hadn’t updated, and when I did on my Tiger box, the content no longer copied out of DEVONthink. Can you please put this functionality back? In the meantime, how can I get the previous version of DEVONthink Pro?

David

I found the version 1.01 disk image on my desktop (whew!). I just want to confirm that reverting to 1.01 allows me to copy content to the clipboard. Thanks,

David

Will this functionality be put back in a future point release? Or should I continue to use 1.01? Thanks!

David

When you select a PDF file in DT Pro 1.02 and hit “Copy”, the text on the clipboard is the file name.

When you select a PDF file in DT Pro 1.01 and hit “Copy”, the text on the clipboard is the text content of the PDF file.

I need the latter behaviour so that I can get the contents into a database for online searching. I am currently forced to continue using DT Pro 1.01.

Can you please look into this issue and see if you can’t change the behaviour in the next release to what it was in 1.01? I am using Mac OS X 10.4.5.

Would you consider creating an Automator action that would get the text from PDF records in DT Pro, much like the existing actions can get the URL or Path?

Thanks,
David

David:

You will be much better off using one of the recommended methods for importing PDF files. Your “copy” method is not reliable, especially on large PDF files.

If you wish to capture text and leave the PDF located external to your database, use File > Index.

For File > Import > Files & Folders import, set DT Pro Preferences > PDF & PS so that the checked item for Index & Convert is “Use PDFKit (Tiger)”, as you are running OS X 10.4.5. Note that you still have the option of leaving the PDF file external to the database, or copying it into your database or the database Files folder.

You will find the resulting PDF documents much easier to read, with the added advantage that you can, using “Launch Path”, open the PDF file for editing, such as annotations. In the upcoming DT Pro 1.1, you will find that file synchronization using the Index import method will let you launch a file, edit it and see the changes incorporated into your DT Pro database. That will also be true for PDF files imported into the database Files folder, when you use Launch Path to open the file under its designated application, make changes, and save the file.

Hi Bill,

I guess I wasn’t clear. I do use the import function to get the PDFs into DT. What was very handy in the past (prior to 1.02) was to just select the file name, and Copy would give me access to all the plain text (OCRed or otherwise) that I could then paste into a “contents” field for searching from a web application (not currently available from DT). In 1.02 there are a couple of additional steps to achieve the same result: I have to select the file name, then click inside the PDF document that is displayed, then “Select All” and then hit Copy. Since my solution is all scripted, the extra steps increase the chance for error quite dramatically. Anyway, I can use 1.01 for this particular script, but was hoping that the current behaviour was an oversight. Thanks for your attention, as always,

David

The current behaviour is actually intended. But as your solution is scripted, you should be able to retrieve the text without any user interaction:


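tell application "DEVONthink Pro"
	-- Records currently selected in the frontmost window
	set theSelection to the selection
	-- Only proceed when exactly one record is selected
	if (count of theSelection) is 1 then
		-- Text content of the record (for a PDF, its text layer)
		set theText to plain text of item 1 of theSelection
	end if
end tell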

Hi Christian,

Thanks for this AppleScript. I added the line:

	set the clipboard to theText

I can paste the contents of the clipboard into BBEdit but I cannot paste them into FileMaker. The option to paste is greyed out. Do you have any idea why an application might not recognize that the clipboard has contents?

Thanks,

David

David,

Difficult to tell. Maybe…

set the clipboard to (theText as string)

…or…

set the clipboard to (theText as unicode text)

…will work.
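
Putting the pieces from this thread together, the whole script might look like the sketch below. It is only a sketch: it assumes exactly one record is selected, it uses the "as string" coercion suggested above (which may or may not cure the FileMaker paste problem), and it moves the clipboard command outside the tell block so that it is not sent to DEVONthink Pro.

```applescript
-- Sketch: put the plain text of the selected DEVONthink Pro record on the clipboard.
-- Assumption: exactly one record is selected; otherwise nothing is copied.
set theText to missing value

tell application "DEVONthink Pro"
	set theSelection to the selection
	if (count of theSelection) is 1 then
		set theText to plain text of item 1 of theSelection
	end if
end tell

-- Coerce to a plain string before copying, as suggested above.
-- Swap in "as Unicode text" if the receiving application still refuses to paste.
if theText is not missing value then
	set the clipboard to (theText as string)
end if
```

Checking for "missing value" keeps the script from overwriting the clipboard when nothing, or more than one record, is selected.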