I am happy with the internal OCR of DEVONthink 3.
What I am trying to do is to find the most likely correct document date in the content and use it to rename the file.
I tried smart rules in DEVONthink and in Hazel, which turned out to be OK but caused hiccups every now and then.
Now I would like to do something like this:
Scan the top right quarter of the document’s first page and extract the text, then let DEVONthink or Hazel identify the date(s) it contains.
Since FineReader Pro is able to use saved area templates on imported documents, this should somehow be possible. It can also extract the content from the selected area without overwriting the OCR previously performed by DEVONthink.
There are no functions in DEVONthink’s commands to scan a particular section of a page.
Also, what you are looking at is not a representation of the underlying text, i.e., what you think of as being in a corner of the document is not “in the corner” of the underlying text.
I am referring to the script of @Silverstone and therefore FineReader which as I mentioned has the ability to only scan certain areas. These areas can also be saved as “layouts” and therefore might be utilised in a script.
Actually, I realised that this is not even necessary. If the file is exported as a text file in FineReader, the first date in the scanned letter is also the first date in the text (which is not the case in the PDF, because the different areas are stored in an inconsistent order).
How can I rename the PDF version of the scanned letter using the first date found in the txt version?
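One way to approach this outside of DEVONthink is a small script that reads the txt export, takes the first date it finds, and prefixes the PDF name with it. The following is only a minimal Python sketch under assumptions: the function names (first_date, dated_name, rename_pdf) are made up for illustration, and the pattern only covers dates written like 2021-03-14, 14.03.2021 or 14/03/2021 (day first), so it would need adjusting for other formats.

```python
import re
from datetime import date
from pathlib import Path

# Assumed formats: 2021-03-14, 14.03.2021, 14/03/2021 (day before month).
DATE_RE = re.compile(
    r"\b(?:(\d{4})-(\d{1,2})-(\d{1,2})|(\d{1,2})[./](\d{1,2})[./](\d{4}))\b"
)

def first_date(text):
    """Return the first valid calendar date found in text, or None."""
    for m in DATE_RE.finditer(text):
        if m.group(1):
            y, mo, d = m.group(1, 2, 3)
        else:
            d, mo, y = m.group(4, 5, 6)
        try:
            return date(int(y), int(mo), int(d))
        except ValueError:
            continue  # looks like a date but is not one, e.g. 31.02.2021
    return None

def dated_name(pdf_path, txt_path):
    """Return the PDF path with the first date of the txt file as a prefix."""
    pdf_path = Path(pdf_path)
    text = Path(txt_path).read_text(encoding="utf-8", errors="replace")
    found = first_date(text)
    if found is None:
        return pdf_path  # no date found: keep the name unchanged
    return pdf_path.with_name(f"{found.isoformat()} {pdf_path.name}")

def rename_pdf(pdf_path, txt_path):
    """Rename the PDF unless no date was found or the target already exists."""
    pdf_path = Path(pdf_path)
    target = dated_name(pdf_path, txt_path)
    if target != pdf_path and not target.exists():
        pdf_path.rename(target)
    return target
```

For example, rename_pdf("letter.pdf", "letter.txt") would rename the file to "2021-03-14 letter.pdf" if the first date in letter.txt is 14.03.2021. A script like this could be run from a Hazel or Automator shell script action, but that wiring is not shown here.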
Hazel, for example, can be told to identify the first date, whereas DEVONthink is able to identify the oldest, newest and document date (I have no idea how it does this, and it is not perfect either).
With this rule you can use the identified date to rename the PDF:
To stay on topic and once more refer to the script of @Silverstone, is it possible to make the script export a pdf and a txt?
For now I do this with an Automator folder action.
You mean exporting a TXT version of the file along with the PDF? Yes, it is possible. You’ll need to add one more export command (export to txt, with different parameters) and change the output path outPath to point to a txt file.
I’m a beginner and just copied Silverstone’s script, followed the instructions, then applied the rule and ran it in DT.
Now I am getting this message from DEVONthink: Finder got an error: Handler can’t handle objects of this class.
Everything works well up to that point: I see FineReader open, recognise the document and save it, and then I get the message.
Hi, I had the same error. Make sure temp path is correct
i.e. in the line
set outPath to "/Users/ilya/Documents/00_Temp/" & theName
“ilya” should be replaced with your username. Hope this helps.
I also have a quick question,
The script fails to move the old file into the trash with these lines:
move record theRecord to (trash group of database of theRecord)
set state of thePDF to true
I tried the original delete command as well, but it still fails to delete the original file with the filename
“file.pdf_old.pdf”. Any ideas where I am going wrong?
Well, you can try one last thing: make a temp folder on the desktop, copy its full path, and use this new folder path in place of the one in the line
set outPath to "/Users/ilya/Documents/00_Temp/" & theName
to
set outPath to "replace full path of the folder you created" & theName
It is hard to say where exactly the error is without seeing it. You may want to localize the error by turning off some lines of code (type “--”, two dashes with no space between them, at the start of the line). For example, try turning off the line tell application "Finder" to delete outPath as POSIX file and see if the error persists.
What else to try:
You need to allow DEVONthink access under Full Disk Access and Automation in System Preferences > Security & Privacy > Privacy.
@lande80 had the same problem above and solved it somehow. I am not sure what he did exactly, so you may want to ask him.
Be sure the Temp folder you entered exists. The path must end with “/” (in the code, but not in the name of the folder).
Two general things to know:
If you modify the Smart Rule script, don’t forget to reload DT after you have saved the script, otherwise the changes may not apply.
To avoid naming problems, it is better to untick the option “Show filename extensions” in Preferences - General/Appearance.
It looks like the error is somewhere in this try block. In that case the script skips all remaining commands in the try block after the error, including delete or move. Try adding an on error handler to capture it: replace the end try of this block (right after delete or move) with, e.g.:
on error error_message number error_number
if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
end try
It will show a dialog with a description of the error that was caught, so you can work out what to do about it.