OCR & Hazel Workflow on Finder before importing into DT

You can perform the same OCR actions from a script that are available within DT 3, for example:
Annotate Document - places the OCR’d text in the Annotation field of a record
Comment Document - places the OCR’d text in the Comment field of a record
Word Document - unsurprisingly, generates a Word document

Thanks for clarifying. I actually never saw the “Word” option in the convert menu, because I do not touch MS Office with a ten-foot pole. Therefore, I assumed that the option to OCR to a Word file in the scripting dictionary was erroneous.

On annotation/comment: I’m not sure that I fully understand what they’re supposed to do. If I OCR a (for example) TIFF file (which is outside of DT at that moment) to a “PDF document”, I get a PDF document as a new record in my global inbox (not surprisingly). However, what do I get if I OCR an external file to “annotation document”? A new, empty record with only its annotation set to the OCR result?

In DT itself (or with the convert image method), the OCR result would then go to the annotation of the OCR’d record, I guess. But with the ocr method, there is no record yet.

In the first case, the TIFF file will be imported as an image record with its annotation or comment field containing the OCR’d text. With convert, you are correct that it would add the text to the existing record.

Actually, I don’t work on the scripting dictionary. :smiley:


Hello everyone,
I have to be honest: I am lost :slight_smile:

I do not know how to set up a folder action with AppleScript and the OCR engine of DEVONthink. Would anyone be able to help me with this script?

Many many thanks.

Where do you find this?

Script Editor (which is part of macOS): File > Open Dictionary, then select DEVONthink.


Hello, I’d like to bring this up again.

I’d like to OCR my documents before importing them into DEVONthink, so that I can use my Hazel rules to rename them in Finder. Is it possible to use the OCR engine outside of DEVONthink and script it into a Hazel rule?

Yes. Use

const dt = Application('DEVONthink 3');
dt.ocr(...)

for JavaScript. AppleScript is similar, using a tell block.
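
A slightly fuller sketch of the same idea, meant as a starting point rather than a finished Hazel integration. The helper name `ocrFile` and the file name `ocr-file.js` are made up for illustration, and the `file` parameter name is an assumption taken from the `ocr` entry in the scripting dictionary, so check it in Script Editor before relying on it.

```javascript
// Sketch: OCR files through DEVONthink from the command line.
// Run with: osascript -l JavaScript ocr-file.js /path/to/scan.tiff

// Ask DEVONthink to OCR one file and return the POSIX path of the new record.
// `dt` is the application object, passed in so the helper stays easy to test.
// ASSUMPTION: `file` is the parameter name listed for the `ocr` command in
// DEVONthink's scripting dictionary; verify it in Script Editor.
function ocrFile(dt, posixPath) {
  const record = dt.ocr({ file: posixPath });
  if (!record) {
    throw new Error("OCR returned no record for " + posixPath);
  }
  return record.path();
}

// osascript calls run() with the command-line arguments.
function run(argv) {
  const dt = Application("DEVONthink 3");
  return argv.map((p) => ocrFile(dt, p)).join("\n");
}
```

From Hazel, a "Run shell script" action could probably call this with the matched file as `"$1"`. Keep in mind that the OCR result is created as a record inside DEVONthink (the global inbox, as described earlier in the thread), so this alone does not put an OCR’d copy back into the Finder folder.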

I do it this way.

Ah, cool. But this is probably not the whole script block I need to include in Hazel for it to work, right? Could you show me what the whole script block should look like?
Many thanks

You are assuming @chrillek is also using Hazel.


I don’t know. That depends on what you’re doing when and where in Hazel. And as @BLUEFROG hinted at, I’m not using Hazel (anymore) for integration with DT. Instead, I wrote my own script to do everything I’d done with Hazel before.

Ah, I see. I actually thought you were using Hazel. But thanks for the link; I will dive into it on the weekend.

I thought of another way. I index a folder in DEVONthink and move the files that need OCR into it. In DEVONthink I could create a rule that automatically OCRs those files and puts the results into the same folder, from where Hazel could then take over in Finder.
The only thing I still have to work out: how could I get the original non-OCR files out of the indexed folder after they have been OCR’d? Otherwise I would create an endless loop inside this indexed folder. :nerd_face:

How could I get the original non-OCR files out of the indexed folder after they have been OCR’d?

You don’t have to if you use the OCR > Apply action and also make sure DEVONthink’s Preferences > OCR > Original Document: Move to Trash is enabled.

I already tried that, but the original file still remained inside the indexed folder after the smart rule had found and OCR’d the document.
I needed to add a Delete action to the rule after the OCR action.

Post a screen capture of the smart rule.

Here we go.
Using the German version (obviously :grin:)
Screenshot 2023-03-15 at 08.16.27

Just use OCR > Apply. This won’t create a new document and therefore the Delete action isn’t necessary.


Criss reiterated what I said previously. Use OCR > Apply.
