How to alter settings in DT3 using AppleScript?

Certain documents I import have a background picture on the front page. With either OCR > Deskew or OCR > Page Orientation enabled, DT3 occasionally gets this very wrong. One page was scanned correctly, but instead of deskewing it by the document edges or the baseline of the text, DT3 preferred to ensure that a knife by a plate in the background was perfectly vertical!

These documents have a specific filename format, which I already use in an On Import rule to tag and file them correctly. I’d love to be able to use AppleScript as part of this rule to turn off both these settings before the document is OCRed, and preferably to turn them on again afterwards.

Alternatively, I could allocate a button on the Stream Deck to toggle these settings if it’s scriptable. I don’t really want to do it by making the Settings window appear, and then having it get clicked through and closed again; I’d rather not have random windows flashing on my screen every time I scan these documents.

Is it possible to control DT3’s settings through AppleScript? I can’t find anything related to this in the dictionary.

I can’t even correct this manually, because DT3 only allows rotating pages in steps of 90° — does DT4 allow finer control of rotation, preferably in steps of 0.1°? Exporting the page as an image, rotating it in Photos, re-importing it to the PDF, and re-OCRing is not a viable solution :grin:

The preferences are not directly accessible to scripting. You can try UI scripting, but that is generally not encouraged.

Thanks — it’s possible that the easiest fix would be to manually straighten these pages, if only DT would allow finer-grade rotation of pages?

Is there a way to rotate pages by small amounts without exporting as an image, rotating, re-importing the image, then re-OCRing? (Or paying Adobe’s ridiculous price for Acrobat Pro…)

It is possible with scripting. But that is messy, IIRC. And it might be possible with some PDF tools, though I have no experience with those.

Is the OCR wrong because of the erroneous de-skewing? Or are you just after a visually more pleasing effect?

Nope. Rotating by 90° is easy everywhere. Everything else, not so much.

But that is what every tool will basically do, because it is the simplest approach: convert a PDF page to a raster image, rotate that raster image, and re-import it as a PDF.
The alternative would be to define a transformation matrix and prepend it to the whole page – which is perhaps what Acrobat does, but I doubt that any cheap(er) tool would go down that rabbit hole.
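For reference, a sketch of what that transformation matrix would look like. In the PDF content-stream convention a point is a row vector and the `cm` operator takes the six numbers a b c d e f; the angle θ (counter-clockwise) and the pivot (c_x, c_y), e.g. the page centre, are illustrative symbols rather than anything a particular tool exposes:

```latex
\[
\begin{bmatrix} x' & y' & 1 \end{bmatrix}
=
\begin{bmatrix} x & y & 1 \end{bmatrix}
\begin{bmatrix}
 a & b & 0 \\
 c & d & 0 \\
 e & f & 1
\end{bmatrix},
\qquad
\begin{aligned}
 a &= \cos\theta, & b &= \sin\theta, \\
 c &= -\sin\theta, & d &= \cos\theta, \\
 e &= c_x - c_x\cos\theta + c_y\sin\theta, & f &= c_y - c_x\sin\theta - c_y\cos\theta .
\end{aligned}
\]
```

With e = f = 0 this is a plain rotation about the origin (the lower-left corner of the default page space); the e and f terms only shift the pivot to (c_x, c_y) so the page rotates in place.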

Aside: The crop tool of Affinity Photo 2 allows you to manually rotate a PDF page. This one is considerably cheaper than Adobe’s tools. But the process is probably the same one I just described.

The OCR for that page seems to be correct; it’s just that the page is now about 30° off vertical, which makes it difficult to read on screen.

I want to have OCRed text — but I want to be able to read the text as well :grin:

What is the origin of these documents? Are you scanning them yourself?

If you are, it’s probably easier to fix this earlier in the process. I can recommend ScanTailor Advanced (open source). This build works on my Ventura machine:

https://github.com/yb85/scantailor-advanced-osx

It might be a bit obscure, but the results are impressive. I’ve used it to produce excellent PDFs from a few book scans done with a flatbed scanner.

Here’s a blog post showing how it works:

Or a detailed video guide on YouTube:

Yes, I’m scanning them myself, with a ScanSnap ix1500.

If I set ScanSnap Home to save to disk rather than passing directly to DT3, I can see that the PDFs are not skewed before import — it’s DT3 that is getting confused about how to deskew them on import.

This is what it looks like in Preview before importing to DT3:

but after importing, it looks like this:

…and that’s clearly worse than before. (Others were more skewed; this is the first original card I could find — though it appears I misremembered about it trying to make the cutlery vertical!)

It’s only about five or six cards out of 150, but it’s still a faff to have to pick them out, alter the settings, and rescan.

Binding this AppleScript to a Stream Deck key with Keyboard Maestro allows one-touch disabling of DT3’s processing, anyway (change 1 to 0 in the checkbox tests for the version that re-enables):

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

-- UI scripting via System Events; the calling app needs Accessibility permission
tell application "System Events" to tell process "DEVONthink 3"
	set frontmost to true
	delay 0.4
	-- Open the Settings window (Command-comma) and give it time to appear
	keystroke "," using command down
	delay 0.4
	set thePrefsWindow to front window
	-- Switch to the OCR pane
	tell button "OCR" of toolbar 1 of thePrefsWindow to perform action "AXPress"
	set theDeskewCheckbox to checkbox "Deskew" of thePrefsWindow
	set theOrientationCheckbox to checkbox "Page orientation" of thePrefsWindow
	-- Untick each option only if it is currently ticked (value 1)
	if value of theDeskewCheckbox is 1 then click theDeskewCheckbox
	if value of theOrientationCheckbox is 1 then click theOrientationCheckbox
end tell
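The disable and re-enable variants could probably be folded into one script. The following is an untested sketch that reuses the same UI elements as the script above; the handler name `setOCRPostprocessing` is made up for illustration, and the closing Command-W assumes the Settings window responds to the standard close shortcut:

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

-- Hypothetical handler: pass 0 to untick both options, 1 to tick them
on setOCRPostprocessing(wantedValue)
	tell application "System Events" to tell process "DEVONthink 3"
		set frontmost to true
		delay 0.4
		keystroke "," using command down
		delay 0.4
		set thePrefsWindow to front window
		tell button "OCR" of toolbar 1 of thePrefsWindow to perform action "AXPress"
		repeat with checkboxName in {"Deskew", "Page orientation"}
			set theCheckbox to checkbox (contents of checkboxName) of thePrefsWindow
			-- Click only when the current state differs from the wanted one
			if value of theCheckbox is not wantedValue then click theCheckbox
		end repeat
		delay 0.2
		-- Assumption: Command-W closes the Settings window
		keystroke "w" using command down
	end tell
end setOCRPostprocessing

setOCRPostprocessing(0) -- use 1 in the copy bound to the re-enable key
```

Two Stream Deck keys (or two Keyboard Maestro macros) would then differ only in the argument on the last line.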

If the ScanSnap can do OCR itself, you could turn it on there and turn off the postprocessing, including OCR, completely in DT.
