Smart Rule help: Delete original document after import and OCR (DTP)

Hello all,
I’m using DEVONthink Pro 3 and trying to set up a Smart Rule that:

  1. scans a specific folder on a regular basis
  2. imports all contents to the Global Inbox
  3. applies OCR if the document is an image or PDF, and
  4. deletes the original document from the specified folder.

Is this possible? I’ve worked out the first 3 steps, but how to do step 4 eludes me. A search of the forums didn’t yield any clues. Nor did a search of the Help files.

I’ve attached a screenshot of the current Smart Rule. Any help is greatly appreciated; I’m rather out of my depth with many of DEVONthink’s abilities.

[Screenshot of the current Smart Rule: Screen Shot 2021-05-15 at 5.08.14 pm]

Just to be sure I understand you correctly:

When you say “scans a specific folder on a regular basis” is that folder indexed? So in the specific example you are using, -DEVONthink is an indexed folder? The Move into Database action applies to indexed files and moves the file from the folder in the file system into your database.

I’m a little confused, because the rule you posted doesn’t scan a folder and move all its contents to the inbox (as per your points 1 & 2). If you use the OCR Apply action (as opposed to the OCR to action), the original is replaced by the OCR’d file (and so you wouldn’t need to delete the original), so again, I don’t understand what you are trying to achieve. Please could you explain your steps in more detail?

When you say “scans a specific folder on a regular basis” is that folder indexed? So in the specific example you are using, -DEVONthink is an indexed folder? The Move into Database action applies to indexed files and moves the file from the folder in the file system into your database.

The folder is a temporary holding container for files that need to be imported into DT. So ideally it’s importing everything that gets put in there, not indexing.

I’m a little confused, because the rule you posted doesn’t scan a folder and move all its contents to the inbox (as per your points 1 & 2).

That would probably be part of why I can’t quite make it work, then :man_facepalming: Though the vast majority of things I need to import are either images or PDFs, and I need them OCRed, so that’s the main thrust of this exercise.

If you use the OCR Apply action (as opposed to the OCR to action), the original is replaced by the OCR’d file (and so you wouldn’t need to delete the original), so again, I don’t understand what you are trying to achieve. Please could you explain your steps in more detail?

Ah, I must have misread the help file on that one. Thank you for explaining the difference!

My ideal flow for this Smart Rule is:

  1. Files are added to -DEVONthink folder
  2. DT imports the files to the Global Inbox, deleting the original from -DEVONthink
  3. Apply OCR to PDFs and images that need it

Re. 1: the correct terminology is group if -DEVONthink is a “folder” in a database, and folder if it is in the file system rather than in DT (again, I’m confused; your first post shows it to be a group in DT, but in this post you talk of importing to DT).
Re. 2: if the items are already in a DEVONthink database then they aren’t imported to the global inbox, but simply moved there.

So, we can achieve what you are looking to do with 2 smart rules; both use the on import trigger. The first uses -DEVONthink as its search location, acts on e.g. Kind is any document and uses the “Move to inbox” action. The second (which must be below the first in the list of smart rules) uses (global) inbox as its search location, All of (Word count 0, Any of (Kind is PDF, Kind is Image)) and the OCR Apply action. The first rule must not use the Cancel action as its last action; the second may, depending on whether you want additional rules to act on these files. (Edit: see below; the rules need to be set up differently.)

If -DEVONthink is a folder in the file system from which you need to import, then in addition to the two rules you need to index that folder in DEVONthink. In that case you do also need the Move into Database action in the first rule.

If -DEVONthink is not a folder in the file system but an (unindexed) group in DT, how are the items getting into that group, and why don’t they simply go directly to the inbox?

So, too many words in the last post. In this post I am going to assume you have a folder in your file system called “-DEVONthink” and that you are dropping files in there; you want DT to recognise that there are new files there, import them into DT and deal with them.

If you haven’t already done so, you need to index that folder; to do so, in DT open the location you want the folder to index to from the sidebar and then select “Index Files and Folders…” from the File menu. Select the folder in the file system. Now “-DEVONthink” will appear as a group in DT; it will reflect any changes made to the folder of the same name in the file system.

Now, set up two smart rules (Edit: this is a change from my post above; the second did not trigger, so we need to add OCR to the first rule, and split things between files which you want OCRd and those you don’t want OCRd):

You can actually make the second rule simpler if it is placed below the first in the list of rules; because OCR will have been performed on any which conform to rule 1, Kind is Any Document would be sufficient to move anything else with the second rule.
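
For anyone reading without the screenshots, here is a rough sketch of how I understand the two rules to divide the work. It is a plain Python model of the logic described in this thread, not DEVONthink code: `Item`, `run_rules` and the kind labels are hypothetical names, and the order of the actions inside each rule is my assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for a file in the indexed group (not a DEVONthink API)."""
    name: str
    kind: str            # "pdf", "image", or anything else, e.g. "text"
    word_count: int      # 0 means there is no text layer yet
    location: str = "-DEVONthink"
    actions: list = field(default_factory=list)


def rule_1_matches(item):
    # All of (Word count is 0, Any of (Kind is PDF, Kind is Image))
    return item.word_count == 0 and item.kind in ("pdf", "image")


def rule_2_matches(item):
    # Simplified condition "Kind is Any Document"; relies on rule 1 running first
    return True


def run_rules(item):
    # Both rules search only the indexed -DEVONthink group, so an item that
    # rule 1 has already moved to the inbox is no longer seen by rule 2.
    if item.location == "-DEVONthink" and rule_1_matches(item):
        # Assumed action order for rule 1
        item.actions += ["Move into Database", "OCR Apply", "Move to Inbox"]
        item.location = "Global Inbox"
    if item.location == "-DEVONthink" and rule_2_matches(item):
        # Assumed action order for rule 2
        item.actions += ["Move into Database", "Move to Inbox"]
        item.location = "Global Inbox"
    return item
```

In this model a scanned PDF with no text is handled by rule 1 and gets OCR, while a text file or an already searchable PDF falls through to rule 2 and is only moved.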

Thank you so much for taking the time to explain everything! I followed your screenshots and everything seems to be working perfectly.

The -DEVONthink folder is indeed a folder in the main file system, that I had then indexed to DT. (Apologies for not mentioning that in my previous posts. I had forgotten that I’d done that.)

As to why things aren’t going directly to the inbox - there are several websites that I need to download PDFs (eg receipts) from on a regular basis. For some reason, they won’t play nicely with anything other than a direct download to the file system (sometimes not even then; thanks, government! :roll_eyes: ).

I completely missed “Move into Database” in the action list when I was trying to nut this out. No wonder nothing was working how I expected it to!

You’re most welcome! You had basically cracked it yourself, anyway :slight_smile: What you have gained is a better understanding of what each step does, which is all part of getting to know DEVONthink :slight_smile:

For posterity: my original intention was for smart rule 1 to move all items into the database and then move them to inbox, and smart rule 2 to OCR image and PDF-items with a word count of 0. I ran into the problem that smart rule 2 was not triggered (presumably the on import trigger is specific to a location, and moving files obviously didn’t trigger it again). Two possible solutions: use the on moving trigger (although I’m not sure whether the search needs to incorporate the yielding or the receiving group - or either); or trigger the on import trigger again by adding the following embedded script as the last action to the first rule:

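-- Called by DEVONthink with the records that matched the smart rule
on performSmartRule(theRecords)
	tell application id "DNtp"
		repeat with theRecord in theRecords
			-- Fire the on import trigger again for this record
			perform smart rule trigger import event record theRecord
		end repeat
	end tell
end performSmartRule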