Automate documents

What is the best way to automate documents?

I want to manage invoices, contracts, insurance documents, and financials with DT, and I would like to:

  • Tag documents by type, e.g. “Invoice”
  • Name documents with the document date in front of the name
  • Sort documents into a folder by sender, e.g. “Amazon”

Is this possible without creating a lot of smart rules?

Welcome @Carsten221
It’s impossible to say how many smart rules would be required.

  • From vendor to vendor, your data is surely not going to be very uniform, e.g., the content and its position on a page. This is compounded by the fact that documents like PDFs are constructed differently from what you see on screen.

  • It also depends on the technology involved. Are you using built-in smart actions like Scan Text, or scripting, etc.? If you’re going to use AI, are you comfortable with sending document text to a commercial provider online? If not, do you have the hardware and time to allow local models to try and parse things? And are you willing to iteratively fine-tune your prompts and later spot-check documents to ensure things are working?

Those few things being said, is some level of automation possible? Sure, but a 100%, transparent and accurate, hands-free setup? That could be a much harder thing to build and maintain. But there are certainly possibilities.

I use an AppleScript to assist with processing Inbox records.
This includes filing, naming, tagging, etc.

Regarding tagging documents by type (e.g. “Invoice”) and sorting documents into a folder by sender (e.g. “Amazon”):

My process is to use tags, e.g. Type-Invoice and Vendor-Amazon, with minimal folders.

Replying to myself…

Here is a fun little construction that honestly took about 10 minutes of messing around. Granted, it’s a stacked deck in a small way but certainly could be used practically…

Thanks @BLUEFROG for the response.

The way you do it in the video looks nice. But how do I do it?

About “many vendors” and “many smart rules”: I thought there was some kind of AI that could do this, i.e. extract the vendor name from a document, at least after learning it from a first document by manual assignment. This AI could then automatically construct these smart rules…

About “technology involved”: I don’t want to do anything complicated. I just want to file all my (mostly PDF) documents in a consistent way so I can find them when needed and get a good overview of whether everything is complete and nothing is missing. For example, the phone bills for Jan, Feb, Mar, April … are all in the same folder, so I can list the folder and see whether a month is missing. That is why I don’t like “chaotic” organization and prefer more structure.
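The completeness check described above can be sketched in a few lines of Python. This is only an illustration and not something DEVONthink provides: it assumes the file names already start with an ISO date (YYYY-MM-DD), and the function name `missing_months` and the sample file names are made up for the example.

```python
import re
from typing import Iterable, List

# Assumption: every filed document name starts with "YYYY-MM", e.g.
# "2024-01-15 Phone Bill.pdf". Names without that prefix are ignored.
NAME_PREFIX_RE = re.compile(r"^(\d{4})-(\d{2})")


def missing_months(names: Iterable[str], year: int, last_month: int = 12) -> List[int]:
    """Return the month numbers (1..last_month) of `year` with no document."""
    seen = set()
    for name in names:
        match = NAME_PREFIX_RE.match(name)
        if match and int(match.group(1)) == year:
            seen.add(int(match.group(2)))
    return [month for month in range(1, last_month + 1) if month not in seen]


# Hypothetical folder listing for the phone bills.
phone_bills = [
    "2024-01-15 Phone Bill.pdf",
    "2024-02-14 Phone Bill.pdf",
    "2024-04-15 Phone Bill.pdf",
]
print(missing_months(phone_bills, 2024, last_month=4))
```

With the date in front of the name, a plain sorted folder listing already shows most gaps by eye; a check like this mainly helps once a folder holds several years of bills.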

I have something in mind like what “Paperless-NGX” does, but I don’t want to host a web service myself.

I think DT is really powerful, but it also has some complexity.

Yes, you can make it complex, and if you want to use AI it will probably get even more complex. What you expect to happen is probably a so-called “wicked” problem. I don’t want to dissuade you on your journey, but temper expectations about it being “simple”. It’s not the tools, it’s the problem.

There have been numerous threads about that topic. Most of them solved the problem without an LLM, just using smart rules and/or scripts. If you search the forum, you’ll find many useful ideas.

Thanks, I will try to find a solution. One problem is that my DT is running in German, but the forum and documentation are all in English.

I have trouble finding the right settings for a smart rule to use regex…

Is it possible to switch DT to English without switching the whole of macOS?

  1. Quit DEVONthink.
  2. Go into System Settings > General > Language & Region > Applications.
  3. Add DEVONthink and choose English from the dropdown.

PS: Here is the simple smart rule I used…

Notes:

  • The regular expressions are simple, but be aware you’d need to add the appropriate business names to the second Scan Text action, e.g., when you deal with a new vendor/company.

  • (?i) could (should) be used in both regexes, just to cover Walmart, WALMART, or even wALmarT.

  • “Item is not Tagged” and “Word Count is greater than or equal to 3” are assumptions, the second being that there is a text layer with at least some text on it.

  • The Comment field is being used as a variable so it can be used in the later Change Name action. This whole construction is to show the use of built-in actions and properties that don’t necessarily require someone to know how to script.
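To make the logic in these notes easier to follow, here is a minimal Python sketch of the same idea: a case-insensitive vendor match with (?i), a date match, a word-count guard, and a new name with the date in front. It is not the smart rule itself and does not use any DEVONthink API. The vendor list, the two date layouts (ISO and German DD.MM.YYYY), and the function names are assumptions made for illustration.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical vendor list; a new vendor/company has to be added here,
# just as it would have to be added to the second Scan Text action.
VENDORS = ["Amazon", "Walmart"]

# (?i) makes the match case-insensitive: Walmart, WALMART, wALmarT all match.
VENDOR_RE = re.compile(r"(?i)\b(?:" + "|".join(re.escape(v) for v in VENDORS) + r")\b")

# Assumed date layouts: 2024-03-05 (ISO) or 5.3.2024 (German style).
ISO_DATE_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
GERMAN_DATE_RE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def detect_vendor(text: str) -> Optional[str]:
    """Return the vendor name as spelled in VENDORS, or None."""
    match = VENDOR_RE.search(text)
    if match is None:
        return None
    found = match.group(0).lower()
    return next(v for v in VENDORS if v.lower() == found)


def detect_date(text: str) -> Optional[date]:
    """Return the first recognizable date in the text, or None."""
    match = ISO_DATE_RE.search(text)
    if match:
        year, month, day = (int(g) for g in match.groups())
    else:
        match = GERMAN_DATE_RE.search(text)
        if match is None:
            return None
        day, month, year = (int(g) for g in match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. 31.02.2024
        return None


def plan_filing(name: str, text: str) -> Optional[dict]:
    """Propose a new name and target group, or None if nothing can be decided."""
    if len(text.split()) < 3:  # mirrors the "Word Count >= 3" condition
        return None
    vendor = detect_vendor(text)
    when = detect_date(text)
    if vendor is None or when is None:
        return None
    return {"name": f"{when.isoformat()} {name}", "group": vendor}


print(plan_filing("Invoice 123", "wALmarT Supercenter receipt dated 2024-03-05"))
```

Returning None when either the vendor or the date is missing leaves the document untouched, which matches the cautious spirit of the rule: anything that cannot be recognized stays in the Inbox for manual handling.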

And if you want to have a play about with it…
Detect, Timestamp, and File.dtSmartRule.zip (1.4 KB)

Thanks, I did not know it’s possible to switch the language for a single app. That’s great!

Now I see my mistake: I did not know that the “Regular Expression” option behind “Scan Text” is an action and not part of the filter criteria above.

Thanks for the screenshot and your help.

It depends on what localizations an application has, if any.

And you’re very welcome. It’s simple but can be extended or built upon quite easily I’d say.

There’s also a German forum here:

but it has less traffic.
