Splitting imported documents based on "SEPARATOR-PAGES"

Georg_Hermann · March 17, 2020, 6:47pm

To scan a stack of different documents, it would be great if Devonthink could split documents using “separation pages” (with a special barcode or special text…) which could be inserted into the stack before scanning. Is something like that possible? That would take workflows and productivity to the next level.
In one sentence: if the content of a page is just “SEPARATOR-PAGE”, then discard that page, save all previous pages as one document, and start a new document.

So the workflow would be:

1. Scan to network folder
2. Import into Devonthink and run OCR
3. Split imported documents based on “SEPARATOR-PAGES”

Steps 1 and 2 are no problem; I am looking for step 3.

Alternative:
A Linux tool that can be integrated into the workflow would also work:

1. Scan to network folder
2. OCR with “ocrmypdf-auto”
3. Splitting tool which looks for “SEPARATOR-PAGES”
4. Devonthink

Steps 1, 2 and 4 are no problem; I am looking for step 3.
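
To make the splitting step concrete, here is a minimal sketch of the rule described above, not an existing tool. It assumes the OCR text of each page is already available as a list of strings (one per page); the marker text and the function names `is_separator` and `split_on_separators` are made up for illustration. It only decides which pages belong together and leaves reading and writing the PDF to whatever library is at hand.

```python
from typing import List

# Assumed marker text printed on the separator sheet (hypothetical).
SEPARATOR_TEXT = "SEPARATOR-PAGE"


def is_separator(page_text: str, marker: str = SEPARATOR_TEXT) -> bool:
    """Return True if the page holds nothing but the marker.

    Whitespace and letter case are ignored, since OCR output often
    differs in both.
    """
    normalized_page = "".join(page_text.split()).upper()
    normalized_marker = "".join(marker.split()).upper()
    return normalized_page == normalized_marker


def split_on_separators(
    page_texts: List[str], marker: str = SEPARATOR_TEXT
) -> List[List[int]]:
    """Group page indices into documents and drop separator pages."""
    documents: List[List[int]] = []
    current: List[int] = []
    for index, text in enumerate(page_texts):
        if is_separator(text, marker):
            # Close the running document; skip empty groups so that a
            # two-sided separator (two marker pages in a row) is harmless.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(index)
    if current:
        documents.append(current)
    return documents


pages = ["Invoice 1", "", "SEPARATOR-PAGE", "separator-page", "Letter p1", "Letter p2"]
print(split_on_separators(pages))  # [[0, 1], [4, 5]]
```

Blank pages stay inside their document here, so a mix of one-sided and duplex sheets does not trigger a split. The resulting index groups could then be handed to a PDF library to write one file per document.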

Any ideas?

aedwards · March 18, 2020, 11:04am

Queue Mode in the scanner module already supports splitting documents with blank separation pages.

Georg_Hermann · March 18, 2020, 11:21am

Yes, but using blank pages as separation pages does not work reliably.
There are multipage documents where some pages are duplex (text on both sides) and some pages have text on only one side, so these documents would be split in the wrong places.

aedwards · March 18, 2020, 12:57pm

What are you looking to use as a separator page?

Georg_Hermann · March 18, 2020, 2:22pm

Something like the attached samples: a two-sided page that works regardless of orientation.

separation-page.pdf (13.8 KB) separation-page_sample2.pdf (46.6 KB)

Sample 1 preferred

aedwards · March 18, 2020, 2:27pm

Thanks for the sample pages. I will look into how we could support different styles of separation pages for a future update.