I am using a ScanSnap S1500M to scan several documents into DEVONthink through the ScanSnap application. I usually scan 50-60 pages per file. The scanning was long done, but DEVONthink still had several documents left to OCR when the application crashed (I was using a script to get notes from Skim into DEVONthink).
Do I have to rescan everything? It will be a pain to do it all over, and I do not want to spend more time figuring out what is scanned and what is not. I would appreciate any help.
If the scanning is done, and the files are in the database, you shouldn’t need to scan anything again. Any files in DEVONthink that have kind = “PDF+Text” have been OCRd. Files that have kind = “PDF” have not been OCRd and you’ll need to OCR those files. I’d suggest OCRing the files one-by-one. If DEVONthink freezes or crashes on a particular file, then get rid of it and scan just that document again.
The files that were not processed through Devonthink OCR are not in Devonthink.
I am using the “Scansnap Manager” application to get things into devonthink. All I have to do is press the scan button for this to happen:
- Documents are scanned.
- The “scanned files” are added to Devonthink’s OCR queue.
- Devonthink’s OCR activity window pops up showing me how many pages have been processed by the current queue and how many more files need to be processed.
- After the OCR is complete, I get another window that asks for a filename (the window also allows you to add tags, comments and other stuff).
- The file appears in the inbox.
I am assuming that ScanSnap Manager produces some sort of temporary files that are then sent to Devonthink. It could of course do it all in RAM, but somehow I doubt it.
My question is:
Where are the “scanned files” placed before they are processed by Devonthink?
Where are the files placed? Check ScanSnap Manager’s settings, in the “save” tab. For me, the folder defined in the “save” panel is where they end up.
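If it helps, here is a rough Python sketch for checking what is sitting in that folder. It lists the PDFs it finds, newest first, so you can compare them against what already made it into the database. The folder path and the function name `list_recent_pdfs` are my own placeholders, not anything ScanSnap Manager or DEVONthink defines, and I am assuming the scans are saved as PDFs.

```python
from datetime import datetime
from pathlib import Path


def list_recent_pdfs(folder):
    """Return (path, modified time) pairs for PDFs in folder, newest first."""
    folder = Path(folder).expanduser()
    if not folder.is_dir():
        # Nothing to list if the folder does not exist.
        return []
    pdfs = [p for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() == ".pdf"]
    pdfs.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p, datetime.fromtimestamp(p.stat().st_mtime)) for p in pdfs]


if __name__ == "__main__":
    # Hypothetical path: replace it with the folder shown in your own "save" tab.
    save_folder = "~/Pictures/ScanSnap"
    for path, modified in list_recent_pdfs(save_folder):
        print(f"{modified:%Y-%m-%d %H:%M}  {path.name}")
```

Anything dated around the time of the crash that you cannot find in the inbox is a candidate for importing and OCRing by hand instead of rescanning.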