I have several PDF files that were scanned with a Canon 8800F and Image Capture in B/W mode at 300 dpi, on a Mac without DEVONthink Pro Office. The file sizes are between 7 and 110 MB, and the files contain 10 to 160 DIN A4 pages. When I import these files into DTPO to make them searchable, the read process takes forever and never succeeds. After two hours I got a message that the hard disk was full. Before the import there were 19 GB of free space; afterwards only 150 MB were left. I cancelled the import, and after a restart I had the 19 GB back.
Why does the import not work?
I opened the files with Acrobat Pro, and its OCR worked fine (although the OCR quality is not as good as it normally is with DTPO-IRIS).
I wouldn’t say that your disk is filled with garbage; it is filled with temporary files that are used during the OCR process and removed afterwards. OCR is not an easy task, and some documents can take a long time to finish. Even pages that look simple to me sometimes take a lot of time. My guess is that you selected a collection of files for OCR and that the first one in the list is causing the problem. You could try to split that file into smaller ones (Preview allows you to do this), import the parts, and afterwards use Preview again to join them back together. Other than that, we can only contact IRIS to report a potential bug. In that case I would ask you to send an email to email@example.com with as much information as possible so that we can inform IRIS.
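If you want to plan the split suggested above before doing it in Preview, the page ranges can be worked out with a few lines of Python. This is only a sketch: the function name `plan_chunks` and the chunk size of 25 pages are assumptions chosen for illustration, not values recommended by DEVONthink or IRIS, and the script does not touch the PDF itself.

```python
from __future__ import annotations


def plan_chunks(total_pages: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return inclusive, 1-based (first_page, last_page) ranges that cover the document."""
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be positive")
    return [
        # The last chunk is clipped so it never runs past the final page.
        (first, min(first + chunk_size - 1, total_pages))
        for first in range(1, total_pages + 1, chunk_size)
    ]


if __name__ == "__main__":
    # Hypothetical example: the largest file mentioned above has 160 pages.
    for first, last in plan_chunks(160, 25):
        print(f"pages {first}-{last}")
```

Importing the parts one at a time should also show which page range makes the OCR hang, which is useful information to include in a bug report.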