DT3 - searchable PDF

nano5 · December 14, 2019, 4:50am

Occasionally DT identifies some vector PDF documents as PDFs (not searchable).

If you try to OCR such a vector PDF document to make it searchable, the result is rough. Any ideas?

BLUEFROG · December 14, 2019, 4:46pm

If you are doing OCR on a document with text, it will generate an image of the text, hence the ragged appearance. However, you shouldn’t run OCR on a document if it doesn’t need it.

Try opening the PDF in Preview (via control-click > Open With), then holding the Option key and choosing File > Save As. Set Quartz Filter to Create Generic PDFX-3 Document and overwrite the file. After it overwrites, check it in DEVONthink again.

nano5 · December 14, 2019, 11:54pm

Thanks for the suggestion.

I tried the steps with two PDFs, replacing the current ones with Generic PDFX-3 versions via Preview, and DT still identifies them the same way. DT logs enclosed.

BLUEFROG · December 15, 2019, 2:13pm

Hold the Option key and choose Help > Report bug to start a support ticket and attach the PDF in question. Thanks.

nano5 · January 18, 2020, 5:01am

Sorry for the late reply, but the issue is very obvious.

Almost 10% of the PDF documents (vector/true PDFs) previously indexed under DT 3.0.1 become non-searchable when reindexed under DT 3.0.3, such as this one

BLUEFROG · January 18, 2020, 2:41pm

Did you start a support ticket, per my instructions?

cgrunenberg · January 20, 2020, 8:38am

Did you modify the files in the meantime or did you upgrade macOS?

nano5 · January 20, 2020, 1:02pm

I don't think the files were modified, but macOS may have been upgraded in the meantime. I'm now on macOS 10.15.2, which was released in early December.

cgrunenberg · January 20, 2020, 3:07pm

Could you send such a document to cgrunenberg - at - devon-technologies.com? Then we could check this on both Catalina and Mojave. Thank you.