Where did it go?

homebody · August 2, 2006, 2:22am

Just when I thought I was getting the hang of the workflow for DT Pro, this hits me and I’m back to square one. When I choose DEVONthink Pro as the output destination for a scan from the Fujitsu ScanSnap, the scan goes fine, but then I can’t find it when I go to the database. I don’t get an opportunity to name the newly created file, and even looking for it by date is no help. I guess I need someone to “hold my hand” and walk me through scanning docs with the ScanSnap into DEVONthink Pro, especially if I want them to wind up as PDFs that have been made text-indexable with OCR. I know it’s probably staring me in the face, and maybe I just need some time away from it, but scanning docs into folders and then importing them into DEVONthink Pro doesn’t seem to get the full benefit of this program.

Bill_DeVille · August 2, 2006, 4:15am

If you selected DT Pro as the destination application in ScanSnap Manager, the image-only PDFs (no OCR) are being sent to the top level of your database.

As you don’t yet have the super-duper special version of DT with automatic OCR and transfer to your database, you will have to make do.

ScanSnap includes a copy of Acrobat 7, which does have OCR capabilities (although not as fast or accurate as those in the DT beta I’m using).

Suggestions:

[1] Send the scan results to a folder in the Finder. ScanSnap Manager’s default destination for images is the Pictures folder in your Home directory.

[2] Set Acrobat as the destination application.

[3] With a scanned PDF open in Acrobat, select Document > Recognize Text Using OCR. Save the file after OCR is complete. Then, if you wish, select File > Reduce File Size and save the smaller version of the PDF file. If you wish, you can rename the file in the Finder.

[4] Now in DT Pro, select File > Import > Files & Folders and select one or more of your scanned and OCR’d PDF files. Now you have scans that can be searched and analyzed by DT Pro (assuming they contain text).
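If it is hard to tell which files in the scan folder are the newest ones before importing them in step [4], a small script can list them by date. This is only a rough sketch, not part of ScanSnap Manager or DT Pro: the function name `recent_pdfs` is made up for illustration, and the folder `~/Pictures` is assumed from the default mentioned in step [1].

```python
from pathlib import Path


def recent_pdfs(folder, limit=10):
    """Return up to `limit` PDF paths in `folder`, newest first by modification time."""
    folder = Path(folder).expanduser()
    # Keep regular files with a .pdf extension (case-insensitive); skip everything else.
    pdfs = [p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".pdf"]
    # Sort so the most recently modified scan comes first.
    pdfs.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return pdfs[:limit]


# Example usage (assumes scans are saved to ~/Pictures; adjust to your own folder):
# for pdf in recent_pdfs("~/Pictures"):
#     print(pdf.name)
```

The list only shows which files were scanned last; the OCR in Acrobat and the import into DT Pro are still done by hand as described in the steps above.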

Our beta software reduces this to two steps: Press the Scan button; enter a title for the imported and OCR’d PDF (this step can be skipped if desired).

homebody · August 2, 2006, 11:09am

Thanks for the quick response… this should help until the super-duper special version is released.

Prion · September 29, 2006, 10:20am

Bill,

there are many variations of PDF, one of which is particularly useful: searchable image. It adds an invisible layer of searchable text produced by the OCR, but you are still looking at the original image, so OCR mistakes and OCR-unfriendly material (symbols, equations, graphs, etc.) can still be viewed as normal.

Bottom line: please make searchable image, or similarly useful variations of PDF, available in your super-duper next version, not the usual garbled OCR mess.
When do you plan to release this?

Thanks
Prion

PS: It seems the folks at Devon can read my mind. I have started to wonder how my collection of PDFs (the recent ones are mostly searchable anyway; older ones are almost invariably image-only) could be converted into something ultimately useful, and whether this could all be accomplished within DT Pro. Way to go!

Bill_DeVille · September 29, 2006, 10:01pm

Prion, I fully agree with you that the PDF+Text output of OCR is the way to go. That’s what I’ve got set up with the beta software that automatically scans, runs OCR and then saves the PDF to my database. The results look just like the original paper version and the OCR accuracy from good copy is very good indeed.

I’ve recently switched to Papyrus 12 as my primary word processor for polishing material that I write in DT Pro. Reason: Papyrus 12 lets me set editable PDF as the saved file format. So I see a PDF+Text file when it’s imported into my database, looking exactly like the layout of my document. If I need to edit it, I just open that DT Pro document in Papyrus 12, edit and save. Next time the document is opened in DT Pro, I see the edit results.