double-page spread scans and other problems

MacBook, Canon LiDe 90 (scanner)
OS X 10.5, DT Pro Office, Readiris 11, Acrobat Professional 8.0


The last time I asked for help I received serious answers. Please help me again to solve my problems or to find ways to avoid them. As I am quite a new DT user, I would be very grateful. I still work with a test database because there are problems I want to get rid of before creating my personal database.

Problems:

First of all, it is not possible to scan directly into DT, because there is always a message informing me that it does not work with OS X 10.5.

I have many PDF files made with Acrobat 8.0, and I always saved them as compatible with Acrobat 8.0 only, to get smaller files. As an academic I have many long texts that need a lot of space on my hard drive. But these files are not compatible with DT. In DT they appear as grey pages (it is the same with Readiris, by the way). So I have to open each file with Acrobat and save it as a PDF compatible with Acrobat 5.0. This is the latest PDF format that DT is able to show me as it looks in Acrobat.

I want to import some double-page spread scans to DT and use OCR to turn them into searchable PDF files, but it is not possible. An error always occurs during the OCR process. But there are no problems with ordinary documents.
As an academic I have to scan whole books from time to time, and I do double-page spread scans because it takes me half the time and the result looks much more “natural” (like reading a book). So I open the scanned PDF files in Readiris and make them searchable. Everything is fine after that in Acrobat: I can mark some lines on the left side, for example, and Acrobat ignores the right side until I reach the bottom of the left side. But in DT it does not work with the very same file. Whenever I mark a line, the selection runs from the far left to the far right, ignoring that the text has two pages (two columns). This is a real problem for me. I have not found a way out so far.

After that I want to cut my double-page spread scans in two halves with Acrobat to get single-page text. But whatever setting I use in Acrobat, it just changes the appearance in Acrobat itself. That means it just hides some parts of the scans, and whenever I import them to DT afterwards, the whole scans are visible again. This is not really a DT problem, I know, but I would be so glad to receive some suggestions. Is there a program that is able to cut PDF pages the way Photoshop can crop parts of a page? I do not want to open each PDF in Photoshop because it takes too much time when the PDF has many pages.

These are my problems. I hope I could explain them properly, although English is a foreign language for me.

Thank you so much
Mathias

Make certain that you are using the current version of DT Pro Office, which is 1.5. Also, the current version of Leopard is 10.5.1. For your Canon scanner, check to see if there’s a version of CanoScan ToolBox on Canon’s Web site, which should allow you to use and configure the PDF button in the ToolBox software to send output directly to DT Pro Office.

I don’t have your scanner model, but I have a CanoScan LIDE 500F and CanoScan ToolBox works well to send output directly to DT Pro Office for OCR and storage to the database.

When Acrobat 8 was introduced it wasn’t entirely compatible with Mac OS X, so I’ve stuck with Acrobat 7. I haven’t checked to see whether Adobe has put out a newer version of Acrobat 8. You will likely have to continue to save PDFs as version 5 or older to retain full compatibility with OS X, Preview and DT Pro Office.

The solution to hard disk file sizes is probably to get a larger hard drive. Larger hard drives have become inexpensive, and much cheaper than the time you might spend trying to make smaller PDF files :)

DT Pro uses Apple’s PDFKit to render PDFs. So in DT Pro or in Preview you can draw a text box to select text in multi-column PDFs. That will solve your problem.

To select text in this way, hold down the Option key while making the selection.

I don’t think you will find any satisfactory means of “splitting” a two-page scan, especially for multi-page scans.
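For readers who still want to experiment with splitting, the geometry itself is simple. The following is only a minimal sketch in Python, not a feature of DT Pro Office or Acrobat: the names `Box` and `split_spread` are hypothetical, and the sketch assumes each spread is a single rectangle with the book's fold in the exact horizontal middle. It computes the two half-page rectangles only; applying them to an actual PDF would need a separate PDF tool, which is not shown here.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """A page rectangle in points: lower-left and upper-right corners."""
    left: float
    bottom: float
    right: float
    top: float


def split_spread(page: Box, gutter: float = 0.0) -> Tuple[Box, Box]:
    """Return (left_page, right_page) rectangles for a double-page spread.

    Assumption: the fold sits in the horizontal middle of the scan.
    `gutter` is an optional strip around the fold to leave out,
    split evenly between the two halves.
    """
    width = page.right - page.left
    if gutter < 0 or gutter >= width:
        raise ValueError("gutter must be >= 0 and smaller than the page width")

    middle = (page.left + page.right) / 2
    half_gutter = gutter / 2

    left_page = Box(page.left, page.bottom, middle - half_gutter, page.top)
    right_page = Box(middle + half_gutter, page.bottom, page.right, page.top)
    return left_page, right_page


if __name__ == "__main__":
    # Roughly two A4 pages side by side, measured in points (illustrative only).
    spread = Box(0, 0, 1190, 842)
    for half in split_spread(spread, gutter=10):
        print(half)
```

Scans are rarely centred perfectly, so the `gutter` value and the assumption of a centred fold would need checking against real pages before relying on the result.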

Thank you Bill for your answer. I am really amazed that I always receive an answer quite soon in this forum.

Just to give you some feedback:

to problem 1:
The way you describe it, everything is fine. There are no problems with scanning and telling the scanner support program to send the scan to DT without asking me again… But: whenever I try to import something from my scanner into DT, it does not work. When I go to “import” in DT and then to “from scanner”, a nice scanner pop-up window appears. DT connects to my scanner, but whenever I press the “scan” button, the message appears that this is not possible with OS X 10.5.

to problem 2:
You are right. I just wanted my database, which does not exist so far, not to become too big. But I guess some gigabytes are no problem, yes?

to problem 3:
Thank you very much for the hint about the text box. That really helps me in many cases. But what I really wanted to do is to transform the PDF files in DT into text files. And this is what does not work with my double-page spread scans. *** OH! I have just tried to convert one into a text file and the text file is wonderful. Always the left column first and then the right one and … *** Strange, now it works. The only thing is that I will have to use Readiris to do the OCR process with my double-page spread scans, as DT cannot make these scans searchable. Or am I wrong?

to problem 4:
Thank you for letting me know.

Thank you again for your help!

By the way: I am waiting for the 3rd and 4th part of the “Series Computer Aided Research” in DEVONacademy to learn more about how I could keep my bibliographic information directly in DT and how to export it again. I do not want to use a special bibliographic software.

Mathias

That’s exactly the reason why it doesn’t work. There is a bug in OS X 10.5 that prevents the scanner code from working. Either Apple fixes this in an upcoming OS release (they are aware of the problem), or we will have found an alternate solution. Both of these take time. We’re terribly sorry for this but fortunately as a workaround we also support OCR when receiving files from Image Capture directly or several other scanner programs.

Could you elaborate on this statement:

It has now been 3 months since Leopard was released, and my other scan programs seem to have no problem (SilverFast, VueScan), and none occurred when switching to Leopard (SoHo Notes works fine as well).

The main reason I’m considering DT Pro Office is the scan ability - it’s hard to make the plunge when there is no indication that this is being resolved.

Can you elaborate on what the OSX 10.5 bug is?

– Len

That’s because all these other programs avoid using the Image Capture framework from Apple (a smart choice as it turns out). All programs that do use it (such as NoteTaker for instance) don’t work on Leopard.

We’re trying to find other solutions. These cost time because it means we may have to rewrite this from scratch. We’re not Microsoft nor Adobe but we’re crossing our fingers that Apple has made a fix for us in 10.5.2…

As I’ve often noted, I don’t have this problem, as I don’t need to control either of my scanners from the DT Pro Office side. Both scanners have driver software that can directly send PDF output to DT Pro Office for OCR processing and storage. The drivers let me specify resolution, black & white or color, etc.

I have a Fujitsu ScanSnap, with ScanSnap Manager as the driver software, and a CanoScan LIDE 500F, with Canon’s ToolBox driver software. They work under Leopard as well as under Tiger. Both give a choice of single page PDF per scan, or allow multipage PDF document creation.

So if you have driver software that’s compatible with your scanner and can be configured to send PDF output to a designated application, you will be home free.

But if you don’t, the current kludge is to save scanner output to the Finder, then use File > Import > Images (with OCR) to OCR the scan results and store them into your database.

How do you configure the ScanSnap Manager to save to DTP? I have been sending to a separate folder and then importing to DTP, which is almost as time consuming as scanning each piece of paper separately.

You can add a new application to the application list. If that application is DTP, the scan will be sent to and imported into our application.