import of PDF failed

Dear Users,

I have searched the forum and note that this has come up before, but usually the solution has been idiosyncratic and thus not applicable.

So, I have tried to import two PDFs; both failed. This has never happened before, and I imported several others before and after without trouble. Any tips? They open in Adobe; I just cannot get them into DTPO.

I should note that when I look at DTPO’s inbox in the Finder, the two PDFs are there, but they do not appear in the inbox inside DTPO.

They are both slightly over 2 GB, if that’s useful info.

Many thanks,

Christian

4:25:15 PM: ~/Library/Application Support/DEVONthink Pro 2/Inbox/Curran Files Box 5.pdf Failed
4:25:15 PM: ~/Library/Application Support/DEVONthink Pro 2/Inbox/Curran Files Box 6.pdf Failed

Acrobat? Or Reader? What release of DEVONthink? What release of OS X? How much memory? Are you running DEVONthink in 32-bit mode? These are all important factors.

How long have you waited? Depending on the machine, memory, etc., it can take quite a while to get 2GB of data into the Inbox database, and from there to another location in DEVONthink.

Just curious, but why is it necessary to have massive PDFs in DEVONthink? The AI isn’t going to be of much value, and searching isn’t going to be any faster (probably slower).

Hi,

Thanks for the reply. It’s a MacBook Pro with 4 GB of memory running 10.8.3 with the latest version of DTPO. I don’t know if it’s running in 32-bit mode or not. I am using Acrobat Professional.

I have waited overnight to see if it would work. However, I get the failed message in the log more or less right away.

I do not need the files to be so big. Each is a 500-plus-page PDF in which each page is a JPEG. That is, I took 500+ JPEGs and converted them to a single PDF. Each JPEG was between 2.5 and 4.5 MB. I would be more than happy to reconvert them and make the PDF smaller.

Thanks,

Christian
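As a rough check on the numbers above, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate: it assumes each JPEG is embedded in the PDF at roughly its original size and ignores the PDF's own overhead. The function names and the 500 MB target are illustrative, not anything from DTPO or Acrobat.

```python
def estimate_pdf_size_mb(pages, min_mb_per_page, max_mb_per_page):
    """Return a (low, high) size estimate in MB for a PDF with one JPEG per page.

    Assumes the images are embedded without recompression and that
    PDF container overhead is negligible.
    """
    return pages * min_mb_per_page, pages * max_mb_per_page


def per_page_budget_mb(pages, target_total_mb):
    """Return the average JPEG size in MB needed to hit a target PDF size."""
    return target_total_mb / pages


low, high = estimate_pdf_size_mb(500, 2.5, 4.5)
print(f"Estimated PDF size: {low:.0f} to {high:.0f} MB")

# Hypothetical target: a 500 MB PDF for the same 500 pages.
print(f"Per-page budget: {per_page_budget_mb(500, 500):.1f} MB")
```

With 500 pages at 2.5 to 4.5 MB each, the estimate comes to roughly 1.25 to 2.25 GB, which is consistent with files of slightly over 2 GB.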

If there isn’t a reason to have the 500 JPEGs in one document, and you still have them available, then perhaps put them into folder(s) in the file system and index those folders. If the JPEGs are images of text, you could experiment with OCRing some of them with DTPO (or Acrobat) to see how that goes. If successful, then perhaps consider OCRing them as individual pages. That would likely get you the best benefit from DTPO’s search and AI features.

If the JPEGs are not available, it is a trivial matter to have Acrobat Professional break down the 500-page document into individual one-page files.

Importing two 2 GB files into DTPO on a machine with 4 GB of memory will probably never work.

Thanks for your help. I think the best thing for me to do is to reduce the size of the original JPEGs and then recreate the PDF. I need them all in one file.

Christian
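One possible way to shrink the original JPEGs before recreating the PDF is the `sips` command-line tool that ships with OS X. The sketch below only wraps it from Python; the helper names, the 2000-pixel limit, and the quality value of 60 are placeholder assumptions to adjust after checking that a few sample pages remain legible.

```python
import subprocess
from pathlib import Path


def sips_command(src, out_dir, max_pixels=2000, quality=60):
    """Build a sips command that downsizes one JPEG into out_dir.

    -Z caps the longer side at max_pixels while keeping the aspect ratio;
    formatOptions sets the JPEG quality as a percentage.
    Both default values here are illustrative guesses.
    """
    return [
        "sips",
        "-Z", str(max_pixels),
        "-s", "format", "jpeg",
        "-s", "formatOptions", str(quality),
        str(src),
        "--out", str(out_dir),
    ]


def shrink_folder(src_dir, out_dir, max_pixels=2000, quality=60):
    """Write downsized copies of every .jpg in src_dir to out_dir (OS X only)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Sorted so pages are processed in file-name order.
    for src in sorted(Path(src_dir).glob("*.jpg")):
        subprocess.run(sips_command(src, out, max_pixels, quality), check=True)
```

Calling `shrink_folder("Box 5 originals", "Box 5 small")` (hypothetical folder names) leaves the originals untouched, so the smaller copies can be compared against them before the PDF is rebuilt.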

If your reason for doing this JPEG → PDF conversion is to create OCR’d, text-searchable documents, might reducing the JPEG size decrease OCR accuracy?

It might; I really don’t know. I don’t use OCR as much to find words as I do to be able to highlight text.

Ahh, since highlighting text in PDFs is normally easier and more interoperable than doing it with JPEGs, it makes sense to pick the PDF format for that purpose. It seems there are more advantages for you in using a single, paginated PDF document (even if large) than in working with multiple, separate JPEG images.

Exactly! Having a continuous, scrolling PDF is, for me, the way to go. I have thousands of archival, textual documents (I’m a historian) converted from JPEG to PDF and stored in DTPO.