I have a large number (a few hundred, not thousands) of PDF documents in my Think DB. In about a third of cases, Think seems to recognize these as TIFF images: when I browse the file, I can see only the first page and cannot advance to other pages. When I attempt to export, the file extension comes up as .tiff (even though .pdf is displayed in the DB notebook view). Curiously, in the notebook view, Think lists the correct number of words for the document. In cases in which I have the original PDF saved, the PDF displays correctly in Preview.
I feel I will have lost a great deal of work if I have to sort through all these PDFs to determine which are not archived correctly, and re-download those for which I no longer have copies on disk.
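One possible way to triage the files without opening each one, sketched here on the assumption that the documents have been exported (or already sit) in an ordinary folder on disk: compare each file's extension with its leading bytes. PDF files normally begin with `%PDF-`, and TIFF files begin with `II*\0` or `MM\0*`. The function names and the folder path below are hypothetical, and nothing here touches the database itself.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"
TIFF_MAGICS = (b"II*\x00", b"MM\x00*")  # little-endian and big-endian TIFF


def sniff_type(header: bytes) -> str:
    """Guess the file type from its first few bytes."""
    if header.startswith(PDF_MAGIC):
        return "pdf"
    if header.startswith(TIFF_MAGICS):
        return "tiff"
    return "unknown"


def find_mismatches(folder):
    """Return (path, claimed, actual) for files whose extension and content disagree."""
    mismatches = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        claimed = path.suffix.lower().lstrip(".")
        if claimed == "tif":
            claimed = "tiff"
        if claimed not in ("pdf", "tiff"):
            continue  # only check files that claim to be PDF or TIFF
        with path.open("rb") as fh:
            actual = sniff_type(fh.read(8))
        if actual != claimed:
            mismatches.append((path, claimed, actual))
    return mismatches


if __name__ == "__main__":
    # "exported_docs" is a placeholder folder name, not a DEVONthink path.
    for path, claimed, actual in find_mismatches("exported_docs"):
        print(f"{path}: extension says {claimed}, content looks like {actual}")
```

This only reports files whose content disagrees with their extension. It cannot tell whether a file that really is a PDF will render properly.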
Thank you for the pointers. I’m familiar with these issues and do not think this addresses my problem.
I am not, and wouldn’t like to begin, converting most of my PDF documents in the database to text or rich text, because they are typically mathematically intensive. I have left this option un-selected in the Preferences. Having the text of the PDFs indexed is interesting, but it is by far secondary to being able to read the PDFs themselves.
My problem is simply that the PDFs do not display correctly, perhaps because the document type is not recognized.
I would have assumed that the PDFs should display just as they do in Preview, because I had thought that Devon was using the same PDF rendering engine that OS X does.
Finally, if the suggestion is that I need to re-build my database according to convention X because PDF rendering does not work properly, I have to say that is extremely frustrating, and at the moment, I have zero time to do it.
Thanks for any advice on or clarification of these issues you can offer!
I think that if you view a PDF file with the DEVONthink PDF browser, its ‘Export’ (on the toolbar at the TOP) is indeed attempting to export just the current page as a TIFF. But if you choose ‘Export’ from the File menu, at least in my experience, you get a PDF file exported.
Concerning being able to see only one page, are you sure there is no page button on the toolbar at the BOTTOM of the DT pdf browser window?
I have about 400 PDF files stored directly in DEVONthink. So far I have had no problem with them. Any I have needed to retrieve as stand-alone PDF files, outside DT, have exported (via the ‘File’ menu) fine.
Do you have the Images & PDF -> Copy Files preference set to Copy files to database folder or to Copy files into database? Copying to the folder would be the conservative setting since it’s possible to recover the original files after database corruption. Is there a reason that copying into the database is preferred?
Would it be useful to have a way to select the Image/PDF import setting on-the-fly, without going into Preferences each time? Or is the assumption that once you’ve chosen a method for doing it you’ll leave it alone?
As previously mentioned, the rationale and strategy for using certain settings like this will hopefully be better documented in the future.
I realize now that I have this preference set to “Don’t copy”. This explains the behavior I noted earlier. Files disappeared from the database when I moved them to another location. I believe the Copy to Folder or Copy to Database setting would work better for me. However, I don’t want to rebuild my database manually.
Is there a way to retroactively apply this preference? If not (certainly it is not done automatically by modifying the preference), I request it as a feature. Without it, I don’t see the point in being able to change the preference after first launching the application, because this will merely result in fragmenting one’s data.
Sounds like you need the synchronization feature, which hopefully will be coming soon. I don’t know how retroactive changes could be made otherwise. And I agree with another poster above that options for importing should not be tucked away in the preferences but available on the fly, on a case-by-case basis (e.g. one may want to import PDFs but not images, etc.).