2.0b8 being very slow at viewing PDFs

jwiegley · January 12, 2010, 12:09am

I have a “financial documents” database which is kept within an encrypted Sparsebundle. I’m using 2.0b8 to view documents in this database, which is 7.3G in size, but only 5.7 million total words and 244K unique words. My machine is an 8-core Mac Pro 2009 with 24G of RAM and a very fast RAID-10 array.

I find that when I click on a PDF to view it, it can take about 5-10 seconds before any text appears. These are scanned PDFs, none larger than 3M, most less than 1M. They contain both images and text (Acrobat did the OCR work).

After the page loads, it takes almost as much time whenever I touch the scroll bar. That is, every time I want to scroll down, I wait around 3-5 seconds, even to scroll back up to see text it had already rendered.

I’m wondering why, on a machine that really can’t be any faster, it would be this bad. Even on my old laptop, I don’t remember PDF browsing ever being this laggy. It makes classifying a large group of documents horrendously slow.

korm · January 12, 2010, 10:14am

PDF viewing over here with a similar physical machine and DTPO 2pb8 is very fast. Most of my data is in large databases with long scanned PDFs.

Do you suppose the encrypted sparsebundle might be adding overhead on the OS X side? What happens if you copy a group of PDFs from this slow database to a new database that doesn’t reside in such a bundle?

jwiegley · January 12, 2010, 10:39am

I’ll certainly try that to narrow it down, but given the hardware here, the OS could cache the entire 7.3G in memory if it had to.

John

korm · January 12, 2010, 11:55am

I tested an encrypted sparse bundle with a sample of PDFs that fit a profile similar to what you describe. I’m getting good results with DTPO opening/viewing these documents from that bundle.

Not to say that DTPO isn’t the culprit, but could there be something about the settings in Acrobat (high bit depth, low compression, raster vs. vector images in the scans, etc.)? OTOH, if the documents open quickly in Preview, it is probably not Acrobat.

Bill_DeVille · January 12, 2010, 4:41pm

I’m drawing a blank, as well. I tested a similar database set up as a test, with 1,903 PDFs. Many of them were considerably larger files than yours. Some were scans OCRd into DTPO (version 1.x and 2.0).

Even with the Classify/See Also sidebar in use, PDFs pop up virtually instantly as they are selected, except for 1 very large book (430 MB) that takes a couple of seconds to display.

I’m running an iMac quad-core i7 with 8 GB RAM.

Acrobat OCR had not been used on any of my PDFs.

jwiegley · January 12, 2010, 9:56pm

I copied the database to an unencrypted SSD, to remove any I/O concerns.

I am able to count “one one thousand”, etc., up to four before I see anything when viewing a 733K PDF. This is typical of all the scanned PDFs I have.

korm · January 13, 2010, 9:08am

Aggravating, isn’t it? Do the PDFs open reasonably fast in Preview, Acrobat Reader, Skim, etc.? Have you filed a bug report with DTech and sent them a sample PDF?

Another possibility – do you have “Enable plugins” checked in the Web tab of the preference pane? Over here I’ve had endless problems with DTPO refusing to play well with the Acrobat plugin - even in cases where it is not a web page I’m viewing. Not everyone has the same problem, though.

alanterra · January 14, 2010, 7:41pm

I just want to enter my standard culprit for slow responses—Apple’s memory management. I don’t understand this (and logic says that what I see shouldn’t occur), but…

I keep the System Memory panel of Activity Monitor open, and whenever the blue threatens to fill up the pie chart, my mac slows to a crawl. I just experienced this recently in DT, but it is a general phenomenon. Many times, quitting programs will solve this, but sometimes I have only Finder running and I still have 12GB or more of “blue” (on a 14GB machine). In this case I reboot. Programs that index large sets of files or that use large files (Bridge, DT, FoxTrot, Photoshop) all seem to cause this problem. As I said, sometimes, but not always, quitting these programs returns the memory to green.

A

Bill_DeVille · January 14, 2010, 8:26pm

alanterra:

I just want to enter my standard culprit for slow responses—Apple’s memory management. I don’t understand this (and logic says that what I see shouldn’t occur), but…

I keep the System Memory panel of Activity Monitor open, and whenever the blue threatens to fill up the pie chart, my mac slows to a crawl. I just experienced this recently in DT, but it is a general phenomenon. Many times, quitting programs will solve this, but sometimes I have only Finder running and I still have 12GB or more of “blue” (on a 14GB machine). In this case I reboot. Programs that index large sets of files or that use large files (Bridge, DT, FoxTrot, Photoshop) all seem to cause this problem. As I said, sometimes, but not always, quitting these programs returns the memory to green.

I monitor the amount of free physical RAM and over time will see it slowly decreasing. I usually have 5 open DT Pro Office databases and several other applications running. With 8 GB RAM, I’ve got more than 3/4 of the installed RAM free after a restart and with that set of applications and databases open. Data cached in RAM isn’t always released after a procedure is completed, or an application is closed.

Eventually, the computer can run out of free physical RAM and move into Apple’s Virtual Memory (VM) mode. VM will allow memory-intensive procedures to run to completion by swapping data back and forth between RAM and disk, but as disk read/write operations are orders of magnitude slower than RAM read/write, the price is a slowdown.
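
For anyone who would rather check free RAM from a script than watch Activity Monitor, here is a minimal sketch in Python. It assumes the `vm_stat` command-line tool that ships with OS X and its usual output layout (a header line giving the page size, then `Name: count.` lines); the function names are made up for illustration. Note that "Pages free" alone understates what the system can reclaim, since inactive pages can also be reused.

```python
import re
import subprocess


def parse_vm_stat(text):
    """Return (page_size_in_bytes, {counter_name: raw_count}) from vm_stat output."""
    header = re.search(r"page size of (\d+) bytes", text)
    if header is None:
        raise ValueError("page size not found in vm_stat output")
    page_size = int(header.group(1))

    counts = {}
    for line in text.splitlines():
        # Lines look like: Pages free:      12345.
        # Some counters are quoted, e.g. "Translation faults":  678.
        row = re.match(r'^"?(.+?)"?:\s+(\d+)\.?\s*$', line)
        if row:
            counts[row.group(1)] = int(row.group(2))
    return page_size, counts


def free_bytes(text):
    """Bytes on the free list according to the given vm_stat output."""
    page_size, counts = parse_vm_stat(text)
    return counts["Pages free"] * page_size


def read_vm_stat():
    """Run vm_stat (OS X only) and return its text output."""
    result = subprocess.run(["vm_stat"], capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    print("free RAM: %.1f MB" % (free_bytes(read_vm_stat()) / 1024.0 / 1024.0))
```

Running this before and after a long DEVONthink session would show whether free memory really is shrinking over time.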

I’m spoiled. I like my computer to be responsive, with very fast DEVONthink searches, See Also and Classify suggestions. So when I see free RAM beginning to get tight, it’s time for a restart. I’ve never tried to set a record for operating time between restarts, so I’ve developed the habit of launching Cocktail weekly to run a suite of routine OS X maintenance operations. As Cocktail requires a restart, that clears RAM and the VM swap files and I’ve avoided the aggravation of seeing spinning balls.

I’ve got Acrobat installed. But I’ve always been suspicious of the Adobe PDF viewer plugin that’s installed whenever Acrobat or Adobe Reader is installed or updated. I immediately remove that plugin, as it’s not necessary to view PDFs under OS X.

jwiegley · January 14, 2010, 10:15pm

It would seem that I experience the same slowness in Skim, so it would appear this is not DT’s fault.

acl · February 17, 2010, 3:36am

If you are still watching this thread, try opening the PDF in Preview and saving it again. In my case, this solves the problem (when it occurs). This can be scripted if you need to do it for lots of documents (and if it actually works in your case).
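
A rough sketch of how that scripting might look, in Python. Everything here is an assumption rather than a tested recipe: it assumes that re-serialising a file through PDFKit (via the PyObjC `Quartz` bindings, macOS only) has the same effect as opening and saving in Preview, which this thread does not confirm, and the function names are invented for illustration. The re-save step is passed in as a parameter so a different tool can be swapped in, and the sizes before and after are returned so any change in file size is visible.

```python
import os
import shutil
import tempfile
from pathlib import Path


def resave_with_pdfkit(src, dst):
    """Re-serialise one PDF through PDFKit (needs pyobjc-framework-Quartz).

    Assumption: this behaves like "open in Preview, then Save".
    """
    # Imported lazily so the rest of the script loads on any platform.
    from Foundation import NSURL
    from Quartz import PDFDocument

    doc = PDFDocument.alloc().initWithURL_(NSURL.fileURLWithPath_(str(src)))
    if doc is None:
        raise ValueError("could not open %s" % src)
    if not doc.writeToURL_(NSURL.fileURLWithPath_(str(dst))):
        raise OSError("could not write %s" % dst)


def resave_folder(folder, resave=resave_with_pdfkit, keep_backup=True):
    """Re-save every *.pdf under folder; return [(path, bytes_before, bytes_after)]."""
    results = []
    # sorted() materialises the list first, so temp files created below are not picked up.
    for pdf in sorted(Path(folder).rglob("*.pdf")):
        before = pdf.stat().st_size
        fd, tmp_name = tempfile.mkstemp(suffix=".pdf", dir=str(pdf.parent))
        os.close(fd)
        tmp = Path(tmp_name)
        try:
            resave(pdf, tmp)
            if keep_backup:
                # Keep the original next to the new file as name.pdf.bak
                shutil.copy2(str(pdf), str(pdf.with_name(pdf.name + ".bak")))
            os.replace(str(tmp), str(pdf))
        finally:
            # Remove the temp file if the re-save failed part-way.
            if tmp.exists():
                tmp.unlink()
        results.append((pdf, before, pdf.stat().st_size))
    return results


if __name__ == "__main__":
    import sys
    for path, before, after in resave_folder(sys.argv[1]):
        print("%s: %d -> %d bytes" % (path, before, after))
```

Try it on a copy of a few documents first; the `.bak` files are there so the originals can be restored if the result is not what you want.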

cgrunenberg · February 17, 2010, 8:14am

On Snow Leopard, data detectors are enabled by default and can slow down opening of PDF documents a lot. The final release will disable them.

alanshutko · February 18, 2010, 5:43pm

Christian, will there be an (even hidden) option to turn them back on? I like data detectors in general (although, I’ll be honest, I’ve never used them in a PDF).

jwiegley · February 18, 2010, 10:48pm

Just to add a bit of info: Yes, opening and re-saving with Preview solves the speed problem, but it triples the size of the PDF as well.

John

cgrunenberg · February 19, 2010, 10:10am

There will be an option to enable them, but they will be disabled by default only for PDF documents.

jwiegley · March 11, 2010, 8:27am

In 2.0.1, the problem still exists. In fact, I’m looking at a 601K PDF right now that is SO slow that every time I scroll down with my scrollwheel, it takes almost 6 seconds for every inch it reveals.

Why does opening and re-saving in Preview make this problem disappear?

John

jwiegley · March 11, 2010, 8:33am

This discussion seems to be my problem exactly:

The problem with the document in question is that it is actually a highly compressed PDF/A document-- not a true PDF.

A little sleuthing revealed it was created with LuraDocument PDF v2.28 emulating Acrobat version 1.5 (6.x). LuraDocument is a pricy Windows enterprise-server solution for generating highly compressed PDF/A (or “PDF archival”) documents. A discussion about PDF/A documents can be found at:

infonomics-digital.com/infon … 506/?pg=61

Improved PDF/A support was introduced in Acrobat Professional 8. In addition to improved compliance testing, Acrobat 8.x and above corrects some of the simpler errors that it finds. Also, Acrobat 8 was the first product to support the even newer PDF/A-1a format.

Your Mac has trouble with these files because it is converting the embedded postscript in them on the fly to PDF. For instance, a PDF/A file can contain an embedded Flash file which is something Preview’s underlying graphics engine was not designed to handle.

Basically when you save the file down in a Mac, the PDF/A image compression is removed sacrificing file size for speed and ease of handling.

Because I am indeed scanning these files to PDF/A using Adobe Acrobat. And in Acrobat there is no speed penalty.

John

jwiegley · March 15, 2010, 8:35pm

I’ve learned a bit more: It’s not PDF/A which makes the PDFs slow. It’s the choice of “Searchable Image” in Acrobat’s OCR scanner, rather than “Searchable Image (Exact)”. If I choose the latter the file sizes are still pretty much the same, but viewing in DEVONthink and Preview is nearly instant.

John