File content disappears within DEVONthink (very serious)

My settings:

  • MacBook Pro 17" running Mac OS 10.6.5
  • DEVONthink Pro 2.0.6

I have been using DEVONthink Pro for a week or two and have run into a very serious problem. I imported documents and created webarchives and texts within DEVONthink. At first they were displayed correctly.

However, it has happened several times, with different document types (text, PDF, webarchives, Excel sheet), that the content of some documents (i.e. files) within DEVONthink disappeared. The view window is empty. All metadata are preserved and even the Finder shows the correct file size. However, the file is empty. (I tested it with the terminal command “cat”.)

In the past I had some DEVONthink crashes (no idea what caused them), so I thought there might be a correlation. Yesterday, however, I created two webarchives, quit DEVONthink without errors, and found out today that the content of both files had disappeared.

I checked the database for errors; DEVONthink did not find any.

This is an important issue for me. I must be sure that content does not disappear; otherwise DEVONthink is useless. Two questions:

(1) [b]What may cause this problem?[/b] (I thought that Clusters <http://latenitesoft.com/clusters/> might be causing the issue, but I have never experienced this problem with any file outside DEVONthink.)

(2) [b]How can I find out which documents in DEVONthink do not have any content[/b] (even though the metadata shows something different)? I don’t want to browse through thousands of files manually.

Thanks for any quick feedback.

I hadn’t been aware of file compression utilities such as Clusters since the old Classic days on Macs, when the capacity of hard drives might be rated as 40 MB, or even 20 MB. Back in those days there were several utilities available to squeeze more data onto a hard drive.

Sometimes they worked without obvious problems, and sometimes their use resulted in hosed data.

I’m not criticizing Clusters specifically, but I’m inherently suspicious of any utility that deliberately modifies OS X, and this one does. Apparently, Clusters can cause problems with some files, as the developer notes that it is “configurable for your safety”. Personally, after years of experience in software support during which I’ve seen problems caused by even very popular utilities, I stick to a pretty stock OS X on my own computers and I don’t have stability problems or data loss.

Did Clusters cause your problems? I don’t know.

You mentioned that you have had DT crashes in the past. Although most crashes are fairly graceful in OS X, any crash (System or application crash, Force Quit, power outage or forced shutdown) results in improper closing of open databases and potentially could result in an incomplete or damaged database, so I take them seriously. They should be extremely rare.

One of my computers that holds “master” copies of my databases hasn’t had a crash in more than two years of intensive work. It’s gone through a series of OS X upgrades and updates and updates of DT Pro Office flawlessly. When running Verify & Repair, I’ve never seen an error report on these databases. I’ve never seen a “missing file” report or other problem. That’s why I can say that my databases are rock solid stable.

Very often I’m working with prerelease betas of DEVONthink applications, and I generally do that on a different Mac, with copies of my databases. Even on that Mac, I’ve had only 5 crashes in more than a year, two of them resulting from the same situation involving import of an old WebArchive file that Snow Leopard doesn’t like. This is the Mac that I use when I ask a user to send a copy of a file or database that has caused problems on the user’s computer. So I have to say that even my “beta test” Mac is very stable.

Perhaps your database still holds the files that it reports are missing, but has lost track of them because of damage.

SUGGESTION:

  1. Quit the DEVONthink application. In the Finder, select the database file, Control-click and choose the option to make a compressed (zipped) copy of the file. Reserve that copy for possible future use in data recovery.

  2. Launch the DT application and choose Tools > Rebuild Database.

  3. When the rebuild operation is complete, check the Log (Window > Log) to see if there’s a list of files that failed to be included in the rebuilt database. If so, Save the list as a reference for attempts to find those files.

  4. Now run Tools > Verify & Repair. No errors should be reported. Now look for a special group named “Orphans”. Fingers crossed, your files that DT had “lost” will be included there, and you can now file them properly in your groups.
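
Step 1 can also be scripted if you want to take such copies regularly. What follows is only a minimal sketch in Python, assuming the database is a package (a folder) at a path you supply and that DEVONthink has been quit first. The function name `zip_database` and the example paths are hypothetical. Note that `shutil.make_archive` does not preserve Mac extended attributes or resource forks the way the Finder's compress command does, so if anything might depend on those, the Finder copy is the safer choice.

```python
import shutil
import time
from pathlib import Path


def zip_database(db_path, dest_dir):
    """Create a timestamped .zip copy of a database package and return its path.

    db_path  -- hypothetical path to the database package (a folder)
    dest_dir -- folder that receives the archive; created if missing
    """
    db = Path(db_path).expanduser()
    if not db.is_dir():
        raise FileNotFoundError(f"not a database package (folder): {db}")

    dest = Path(dest_dir).expanduser()
    dest.mkdir(parents=True, exist_ok=True)

    # The timestamp keeps earlier copies from being overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = dest / f"{db.name}-{stamp}"

    # root_dir/base_dir keep the package folder itself as the top-level
    # entry inside the archive.
    archive = shutil.make_archive(
        str(base), "zip", root_dir=str(db.parent), base_dir=db.name
    )
    return Path(archive)


# Example call (both paths are placeholders):
# zip_database("~/Databases/MyDatabase.dtBase2", "~/Databases/Backups")
```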

If problems remain, send a Support message citing this thread and the results of your rebuild attempt and we can try other approaches to recover data.

Exactly the same problem here. No idea how this could happen. Tried to recover the way you suggested. Around 130 missing files were detected but they couldn’t be restored. Spotlight couldn’t find them either.

I’ll be able to find them via my backup at Jungledisk, but nevertheless: How can we prevent this from happening again? I, too, had some forced quits / crashes recently; maybe that is the reason? (I don’t use Clusters.)

I think I know what happened. The files that disappeared belonged to two folders that I was dragging to another database. Even after more than 15 minutes the operation hadn’t finished and DTP was not responding, so I force-quit the app.

I’ve noticed it before: dragging and dropping is much slower and much more likely to crash than moving the files via the context menu.

Are your databases very large? (Or is one of them? Or do you have a lot of them open?) I have experienced this kind of crash myself when dragging between two databases, with DEVONthink losing the files. It apparently happens because the DEVONthink app is hitting the 32-bit address space limit, which is somewhere around 3 GB of virtual memory (I think it’s actually 2.83 GB).

You might check your console logs and see if there are reports of the FSEvent dying or of DEVONthink being unable to allocate more memory (references to malloc). If you see those, it’s worth reporting the bug, which I think is at its worst in database-to-database moves of multiple files initiated through drag-and-drop. But that last part is just my impression.
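
If the log is long, a small filter saves reading it by eye. This is a minimal sketch in Python, assuming you have saved the console output to a plain-text file. The keywords are just the two mentioned above (`malloc`, `FSEvent`); the function name and the file path are hypothetical, and the exact wording of the real log messages is an unknown here, so treat matches as leads rather than proof.

```python
import re

# Keywords taken from the advice above; matching is case-insensitive.
KEYWORDS = re.compile(r"malloc|FSEvent", re.IGNORECASE)


def suspicious_lines(lines, app="DEVONthink"):
    """Yield (line_number, text) for lines that mention both the app
    name and one of the memory/FSEvent keywords."""
    app_lower = app.lower()
    for number, line in enumerate(lines, start=1):
        if app_lower in line.lower() and KEYWORDS.search(line):
            yield number, line.rstrip("\n")


# Example use (the file name is a placeholder):
# with open("console-export.txt", errors="replace") as log:
#     for number, text in suspicious_lines(log):
#         print(f"{number}: {text}")
```

Requiring the app name on the same line keeps malloc messages from other programs out of the results, at the cost of missing lines that only carry a process ID.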

I’m definitely in the waiting-for-64bit club for DEVONthink. I use it for everything, and have 3 databases with more than 50 GB of text in them. I don’t even try moving files between them…they go to the Finder first until DT gets a new kernel.

Oh, on the issue of how to find which files are missing: there is no perfect way. If you do a Verify & Repair, a list of missing files should come up in your log. Or you can do a search for all files (try a) and sort the results by size and by path (you might have to make those columns appear by using view -> column -> size/path). Look for files of 0 file size or no path.
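
The same check can be run outside DEVONthink against the files on disk. This is a minimal sketch in Python, assuming the database package can be walked like an ordinary folder; the function name and the example path are hypothetical. It flags files of size 0, and also files that report a nonzero size but read back as nothing or as only zero bytes, which is the symptom described at the top of this thread.

```python
import os


def find_empty_files(root, chunk_size=1 << 20):
    """Return (path, reason) pairs for files under root that are empty,
    unreadable, or contain only NUL bytes."""
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) == 0:
                    findings.append((path, "zero size"))
                    continue
                bytes_read = 0
                only_nul = True
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
                        # Stripping NULs leaves something only if the
                        # chunk holds at least one real byte.
                        if chunk.strip(b"\x00"):
                            only_nul = False
                            break
                if bytes_read == 0:
                    findings.append((path, "size reported but nothing readable"))
                elif only_nul:
                    findings.append((path, "only NUL bytes"))
            except OSError as error:
                findings.append((path, f"unreadable: {error}"))
    return findings


# Example use (the path is a placeholder):
# for path, reason in find_empty_files("/path/to/MyDatabase.dtBase2"):
#     print(reason, path)
```

Run it on a copy of the database, or at least with DEVONthink closed. The script only reads files, but it is better not to poke at a database that is open.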

Many times, but not always, if the crash occurred while moving files between two databases, the files end up as Orphans in one of the two databases. If DEVONthink is reporting orphans but not importing them, you might look in the Orphans folder inside the DEVONthink package. I was able on one occasion to hand copy files from there back into the database.

All this advice is based on the premise that you are having the same problem I have had, which is that you’ve built databases that are collectively too large for DEVONthink with its current 32-bit kernel. You’ll need to check the console logs and watch DEVONthink’s virtual memory use in Activity Monitor to know for sure. The workaround for now is to build smaller databases and not load as many at the same time.

Of course Clusters might be a completely different source of the problem.

-erico