Hundreds of files missing after 2.0 upgrade

I had a database of over 6,000 documents in DevonThink Pro Office 1.1.3. On upgrading to the 2.0 beta (2.0pb7, Mac OS X 10.5.8) I’m about 2,000 documents short. Unfortunately I didn’t discover this until after I’d added several hundred more documents to the new database, so I can’t just revert.

A visual inspection of the two databases side-by-side shows no real pattern. Some whole folders are missing; most (but not all) email threads are gone; and some random PDFs didn’t come across. (Some of them have entries, but not backing documents, in the original db; I found the original documents in the Files folder in the database bundle).

Is there a mechanical way to identify the files that are in the old database but not the new one and reimport them, or am I stuck with a couple of days of going through folder by folder and looking at the differences? (I did an Export > Listing… of each and ran a diff. That’s a big help, but it still leaves hundreds of individual entries to visually compare, find, and copy across.)

Alternatively, is there a way to reexport everything from the old db and reimport it into the new one, but exclude duplicates? (I suspect that there was some database corruption in the original that caused the export to fail, so that may not be helpful.)

You could use FileMerge, which is installed as part of the developer tools (which can be installed from the original OS X disks), as follows: export both databases (say, the original to “folder_original” and the new one to “folder_new”), then use FileMerge to compare the two folders. You can then select all the entries in the leftmost pane of FileMerge and merge them into a new folder (click “Merge”, then “Combine files”). Then import that…

There are probably other ways to do it too, and this may not be exactly what you want, but it’s a start and will do most things automatically.

You could use FileMerge to do this too: after the comparison is done, check the first box, next to “identical”, in the FileMerge dialog, then select everything and choose “Merge->Combine files”. That will create a folder containing the differing files only; import that.

Hope that works.

Depending on conditions of the database, another possibility might be to use a “Date Added” column (e.g. temporarily added to the History window) and/or a Smart Group to help locate and segregate the newer/older documents.

Was anything logged during the conversion? Did the old database contain sheets? Each sheet is now one document, whereas in version 1.x each sheet contained lots of records, so this could explain the difference.

The conversion was several weeks ago, and there was some logging about files that could not be converted (but not thousands).

I tried various diffing mechanisms. File diffing fails because the organization in 1.3.1 is flat and in 2.0 it is in subfolders by type.
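If the folder layouts don’t line up, one option is to compare by file contents instead of by path. Here is a rough Python sketch of that idea, not anything DEVONthink provides: it hashes every file under two export folders and reports the files in the old export whose contents appear nowhere in the new one. The function names are made up for illustration, and it assumes the conversion left the file bytes unchanged; anything the conversion rewrote would show up as missing even if it came across.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def digests_under(root: Path) -> dict[str, Path]:
    """Map content digest -> one file path for every regular file under root.

    Files with identical contents collapse to a single entry.
    """
    return {file_digest(p): p for p in sorted(root.rglob("*")) if p.is_file()}


def missing_from_new(old_root: Path, new_root: Path) -> list[Path]:
    """Files under old_root whose contents appear nowhere under new_root."""
    new_digests = digests_under(new_root)
    old_digests = digests_under(old_root)
    return sorted(p for d, p in old_digests.items() if d not in new_digests)


# Hypothetical usage, with folder names borrowed from the FileMerge suggestion:
# for path in missing_from_new(Path("folder_original"), Path("folder_new")):
#     print(path)
```

Because the match is on contents, it doesn’t matter that one side is flat and the other is sorted into subfolders by type, or that a file was renamed along the way. The resulting list could be copied into a staging folder and imported from there.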

I kind of like the idea of exporting the docs added since the import, then reconverting the original db and then reimporting the new files. But if there’s something fundamentally wrong with the conversion I might end up right where I started.

Just convert the database again, check the Database Properties before/after converting and have a look at the log. If everything’s fine, add the exported docs. Otherwise we’ll have to have a closer look.

Which is why the suggestion was to export the contents of the two databases (old and new), thus recreating the database structure as folders in the filesystem, then do the diff between those two.
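For anyone who would rather script that comparison than click through FileMerge, a minimal Python sketch follows. It assumes both exports recreate the same group hierarchy on disk, so that a document has the same relative path in each export; the function names are made up for illustration, and the folder names are the ones used earlier in the thread.

```python
from pathlib import Path


def relative_files(root: Path) -> set[str]:
    """Relative paths (with forward slashes) of all regular files under root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def only_in_old(old_root: str, new_root: str) -> list[str]:
    """Relative paths that exist in the old export but not in the new one."""
    return sorted(relative_files(Path(old_root)) - relative_files(Path(new_root)))


# Hypothetical usage:
# for rel in only_in_old("folder_original", "folder_new"):
#     print(rel)
```

This only compares names and locations, not contents, so a document that was renamed or moved between groups after the conversion will be listed as missing.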