Strange problem with a few PDF files

I have just rebuilt a database containing some 1,200 items. Checking it after the rebuild, I noticed that 14 PDF files are lost: the log shows the message “pas de texte” (“no text”; I use DTP Office in French).

When I check each case individually, some PDFs are actually present: these are files which I imported today (and then deleted), plus one or two older documents. So for those, the error message is apparently a mistake.

Then there are documents which have disappeared - well, been reduced to a small, unreadable image in the window where the document should appear. I know this is usually an error linked to a wrong import method. But I always import the same way, using the import button in the menu. In the import preferences, PDFs are checked. In the PDF preferences, “use PDF Kit” is checked. So where can the error be?

Even stranger: after seeing the loss of those documents, I used Spotlight to see if a duplicate of some of them could be found on my hard disk. I immediately came across one duplicate: a few days ago, I had exported one folder from my DB, because I needed to use some of the documents on another Mac, and I had not deleted the export in the meantime. Interestingly, the PDF was entirely fine in the export, and so I could reimport it as a PDF into the DB. It shows fine. And it was not a new document, but one which I had imported into the DB a year ago, and it had not been kept elsewhere on my hard disk.

So here I have a problem: if the document had been imported the wrong way a year ago, without being kept anywhere else on my Mac, how can it be that I could export it fine just a few days ago along with other documents?

If I did not have that “proof”, I would have thought it was an import problem on my side. But having discovered that, I can only come to one likely suspicion: rebuilding the database might in some cases damage some documents. True or false? Or what should I be doing differently?

Thank you for any information on that issue!

PS: a few minutes before rebuilding the DB, I had saved it on an external disk (using the Finder). It just occurred to me to check it now: well, the missing PDFs are all fine in the saved version, from before the rebuild. It is true, however, that they all have next to them the small sign - I don’t know what you call it, you know, that ~ in a small blue circle. I think this means something important related to the import, but what? Anyway, explanations that would help me prevent that problem would be greatly appreciated!

The blue circle with a white ‘lightning bolt’ inside it indicates that the PDF had been Index-captured rather than Import-captured.

So if the original, externally-linked PDF were subsequently deleted in the Finder, the PDF in your database would have lost information and would fail during a Rebuild of the database.

You cited the fact that an old export of the group containing the PDF in your database resulted in saving a complete PDF file to the Finder as proof that it must have been originally Import-captured, rather than Index-captured.

But that’s not proof, as an export will result in a complete copy of the PDF file in the Finder, even if it had been Index-captured, so long as the externally-linked PDF remains in the Finder.

So my conclusion is that the PDF failed in the Rebuild because it had been Index-captured, and the externally linked PDF had subsequently been deleted.
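As a rough illustration of that failure mode (not of how the application itself works internally): an Index-captured item is only as good as the external file it points to. The sketch below assumes you somehow have a plain list of the file paths your indexed items refer to; the function name `find_missing_links` and the example paths are hypothetical, and how you would obtain such a list from the database is not covered here.

```python
import os

def find_missing_links(linked_paths):
    """Return the paths that no longer exist on disk.

    linked_paths is a hypothetical list of external file paths that
    Index-captured items point to; it is an assumption for this sketch.
    """
    missing = []
    for path in linked_paths:
        # Expand "~" so paths written relative to the home folder work too.
        full_path = os.path.expanduser(path)
        if not os.path.exists(full_path):
            missing.append(path)
    return missing

if __name__ == "__main__":
    # Hypothetical example paths; replace with your own list.
    example = ["~/Documents/report.pdf", "~/Desktop/old-import.pdf"]
    for path in find_missing_links(example):
        print("Linked file is gone:", path)
```

Any path reported here would correspond to an item at risk of losing its content in a Rebuild.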

Rebuilding a database is rarely necessary. In general, if I were to run Tools > Verify & Repair and receive a report of uncorrectable errors, I would move to a backup of the database. Tools > Restore Backup will allow one to restore the state of the database as of the most recent backup. If that backup is bad, one would move to the next oldest internal backup.

I recommend use of Scripts > Export > Backup Archive when one has made substantial changes to a database. This can be set up in seconds just before one takes a break. Backup Archive will verify and optimize the database, then make current internal and external backups. When one returns after the break, the database is ready to go, with current backups.

Thank you for your remarks and advice.

You are probably right. But isn’t it strange that the copy of the DB on the external disk still shows all the “missing” PDFs - despite the fact that they show the lightning bolt and the originals are neither on the external disk nor on the hard disk? If they were only linked files, this couldn’t happen, right?

So there is a small mystery here for me. And we are talking about documents which have been in the DB for months and have been opened and consulted several times since, and the entire DB has gone through Verify & Repair several times, while the original imports had been deleted months ago and were nowhere to be found in the Finder (I can tell you with certainty that, in most cases, the supposedly “linked” files had definitely not been in the Finder for a long time).

I would be curious to know why PDFs are sometimes index-captured instead of import-captured. What should I change in my settings in order to prevent that? I am only interested in import-capture.

Interestingly, going through my files, I have also discovered several additional PDFs with the same problem that are not listed in the error log. - Needless to say, I will carefully keep my copy of the DB on the external disk (glad I copied it just before rebuilding!), and recover documents from there if I come across any additional missing ones.