Files lost again

Not as a regular case no, I was offering it as a debugging tool only.

OTOH I do have use cases where having a database on an external drive that is portable is important. I just have already moved all of those items out of DT as I can’t afford to risk them with what has become, for me, an unreliable place to archive data.

That’s good to know as a thumb drive isn’t appropriate for daily or long-term use with a DEVONthink database.

Portable hard drives are a much better option, but all hardware is susceptible to an eventual demise. I switched from Seagate to Western Digital externals three years ago as I had three Seagates spontaneously die of the dreaded “stuck head” issue in a very short time. Seagate’s QA seems to have suffered in the past 10 years.

But I find it exceedingly unrealistic to claim that files lost in a single application over time are all related to a hardware failure that has never caused a problem for any other program or files.

When will DT admit that there is a serious, if rare, bug that totally undermines their entire premise for the software? That it doesn’t affect more people or files is scant comfort. You need to admit there is a problem and explain to us what you, as a company, are doing to determine the cause, not just keep foisting off platitudes that are irrelevant to the situation at hand.

Yes, in this particular instance I was able to recover. But what about the poor suckers who depend on your software to store critical data and have no clue that, behind the scenes, valuable files may go missing, and that the only recourse is to notice in time to recover from hopefully sufficient backups?


From the information you have produced, I’m not sure it’s fair to assume there is such a bug. What I’ve read leaves the possibility open that such a bug could be responsible; it doesn’t exclude other possibilities.

Again, not enough information to verify that; the probability must relate to the number of files, their size, the number of accesses etc.

I’m not saying there is no bug; I’d be somewhat surprised if it were related to the “ghost file bug” as known, because the number of people previously affected and the number now affected are disproportionate.

What am I saying? There is a lack of useful information; in my opinion both you and DT probably need to look into this. Stopgap measures can help pinpoint the problem. As such, you may be interested in a script I implemented when the original “ghost file” issue surfaced. It has, to date, not turned up any problems on my devices.

Addendum: I’m grateful to any user working toward making DT an even better product - let’s all work together with that goal in mind :slight_smile:


Well here’s a few quick numbers for you, my current active imported files databases have
3170 items
81333 items
5712 items
respectively.
Age of those items goes back to my first instance of DEVONthink back in 2010.
Sizes of files vary widely, from as small as 2 bytes to as large as 47.4 MB.
Access is almost impossible to determine. I have archive files I only access once every 5-9 years and some things I look at and work with multiple times a day.

I did look at your script but have not implemented it.

I’m giving up on DT and moving files out. I can’t risk data loss I do not catch in time to recover, and the initial failure with over 500 files lost was the straw that broke this camel’s back. I only pointed out the most recent issue because the error or problem STILL HAPPENS!

Well, FWIW, after seeing this I realized that this has happened to me before as well. At least three or four times that I can remember since I started using DT about 1.5 years ago. In those cases I just assumed I had done something stupid and simply restored from backup and moved on (hourly backups FTW!).

And, like @OogieM, I ran verification on the databases (I do so periodically anyway) and found no issues at the time. Not sure where I would go from here!

If this should happen again, then it would be good to know…

  • whether the files are indexed or imported
  • what kind of files are affected
  • after which action you noticed the issue
  • whether the affected files were opened/edited in other apps on the Mac or on iOS

Thanks in advance!

My process is I open databases in sets and run a macro to check all the open databases, then another macro to create archives.

I do that weekly.

Every once in a while I’ll get a notification a database has a problem. So far, that has been rare and associated with a known event, like a USB thumb drive I use for syncing gone sour.

So, the question is what’s changed since last week?

Step zero, turn off synchronization.

First, open every group (option click is your friend). Click the group at the top of the file. You might want to start by not selecting tags.

Scroll to the bottom and shift-click the last document you want to include. Highlight all, in other words, excluding tags and smart groups.

Export the metadata.

Repeat on the most recent backup. Unzip it and open it in DT. Use the steps above to get a metadata dump of that database.

Compare the tables in Apple Numbers. Or, my new favorite tool, Easy Data Transform.

In five minutes, counting time to microwave some popcorn, you can get a report of what changed.
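If you'd rather script the comparison than do it in a spreadsheet, here is a minimal sketch. It assumes both metadata dumps are CSV text with a header row and a column that uniquely identifies each item; the column name `Path` is my assumption, so substitute whatever your export actually contains.

```python
import csv
import io


def load_rows(csv_text, key="Path"):
    """Index exported metadata rows by a key column (assumed unique)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def diff_exports(backup_csv, current_csv, key="Path"):
    """Return (missing, added, changed) keys between two metadata exports."""
    old = load_rows(backup_csv, key)
    new = load_rows(current_csv, key)
    missing = sorted(set(old) - set(new))   # in the backup, gone now
    added = sorted(set(new) - set(old))     # new since the backup
    # Present in both, but at least one column differs (size, dates, ...).
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return missing, added, changed
```

Read each export with something like `open("backup.csv", newline="").read()` and pass the two strings in. The `missing` list is the one to check first, followed by any `changed` rows whose size dropped to zero.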


Thought I’d add my two cents, for whatever it’s worth. I do not claim to be an expert in software engineering, but I do know how to use sophisticated programs and databases. My work is primarily academic, heavy on the research and heavy on the writing.

I left DEVONthink some time ago – perhaps a year ago, when it lost a vast number of files that are indispensable. A collection that I had been adding to for perhaps 10 years. I believe it was when I was upgrading, although I can’t remember exactly. It was just gone. No sign of it anywhere in DT’s system. The main response I got from DT was that it couldn’t happen. There are a lot of things I don’t think can happen, yet they do. I have always thought I’m an excellent driver and never could cause a serious accident. And then I did. Yes, there is a difference between a belief based on assumed evidence and a belief about one’s own talents. But there is no meaningful difference when the assumed evidence or firmly held belief simply doesn’t hold up in the real world.

I was able to recover my collection from backup. But if I can’t trust that an expensive, sophisticated app appealing to knowledge workers can safely store my knowledge, then it’s useless. After losing my collection, with no confidence that DT even believed me, I left DT.

The way I managed my files after that was fine. However, now a lot of the work I do involves working with the same data on my iPad and my Macbook Pro, and the way in which DT organizes and syncs information suits me well, so I started using DT again.

I’m currently upgrading my laptop and doing a massive clean up, which involves re-thinking how I organize and store my data. I’ve been procrastinating by doing rather excessive research – or at least research that takes a lot of my time. There are other ways to manage my workflow, even if DT was quite smooth for what I’ve been doing lately. If DT is acquiring a reputation for losing files, even occasionally, it’s off my list of contenders.

I was fairly dismayed and surprised when DT lost my collection of files. If this is still going on, I’ll need to rearrange one of my other options.

At the time I lost my files I got the impression that DT thought this impossible. Now I see that I was anything but unique. The work I’m doing now is possibly more important than that which I lost. And relying on a backup system to keep your main system running renders it no longer a backup but an integral part.


I came to the same conclusion.

Now DT has implemented a way to identify the problem so you can recover from backup, or at least I believe it will work, but as you said, that now makes it a part of the system, not a backup.

I have moved to a different workflow, not in some ways as elegant or easy to sync across multiple devices including iOS as DT had been but far more stable for me.

I am still in the process of moving stuff out of my DT databases. I decided not to upgrade on my iOS devices to help eliminate the sync issues. Even so I have had a couple more cases where the new and improved system identified data loss. In time for me to recover but still a PITA. So I have redoubled my efforts to extract all my precious data and research out into other tools.

To resurrect an old dead horse. I just did a file verify on the Mac as I have been doing every week now on my remaining DT databases. It’s an indexed database so none of the files are actually in DT. I have 15 files that failed the new file verify test. Went to look at them on the hard drive and they are all zero length text files, zero length .png files or corrupted PDF files. I checked an older backup and they were fine on November 13. Even though I don’t use the DB much I still open it up during my normal BU session, run the file verify process and then do a backup. Except that now it won’t back up with the DT tools because the database is corrupted. Just for grins I tried the verify and repair procedure in DT. Nope, irrecoverable error. Now I haven’t lost anything, first off I have my recent backup and second it happens to be my downloaded copy of a forked GitHub repository. I could even go back to GitHub and get it if I had to. But I wanted everyone here to know that the DT file bug still exists and that I can now confirm that it occurs in indexed databases, not just imported ones.
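For anyone who wants to look for the same symptoms directly on disk, independent of DT, here is a rough sketch. It walks a folder and flags zero-length files plus PNG/PDF files that do not start with their standard signature bytes. The function name is made up for illustration, and this is only a coarse check: some empty files are legitimate (a repository may contain placeholder files), and a correct header does not prove the rest of the file is intact.

```python
from pathlib import Path

# Leading bytes that well-formed files of these types start with.
SIGNATURES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".pdf": b"%PDF-",
}


def find_suspect_files(root):
    """Yield (path, reason) for empty files and PDFs/PNGs with a bad header."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.stat().st_size == 0:
            yield path, "zero length"
            continue
        magic = SIGNATURES.get(path.suffix.lower())
        if magic is not None:
            # Only the first few bytes are read, so large files stay cheap.
            with path.open("rb") as handle:
                if handle.read(len(magic)) != magic:
                    yield path, "unexpected header"
```

Running `for path, reason in find_suspect_files("/path/to/indexed/folder"): print(reason, path)` before and after a DT session would show whether anything new turned up in between.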

@cgrunenberg the post above contains answers to some of the questions you posed in August.

@OogieM thanks from me for taking the time to post; doing so helps the devs protect other users’ data. I seem to remember from earlier days that some of your databases are quite old. Is that the case for this one too? Can you remember approximately when the database was created?

OK Here’s the answers to those questions:

All indexed files

Plain text files, .PNG files and several PDF files that were corrupted and could not be opened

I opened the database several times during the past week, performed a search of the files and opened several of the items that came up in the search. None of the files that were affected were ones I opened but they were in the same folder/directory as the files I did open and look at. I ran the check file integrity procedure that identified the errors at a roughly weekly normal backup task I do.

None of the affected files were opened at all since the last time I ran check files and they were ok.

I no longer even have DT on my iOS devices, so no, they were never opened on iOS either.

This database was a relatively new one. It is an index of a big chunk of my hard drive. I had deleted the older version based on the idea that the age of the database and when it was created might be part of my problems, so this one was created 15 October 2021. My normal operating procedure with it has been open the database, do an update indexed items, then use it, usually for a quick look up to get the location of a file I need, then close it. I do my backups per above approximately every week whether or not I actually opened the database in question in the prior week.

As I have said I no longer trust DT. I wanted to point out that the problem still exists, I had even done the testing (like create a new DB) to prove that the problem is not the age of my database. I am continuing to remove all critical data from DT. I also plan to not open or use that database at all any more for any reason.

Some folks had indicated that indexed files would not be affected and since this test proved that is not the case I see DT as a loose cannon that could attack my data at any moment. I’m backing out as fast as I can.

Sure, I’m aware of that, and that’s your prerogative. I do trust DT, I’m sticking with it, and as such I’m grateful for your input which could help make the product even better. What you are describing is clearly rare; for it to be apparent in your new database smacks of terrible luck or a very unfortunate combination of factors almost unique to you. That’s likely to make bug fixing difficult, but again, without your input there would be even less hope. I wish you all the best and hope you fare better with an alternative product.


A failed checksum verification could merely be from indexed files that haven’t been updated.

If you look at the second message, update indexed items is the first thing I do whenever I open this database. That’s how I’ve always dealt with indexed file databases: update first thing, then do whatever inside DT, and then close the database.
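One way to separate "the index was stale" from "the file changed underneath me" is to keep an independent manifest outside DT. The sketch below is not how DEVONthink's own verification works; it is a generic approach with hypothetical function names. It records a SHA-256 hash and modification time per file, then sorts later differences into files that are gone, files whose content and timestamp both changed, and files whose content changed while the timestamp stayed the same.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Record SHA-256, size and modification time for every file under root."""
    manifest = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            info = path.stat()
            # Reads the whole file into memory; fine for a sketch.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            name = str(path.relative_to(root))
            manifest[name] = (digest, info.st_size, info.st_mtime_ns)
    return manifest


def classify(old, new):
    """Sort differences between two snapshots into three buckets."""
    report = {"missing": [], "modified": [], "suspect": []}
    for name, (digest, _size, mtime) in old.items():
        if name not in new:
            report["missing"].append(name)
        elif new[name][0] != digest:
            # Different content with a new timestamp looks like a normal write;
            # different content with the same timestamp deserves a closer look.
            bucket = "modified" if new[name][2] != mtime else "suspect"
            report[bucket].append(name)
    return report
```

Take a `snapshot` after each weekly backup, keep it somewhere safe, and run `classify(last_week, this_week)`. Note that a changed timestamp only means something wrote to the file; it does not prove the write was intended, so a "modified" file you never touched is still worth inspecting.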

Is this database synchronized by DEVONthink? Are the indexed files located on an internal/external volume or a network volume? And finally, was anything logged to Windows > Log or did the computer or DEVONthink crash or freeze? Thank you!

Unfortunately it’s still unclear what is causing this and whether it’s always, sometimes or never DEVONthink’s fault. E.g. in the case of one report last week from another user, the issue was caused by a system crash which damaged several cloud-synchronized (Dropbox, Google Drive) files. DEVONthink’s synchronization never updates files in cloud folders (that’s the job of the cloud client/app), nor is it able to cause empty PDF/PNG files (as file types like PDF/PNG requiring contents are explicitly checked).

DEVONthink of course provides the tools to discover this; in probably every other app you wouldn’t even notice the issue. That does not necessarily mean that the messenger is the culprit, but it might be.


Yes to my own WebDav server

On an internal SSD drive that is my main hard drive on my mac
And finally, was anything logged to Windows > Log or did the computer or DEVONthink crash or freeze?
No crashes or freezes on this machine at all.
Logged as file verification failed. I did not explicitly clear the log file but when I went to grab it for you this morning it’s all empty. So I don’t have the exact messages for you. But it was only those 15 files.

The Console.log might still contain related information. Please choose Help > Report Bug while pressing the Alt modifier key and send the result to cgrunenberg - at - devon-technologies.com - thanks!

You mentioned in the MPU Forum that you also sync a SQLite database with your WebDAV Synology NAS Server and that it works. By chance do you use the same method to sync the DEVONthink database package to the WebDAV server as you use for the SQLite database files?