Considering making my entire DT database simply indexed in native filesystem. Thoughts?

Hi. I’m a happy long-time but probably still simpleminded user of DT. :)

I typically have a dozen active professional and personal projects on the go and each has a hierarchy of notes, paper scans, lots of flat text files (e.g. data & output), screenshots, reports, correspondence, and often source code if it’s a software development project. I view everything as equally important and that’s where DT hasn’t been a perfect fit.

I enjoy the scan/OCR/search benefits of DT greatly. But I wrestle with its internal filesystem storage because it prevents me from easily working at the command line for edits (I use Vim a lot for meeting notes, source code, etc.), running command-line tools on data files for analysis, grepping and filtering output, and generally playing nicely with other apps and utilities. One big problem is that sibling documents in a DT group are not generally peers in a filesystem folder.

So … I’m moving to a new primary computer anyways and considering moving my DT database over to it and then exporting everything, removing the DEVONtech_storage noise that appears in every folder upon export (safe?), and then simply bringing everything “back in” to DT by indexing everything in a fresh empty database going forward.
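For the cleanup step, here is a minimal Python sketch of how the marker files could be located before anything is deleted. It assumes the files are literally named DEVONtech_storage, as written above; check what your export actually produces. The function names and the dry-run default are my own choices, and the script says nothing about whether removing the files is safe, which is still the open question.

```python
from pathlib import Path

# Assumption: the file name as written in the post; adjust it to match your export.
MARKER_NAME = "DEVONtech_storage"


def find_export_markers(root):
    """Return every file named MARKER_NAME under root, sorted by path."""
    return sorted(p for p in Path(root).rglob(MARKER_NAME) if p.is_file())


def remove_export_markers(root, dry_run=True):
    """Return the marker files found; delete them only when dry_run is False."""
    markers = find_export_markers(root)
    if not dry_run:
        for marker in markers:
            marker.unlink()
    return markers
```

Running `remove_export_markers("/path/to/export")` with the default `dry_run=True` only reports what would be removed, so the list can be reviewed first.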

At first glance, this seems to give me the benefits of DT’s scan & search UI, and I’d finally have a meaningful filesystem structure available to me from the command line, the Finder, and all other applications.

I understand this would be automatically re-indexed from time to time, which should be sufficient. FWIW, the loss of replicant/duplicate features of DT seems minimal, based on my past usage.

Does anyone relate to this desire for an outside-DT filesystem while still enjoying DT’s front end for some things?

Thanks for reading my long braindump. Appreciate any insights. Happy New Year!

I would suggest it only if you are uber-aware of the potential pitfalls of indexing and how you can inadvertently lose files in this way.

It can be done but there are potential gotchas, along with the cognitive effort to be aware of those gotchas if you use DT3 in this way with mission-critical files.

You can probably use vim to open all these files directly from DT (context menu, open with). As to grepping etc… you could write a simple script that permits you to enter a shell command, runs it on the selected file and displays the output (or sends it to the terminal).
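A rough sketch of the "run a command on a file" half of that idea, in Python. How the path of the selected document gets out of DT is not shown here and would need its own glue; `run_on_file` and its parameters are hypothetical names, not anything DT provides.

```python
import shlex
import subprocess


def run_on_file(command, path, timeout=30):
    """Run `command` with `path` appended as the last argument.

    Returns (exit code, combined stdout and stderr). The command is split
    into arguments and executed directly, not through a shell, so pipes
    and redirections in `command` are not interpreted.
    """
    argv = shlex.split(command) + [str(path)]
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout + result.stderr
```

For example, `run_on_file("grep -c TODO", "/path/to/notes.txt")` returns grep's exit code and its count of matching lines as text. Avoiding the shell keeps file names with spaces from being split, at the cost of not supporting pipelines.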

only if you are uber-aware of the potential pitfalls of indexing and how you can inadvertently lose files in this way

Yikes. What have I been missing in the documentation about indexing and file loss? What would be an example chain of events that would lead to file loss? Thank you for the warning, rkaplan!

Thanks, chrillek. That’s a thoughtful response and you’re right it could work for one-off shell commands or edit sessions.

But terminal sessions would be impossible because simple things like cd or make wouldn’t even work – the underlying folder structure of a project is scrambled when imported into DT as a group or nested groups.

As far as I can tell, a hierarchy of inter-related docs and source and notes and datafiles cannot live in DT if they need to refer to each other (e.g. ../data/session.csv or even just ./otherfile.txt)
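To make the dependency concrete, here is a small Python sketch of how such relative references resolve against the folder a file lives in. The helper names are made up for illustration; the point is only that a reference works when the on-disk layout matches what the referring file expects.

```python
from pathlib import Path


def resolve_reference(source_file, reference):
    """Resolve `reference` relative to the folder containing `source_file`."""
    return (Path(source_file).parent / reference).resolve()


def broken_references(source_file, references):
    """Return the references that do not point at an existing path on disk."""
    return [r for r in references if not resolve_reference(source_file, r).exists()]
```

Calling `broken_references("project/notes/meeting.txt", ["../data/session.csv", "./otherfile.txt"])` lists whichever of the two is not where the notes file expects it, which is what happens to all of them once siblings stop being peers in the same folder.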

So if I’ve got spreadsheets, notes, presentations, code, data, output, scripts, etc. that all live logically together and depend on each other, it seems I cannot import them “into” DT. My only option may be to index them, but then I wonder if that’s such a “degenerate” or risky use of DT that maybe it’s not worth it at all. Cheers.

You can read a number of stories here

https://discourse.devontechnologies.com/search?q=indexed%20files%20missing

What would be an example chain of events that would lead to file loss? Thank you for the warning, rkaplan!

Typically this happens when files or a folder are renamed, moved, and/or placed in the trash. The files appear to be present in your indexed DT3 directory so you do not realize the issue right away, but when you go to access the file you may get an error saying the file is missing.

Notes, presentations, and scripts generally happily use x-devonthink links.

Perhaps terminal-level dev tools are different.

Wow, okay, thank you rkaplan. Nothing in the user guides gave me the impression that indexing could be as scary as those threads you shared. I can accept delayed or flawed indexing (i.e. resulting in imperfect searches but with actual files still intact) but I cannot accept lost or misnamed files. I will read those stories in more depth but sincere thanks for pointing me there.

You should read the Help > Documentation > In & Out > Importing & Indexing section. All of it.

Indexing files is perfectly safe, and predictable, if you understand the ins and outs as pointed out by @BLUEFROG. I have been indexing 100% of the files in all my databases for years now, and I’ve never encountered a problem with lost or misnamed files due to indexing.

Hi @Greg_Jones

What you say is exactly right - with the important caveat that you, Bluefrog, and I have all noted - Read and understand all of the documentation on indexing.

The OP suggested in his original post that [my paraphrasing] he is a basic user of DT who is very busy with lots of professional activities. I am a former coder [many years ago] and currently use DT for busy non-tech professional work - and when I tried indexing I missed some of the nuances, just like quite a number of other users who have posted about lost files.

It is indeed documented very well. If you follow the instructions exactly you will not have a problem. So everyone needs to decide if they indeed have that level of discipline to read everything on the topic and consistently follow it.

Bottom line - most of DT3 is a rock-solid bulletproof database. I have never lost any data except when I tried to index. DT3 has saved my bacon a few times with some of its features that help me to keep my data. But indexing is not in that category. It’s the one area of DT3 I think all but the uber-technically-competent should stay away from with important data.

Well, the first item in the link that you posted is from a user that deleted files in the Finder that were indexed in the database, then couldn’t understand why the files were no longer in the database. I don’t know how uber-technically-competent one needs to be to understand why that happened. There are many ways that the actions of the user can corrupt databases and/or lose data, and it behoves the user to explore how that might happen. Indexing documents doesn’t inherently put the user’s data at risk.

I can see how that seems like a really amateurish mistake. But it could also be more nuanced.

I have certainly deleted files or even entire folders inadvertently at times. Usually I realize the problem soon enough when I notice that an important file is not there, and there is enough time for me to retrieve it from the Trash or elsewhere.

One issue with indexing however is that a file can be deleted from the Finder but still appear to be present in your daily DT3 user interface. When a user (or a wayward app) deletes a file in Finder, the corresponding indexed Group in DT may not immediately reflect that the file was deleted in Finder. So the busy user might understandably not recognize the file deletion until a time when he is out of town or has a critical need for the file or has already emptied the trash etc.
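One way to catch this sooner, independent of DT, is to compare the folder on disk against an earlier listing of it. Below is a minimal Python sketch under that assumption; `snapshot` and `missing_since` are hypothetical helper names, and the script only looks at the filesystem, not at what DT's database believes.

```python
from pathlib import Path


def snapshot(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def missing_since(earlier, root):
    """Return paths recorded in the earlier snapshot that no longer exist under root."""
    return sorted(earlier - snapshot(root))
```

Take `before = snapshot("/path/to/indexed/folder")` at a known-good moment and later call `missing_since(before, "/path/to/indexed/folder")`. Anything it returns was deleted, moved, or renamed on disk in the meantime. A rename shows up as a missing old path, so the result is a prompt to investigate, not proof of loss.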

This is possibly the most common gotcha situation with indexed files but not the only one.

Don’t get me wrong: I think indexing has its place. It is particularly helpful as a temporary means of transfer to get files into DT3, especially via Dropbox. And overall DT3 is maybe the most solid database I have ever used; it has certainly let me down less often than SQL databases that I have used. But I do stay away from indexing of mission-critical files.

Indexing should only be used if necessary, I think - and then there’s one simple rule: Don’t touch the files in Finder. That’s it.

Except maybe you should touch the Finder when you add files. Otherwise you can wind up with a folder which partially resides inside DT3 and partially resides in your Finder. Then if you restore one but not both from a backup, or if you move files from there elsewhere, or if a year from now you do not quite remember all these details, or some new app does something funky to your file system…

The linked thread is 10 years old and the user didn’t understand how indexing worked back then:

Not sure how this is related to indexing in DEVONthink 3 tbh.

I believe the same situation can happen with DT3 now.

Indexing documents in DEVONthink 3 is very different than indexing documents in any prior version of DEVONthink. The integration with the Finder is much tighter now, to the point that one could say if you lose indexed files in DEVONthink today, then you would have lost them in the Finder anyway.

rkaplan, your characterization of me is correct … despite being very technical, I have enjoyed being a very basic DT3 user for five years. So far, I’ve not had to think about file integrity what-ifs in my current DT workflow because all I do is scan and move and search, all within the DT database and not indexed. Nothing in my DT DB changes outside of my actions within DT. Projects or partial projects that involve frequent structure (folder/group) or file changes such as fresh data files or generated output or code … I leave those out of DT. Unfortunately, this leaves me with two places to search and modify throughout the day (both DT & filesystem), but it’s on a per-project basis, so it’s not as complex as it might be with a “hybrid” of groups plus indexed folders within a given project in DT.

Greg_Jones:
Indexing documents in DEVONthink 3 is very different than indexing documents in any prior version of DEVONthink. The integration with the Finder is much tighter now, to the point that one could say if you lose indexed files in DEVONthink today, then you would have lost them in the Finder anyway.

Thanks. This is how I thought it would work. If I lose a file in the OS filesystem then I don’t expect DT to protect me from that. But if I move it around in the filesystem anywhere within a large number of folders that are all indexed somewhere in DT then I’d hope it would be tracked/found/indexed by DT soon, if not immediately. Ditto for new additions or renamings anywhere nested within folders that are indexed.

I think the best thing for me to do is index something complex and try some edge cases and see what happens.

I just set up a brand new indexed folder in DT3, did some file additions and deletions which could happen in the real world, and I got the famed File Missing error in DT3:

If you regularly use DT3 and didn’t happen to highlight this file, you would think it is there and would never realize it is missing.

In the Finder however it is clear that it is missing:

Where is the file by the way? It is in the Finder Trash. If the user in this situation empties the Finder trash before realizing the indexed file is missing, then the file will be truly gone.