A way to do an "inventory check" on indexed files

Not sure if this issue occurs only to me or also to other users who use indexed folders heavily:

As DT is much more than just a replacement for Finder, once macOS folders are indexed into DT groups, I always re-arrange some items according to my info-management needs. This means that I might create replicants or duplicates, or simply move items from their original indexed groups to other local groups (intended), or even to other indexed groups (usually unintended).

In theory, this shouldn’t change a thing in Finder because DT maintains a UUID pointing each item back to its location in Finder. In practice, I always have moments of panic when I find mismatches in the count/list of items between the indexed groups and their linked Finder folders. Typically, I see more files in the Finder folder than in the corresponding DT3 indexed group. I know that those missing files definitely exist somewhere in DT (having been relocated and/or replicated in many places), but I can’t trace where they are in DT unless I search for the name of each missing/mismatched file.

So, I am asking: (1) is there a way to gather all items in DT3 (just the original instance, not all instances of the same item) that belong to a specific indexed macOS folder? If not, (2) will DT3 consider adding another search criterion in advanced search, or some other smart way, for the user to find all indexed items in DT3 that share the same macOS Finder path? This would be very useful for those who occasionally need to re-organise their folders or do some housecleaning in macOS…

Thanks

Another related question:

I am probably not the least disciplined DT user, but sometimes I still delete files directly in macOS folders without realising that the folders are indexed in DT, or forgetting the consequences of doing that. As far as I am aware, this always leads to a verification failure in DT (missing files?). The log gives useful info on which files are missing, but it seems that the “repair” function won’t take care of this type of error by deleting the DT items that point to nowhere in the Finder.
I’m just wondering if the background index sync in DT3, or the “repair” function, will be able to take care of this situation in some future update.

Thanks again.

A Rebuild removes missing file references.

This would show indexed and duplicated/replicated items…

You could then show a Location column in the item list.

Does that help?

Thanks for the suggestion.
(1) The issue is more about checking on a folder-by-folder basis, i.e. matching which files are in a specific macOS folder against where those files are in DT. For example, I index a folder that contains all downloaded articles, but then (most of the time accidentally) move, rather than replicate, some indexed files to other DT groups. My experience of using DT tells me that I can trust that the files will always be somewhere in DT, and that a file won’t be deleted from the macOS folder as long as I don’t delete the last remaining instance… But identifying which indexed files have been moved, and where they are, is sometimes another challenge.
(2) You are right that a rebuild will solve the issue but verification and repair won’t. I guess I am hoping for a quicker way to solve that issue…

Thanks

Isn’t this a contradiction?

This sounds similar to the situation I just ran into. My database has a number of indexed groups in it, linked to folders on an external hard drive. I had to replace that hard drive (cloning the data to a new drive), and for reasons, the new drive had a new name (although the data was the same). I hoped there was a way to review each indexed group and update the folder it pointed to, but there was no way to do this.
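
For the renamed-drive case, here is a minimal Python sketch of the path arithmetic only. It does not touch DT and does not re-link any indexed group; it just works out where each old path should now live, so you have a list to check against. The function name and the volume names `OldDrive` and `NewDrive` are made up for illustration.

```python
from pathlib import PurePosixPath


def remap_volume(path, old_root, new_root):
    """Return `path` with the `old_root` prefix swapped for `new_root`.

    Paths that are not under `old_root` are returned unchanged.
    """
    p = PurePosixPath(path)
    try:
        relative = p.relative_to(old_root)
    except ValueError:
        # The path is not on the old volume, so leave it alone.
        return path
    return str(PurePosixPath(new_root) / relative)


if __name__ == "__main__":
    # Hypothetical volume names, for illustration only.
    old = "/Volumes/OldDrive/Articles/paper.pdf"
    print(remap_volume(old, "/Volumes/OldDrive", "/Volumes/NewDrive"))
```

The prefix is compared by path component rather than by plain string matching, so a volume called `OldDrive2` would not be mistaken for `OldDrive`.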

@BLUEFROG, that smart group works, but I found a couple of limitations:

  • It would be useful to be able to limit the search to indexed groups from a specific hard drive (e.g. location contains “Volume/Harddrive1”).
  • It would be useful to be able to limit the search to the main “parent” folders that are being indexed, and ignore the content within.
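
Until something like this exists in DT, both filters can be approximated outside the app. The Python sketch below assumes you already have the folder paths of your indexed groups as plain text (how you get them out of DT is not shown and is an assumption). The function names and the example volume `/Volumes/Harddrive1` are hypothetical.

```python
from pathlib import PurePosixPath


def on_volume(paths, volume_root):
    """Keep only the paths that live on (or are) the given volume root."""
    root = PurePosixPath(volume_root)
    return [
        p for p in paths
        if PurePosixPath(p) == root or root in PurePosixPath(p).parents
    ]


def top_level_folders(folder_paths):
    """Keep only folders that are not nested inside another listed folder."""
    # Shortest paths first, so a parent is always seen before its children.
    folders = sorted(
        {PurePosixPath(p) for p in folder_paths},
        key=lambda p: (len(p.parts), str(p)),
    )
    roots = []
    for folder in folders:
        if not any(root in folder.parents for root in roots):
            roots.append(folder)
    return [str(root) for root in roots]


if __name__ == "__main__":
    # Hypothetical example paths.
    indexed = [
        "/Volumes/Harddrive1/Articles",
        "/Volumes/Harddrive1/Articles/2020",
        "/Volumes/Harddrive2/Photos",
    ]
    print(on_volume(indexed, "/Volumes/Harddrive1"))
    print(top_level_folders(indexed))
```

The first function answers "what is indexed from this drive", the second reduces the list to the main parent folders and ignores everything inside them.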

Which, IMHO, is essentially the same as specifying a macOS Finder path as a search criterion…
This is nice-to-have stuff but probably not pressing. Guess I’ll wait patiently…

  • It would be useful to be able to limit the search to indexed groups from a specific hard drive (e.g. location contains “Volume/Harddrive1”).
  • It would be useful to be able to limit the search to the main “parent” folders that are being indexed, and ignore the content within.

@cgrunenberg would have to assess these things.

Do note there’s also a Path column you can sort on.

Ah… this is a feasible way of checking/matching files in a folder against items in a group! I overlooked this column option, perhaps by automatically assuming that path is the same as location (which it is not).
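
The folder-versus-group matching described here can also be done outside DT once the Path values are available as plain text. Below is a minimal Python sketch, assuming you can get those paths out of DT somehow (that export step is an assumption and is not shown). The function name `inventory_check` is made up for illustration.

```python
from pathlib import Path


def inventory_check(folder, indexed_paths):
    """Compare the files on disk under `folder` with paths known to the database.

    `indexed_paths` is an iterable of filesystem paths taken from the
    database's Path column. Hidden files such as .DS_Store are ignored.
    """
    root = Path(folder).resolve()
    on_disk = {
        p.resolve()
        for p in root.rglob("*")
        if p.is_file() and not p.name.startswith(".")
    }
    # Only look at indexed paths that claim to live under this folder.
    indexed_here = {
        p for p in (Path(raw).resolve() for raw in indexed_paths)
        if root in p.parents
    }
    return {
        "only_on_disk": sorted(on_disk - indexed_here),
        "only_in_database": sorted(indexed_here - on_disk),
    }
```

Files listed under `only_on_disk` are in the Finder folder but have no item pointing at them. Paths under `only_in_database` are items whose file is no longer where the database expects it.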

However, I just found out that Location is “good” in that it only shows info down to the group level, while Path is “not so good” because it also shows the (very long) file name. The developers might consider changing the Path info to skip the file name, much like the format of the “Location” column in DT or of “Where” in the “Show Inspector” panel of the macOS Finder. This would dramatically reduce the column width needed for sorting.
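
As a stopgap for the long Path values, the file name can be stripped outside DT. The Python sketch below assumes a plain list of paths copied out of the Path column (again an assumption). It groups them by containing folder, which gives a per-folder item count to compare against Finder. The function name is hypothetical.

```python
import os
from collections import Counter


def count_by_parent(paths):
    """Count items per containing folder, dropping the file name from each path."""
    return Counter(os.path.dirname(p) for p in paths)


if __name__ == "__main__":
    # Hypothetical example paths.
    sample = [
        "/Users/me/Articles/one.pdf",
        "/Users/me/Articles/two.pdf",
        "/Users/me/Notes/three.md",
    ]
    for folder, count in sorted(count_by_parent(sample).items()):
        print(count, folder)
```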

by automatically assuming that path is the same as location (which it is not).

Indeed they are not synonymous.

As it is with AppleScript:
Location refers to a position in the structure of the database.
Path refers to the filesystem location.

Yep, that’s what I ended up using. It’d be handy if you could do a smart group based on contents of Path as well.

I suppose what I am suggesting is this: it’d be handy to be able to work with all of the folders I have indexed (not the subfolders inside, just the main folders that I added to my database). I’d love a simple way to see all of them, see their paths, and then be able to reindex them or assign them to a new location (in the case of switching hard drives). Does that make sense?

Does that make sense?

Not. At. All. :wink: :stuck_out_tongue:
Yes, I get what you’re saying, but Criss would have to weigh in on this and the feasibility. Tighter integration is included in DT3, but more caution certainly needs to be exercised now since we’re not dealing with copies of files, but the files themselves, in a more direct manner.

We also have to be cognizant of performance. Being intimately tied to the filesystem can have unintended performance consequences.

I’ve been working on the same problem Ngan describes in the opening thread, and had already done “verify and repair.” And in the haste of doing something, anything…I did a Rebuild. My DB went from missing 6 files to having 0 missing and 5550 orphaned files. Well, I do have an inventory check now!
The named groups are not retained in the rebuild, so I have an uninterrupted alphabetical list of 3400 PDFs, with about 50 duplicates. There are also about 400 images, mostly GIFs from very old websites. Those group names have lost value over time, so starting with a blank slate is not too painful (so far!).
I realize this can be viewed as an open-ended question, but I’m still interested in what you would recommend for giving structure to this collection of PDFs (without writing AppleScripts). What would you do?

Thanks!

The first thing I would do is head for my local Time Machine backups.

I have Arqbackup. Is there anything I need to know aside from just restoring the database?

Still, I’m interested in the second recommendation you would make. The situation isn’t too unlike that of a newcomer to DTP, who could download a couple of thousand PDFs in a day or two.

Is there anything I need to know aside from just restoring the database?

Just that you need to download the dtBase2 file and all its contents.

Still, I’m interested in the second recommendation you would make.

Your current situation, with all those orphaned files, is not what the original poster was describing. Check out the built-in Help > Documentation > Troubleshooting > Repairing a Defective Database for more info on Missing and Orphaned files and their causes.

The situation isn’t too unlike that of a newcomer to DTP, who could download a couple of thousand PDFs in a day or two.

  1. Adding a large number of files would not make them orphaned, unless there was an unusual operating system or hardware issue.
  2. Adding “a couple thousand PDFs” in a day sounds like it’s potentially uncurated. While it’s possible it is needed, I would suggest reading this, just for your edification: DEVONtechnologies | Don't Use DEVONthink as a Junk Drawer

You may be right that I don’t get the original poster’s query. As I understand it, the problem was about an “inventory check” on indexed files. When indexed files are moved or replicated (which I understand can also involve different databases, or even the Trash), questions arise when the Finder folder and the DT3 groups no longer match. The “check” is to identify which of the possible instances has the originating link to the Finder folder.

My interest in an “inventory check” came when I learned that copies of indexed files’ metadata could be found in the Trash unless the Trash was emptied. I’d like to be able to identify easily which of several versions is the originating link to the item in Finder.

I’ve read your materials; Eric’s some years ago when it appeared, and missing and orphaned files after mine became orphans and went missing. Like much in life, the warnings are so much clearer after the problem.

Hi

My 1st post is more of an audit matter: finding out “what have I done to my macOS files in DT”. The 2nd post is more of “if I have done something behind the back of DT, is there an easy remedy”.
Generally speaking, and after my many rebuilds due to lack of discipline, I find that:
(1) It’s not a bad idea to rebuild the database regularly, particularly when you have made quite a few changes, intended or not, outside DT. I rebuild my database once a year and sometimes more often. It’s better to find out about issues sooner rather than later by using rebuild. With the tight integration of indexed files in DT3, I find myself getting more and more comfortable making most changes within DT3, which is a good thing (at least for me). But I still want to audit regularly to reconcile the files in the Finder folders with the items in the DT groups. It is true that there might be some unexpected outcomes after a rebuild, but that’s the consequence of our own doings, and we’ll soon learn from the lessons and know which external actions to avoid…! Perhaps the help document could give some health warnings about performing file operations on indexed files outside the database environment.

(2) My suggestion would be to use the log window A LOT for minor fixes. Missing files and issues related to instances in the trash are always reported quite accurately by the log. After using DT actively for 3 years, if I now see verification errors (I can use DT as usual even with those errors, BUT DT won’t perform a full archive backup unless verification succeeds), the first thing I do is empty the DT trash (I feel safe doing it coz those files will still be in the macOS trash, although the group hierarchy will be gone). If verification still fails, I check the log to identify which specific files are missing. 95% of the missing-file issues are due to me deleting files behind the back of DT; the simple solution is to remove the related DT items. Another 4% are due to me moving the files to another macOS folder outside DT. Depending on my needs, I either delete the missing DT items or use Spotlight to find those files in macOS and move them back to the original folder to resolve the verification issue before deciding what to do next. After these two exercises, almost all missing-file issues are resolved. However, and obviously, I am being lazy and asking whether the verify-and-repair process could be kind/smart enough to solve that 95% of the issues for me.
Just FYI.
Disclaimer: more experienced users will likely be able to give better advice than mine.
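
The deleted-versus-moved triage from point (2) can be roughly sketched in Python. It assumes you have typed or pasted the missing file paths from the DT log into a list (the log format itself is not parsed here), and it uses a plain folder walk in place of Spotlight. The function name and the status labels are made up for illustration.

```python
import os


def triage_missing(missing_paths, search_root):
    """Guess whether each missing file was deleted or moved elsewhere.

    A file counts as "moved?" when a file with the same name exists somewhere
    under `search_root`. A same-name match is only a hint, not proof.
    """
    # Build a lookup of every file name under the search root.
    by_name = {}
    for dirpath, _dirnames, filenames in os.walk(search_root):
        for name in filenames:
            by_name.setdefault(name, []).append(os.path.join(dirpath, name))

    report = {}
    for path in missing_paths:
        if os.path.exists(path):
            report[path] = ("present", [path])
            continue
        candidates = by_name.get(os.path.basename(path), [])
        report[path] = ("moved?", candidates) if candidates else ("deleted?", [])
    return report
```

Entries marked `deleted?` are candidates for removing the DT item. Entries marked `moved?` come with candidate locations to check before moving anything back.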