A. I’ve noticed that if you relocate a file from one path to another in the filesystem, DT will still update the path of the corresponding record. It seems this was not always the case (and I’m not certain it is always the case now). Under what circumstances does this occur? What limitations are there?
B. Does “Synchronize” applied to a group check that group and all records inside it against files at corresponding paths in the filesystem, as well as pulling in the contents of any directory? That’s what I thought. However, it seems to me that it does something more complicated, and this leads to some irksome behavior. It is possible for the path associated with a record to correctly correspond to a file in the filesystem, and yet to have that record deleted, or duplicated to another location within the database.
The steps I give are going to sound complicated, but if you dump projects into DT and later reorganize them, I don’t think this example is implausible relative to the sorts of things you might do occasionally:
1) You index a path "My Old Project" containing a file "My Old Project/Work Group Phone Numbers.txt"
2) You index a second path "My New Project"
3) You index a third path "My Downloads" (say a temporary repository for downloaded files)
4) You decide you want to delete "My Old Project" but you save some useful files from it. In the filesystem, you move "Work Group Phone Numbers.txt" to a directory "Frequently Used Project Files". Within DT, you move "Work Group Phone Numbers.txt" to the group "My New Project". You then delete "My Old Project" from DT and from the filesystem.
5) You create a new text record, "Strategic objectives.txt" in DT inside "My New Project"
6) You download a PDF, "Thompson CV.pdf" to "My Downloads" with your web browser
7) You then synchronize "My Downloads" and that pulls in "Thompson CV.pdf". You move it to "My New Project" where it belongs.
8) You save an e-mail attachment "Some meeting notes.txt" to the folder "My New Project" in the filesystem
9) The group "My New Project" now contains 3 items:
- "Work Group Phone Numbers.txt"
- "Thompson CV.pdf"
- "Strategic objectives.txt"
"Work Group Phone Numbers.txt" is indexed. Its path will indicate "Frequently Used Project Files/Work Group Phone Numbers.txt" and that is in fact where the file is located.
"Thompson CV.pdf" is indexed. Its path will indicate "My Downloads", and that is in fact where the file is located.
"Strategic objectives.txt" is not indexed, but only exists in the database.
A fourth file, "Some meeting notes.txt" is located inside the folder "My New Project" in the filesystem.
You then synchronize the group "My New Project" to pull in "Some meeting notes.txt"
- "Work Group Phone Numbers.txt" - VANISHES, even though its path was to an existing file ("Frequently Used Project Files/Work Group Phone Numbers.txt" )
- "Thompson CV.pdf" - remains in "My New Project"; its path is also to an existing file ("My Downloads/Thompson CV.pdf")
- "Strategic objectives.txt" - remains in "My New Project"
- "Some meeting notes.txt" - appears in "My New Project"
In a slightly different scenario, if "Frequently Used Project Files" were indexed at the outset and then synchronized at step 9, then "Work Group Phone Numbers.txt" would be duplicated to that group. It would again disappear from "My New Project" when that is synchronized.
This suggests that it does not suffice for a record to have a path that refers to an existing file for it to be preserved during synchronization. Instead, some other criterion must be satisfied. From the perspective of trying to organize a large project, this is troublesome. Can someone illuminate this situation?
The simplest response is that there is only one-way synchronization from the Finder to corresponding indexed groups and files inside your database. DT doesn’t “talk back” to the Finder.
Suppose you’ve got two folders, Bob and Harry. You Index both folders to your database. There’s a file named Jim inside the Bob folder. Now you move Jim from Bob to Harry. Synchronize the corresponding groups. Now you have duplicates of Jim. Removing Jim from Bob didn’t delete Jim in the Bob group in your database. But adding Jim to Harry added Jim to the Harry group in your database after choosing Synchronize.
Check it out. Do both copies of Jim have the same Path? What will happen if you select either copy of Jim and invoke Launch Path, then modify the external copy of Jim, then synchronize both copies? What would happen if you synchronized only one copy?
As my databases are Imported (self-contained) I don’t worry about Finder organization at all. All my organization is done in the database itself. I think I would operate the same way if I Indexed content into my database.
I understand and expect that synchronization is one-way; that doesn’t really explain what I described above.
Actually, that’s not true. Moving the file Jim from Bob to Harry leads the record named Jim to be deleted from Bob, and added to Harry. Getting into the details: when I synchronize the new location, it adds the duplicate. When I synchronize the old location, it deletes the original. It doesn’t matter what order I do the synchronization.
All of that seems very reasonable from a usability standpoint. The example above was to illustrate a case where it was very hard to understand what was happening, and a record could vanish from a large project even though what the user did seemed reasonable.
However, now I see that the exact same behavior is occurring in your much simpler example. Since the path on the record “Jim.txt” is updated in DT as soon as I move the corresponding file in the Finder, synchronization can’t simply be asking whether each record has a valid path and then updating the record. Instead, perhaps, it is comparing the path of the record to the path of its parent record. Is there any documentation that explains this process?
I suggested this, but now I see that it can’t be that simple either. The record “Thompson CV.pdf” in the example I originally proposed has the path “My Downloads/Thompson CV.pdf” but when I move it to the group “My New Project” that has path “My New Project/”, and synchronize “My New Project”, this does not delete the record even though its path does not match that of its parent group.
Synchronize does the following:
- Remove the files/folders that no longer exist at their original (!) location
- Add files/folders that are not yet part of the database
Moving a file from A to B causes DEVONthink to remove the file from group A and to add it to group B (see Tools > Log). Same applies to renamed files.
But the path shown in the Info panel might be a different “beast”. DT’s database contains aliases too, and those are used to find renamed/moved files/folders while using the application but not (!) for synchronizing (as this would otherwise cause lots of problems, e.g. unexpected duplicates).
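To make the two rules and the alias distinction concrete, here is a toy model in Python. It is only a sketch of the behavior described in this thread, not DEVONthink's actual implementation: the `Record` class, its `original_path` and `alias_path` fields, and the `synchronize` function are hypothetical names, and the model assumes that removal is decided by the original path alone and that only files directly inside the group's folder are added.

```python
import posixpath
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    name: str
    original_path: Optional[str]  # hypothetical: path stored at index time; None = database-only
    alias_path: Optional[str]     # hypothetical: path shown in the Info panel, follows moved files


def synchronize(group_folder, records, files_on_disk):
    """Toy one-way sync: return the group's records after synchronizing."""
    kept = []
    for rec in records:
        # Rule 1: drop indexed records whose file is gone from the ORIGINAL
        # location. The alias path is deliberately ignored here.
        if rec.original_path is not None and rec.original_path not in files_on_disk:
            continue
        kept.append(rec)
    known = {rec.original_path for rec in kept}
    # Rule 2: add files sitting directly in the group's folder that the
    # group does not know about yet (assumption: this toy checks one group only).
    for path in sorted(files_on_disk):
        if posixpath.dirname(path) == group_folder and path not in known:
            kept.append(Record(posixpath.basename(path), path, path))
    return kept


# The "My New Project" scenario from the first post.
disk = {
    "Frequently Used Project Files/Work Group Phone Numbers.txt",
    "My Downloads/Thompson CV.pdf",
    "My New Project/Some meeting notes.txt",
}
group = [
    Record("Work Group Phone Numbers.txt",
           "My Old Project/Work Group Phone Numbers.txt",
           "Frequently Used Project Files/Work Group Phone Numbers.txt"),
    Record("Thompson CV.pdf",
           "My Downloads/Thompson CV.pdf",
           "My Downloads/Thompson CV.pdf"),
    Record("Strategic objectives.txt", None, None),
]
result = synchronize("My New Project", group, disk)
print([rec.name for rec in result])
# ['Thompson CV.pdf', 'Strategic objectives.txt', 'Some meeting notes.txt']
```

Under these assumptions the phone numbers record vanishes because its original location no longer exists, even though its alias path points at a real file, while the CV survives because it was never moved in the filesystem.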
Christian, thanks, I think this clarifies things a great deal.
The apparent contradiction in the behavior of “synchronize” stemmed from my observations of the path shown in the Info panel, and the path information returned by AppleScript for the “path” property of the record. I’ve confirmed that both change in response to moving the original file. In general this is a very useful feature, but it was leading to synchronization behavior that didn’t make sense.
However, this leads to a (I hope) small request: for programming purposes, could there be some way in AppleScript to get the path that is used for synchronization, or some indication of whether the file has been moved? I guess the ideal would be to make two properties of the record available in AppleScript: an alias property (perhaps with a Cocoa representation that appears in the Info window), which is the one used for computing the current path property, and a separate original path property, which is the one used for synchronization. I managed to work around this in my own setup, but I can see others going down this path and running into problems.
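As an illustration of the kind of "has this file been moved" check I mean, here is a small Python sketch. It is not my actual workaround and uses no DEVONthink API: it assumes you have somehow saved each record's path at index time in your own side table (`original_paths`, a hypothetical dictionary keyed by some record identifier) and can later read the current path of each record (`current_paths`, also hypothetical).

```python
def find_moved(original_paths, current_paths):
    """Return ids of records whose current path differs from the one saved at index time."""
    return sorted(
        record_id
        for record_id, original in original_paths.items()
        if record_id in current_paths and current_paths[record_id] != original
    )


# Hypothetical data: r1 was moved in the filesystem after indexing, r2 was not.
original_paths = {
    "r1": "My Old Project/Work Group Phone Numbers.txt",
    "r2": "My Downloads/Thompson CV.pdf",
}
current_paths = {
    "r1": "Frequently Used Project Files/Work Group Phone Numbers.txt",
    "r2": "My Downloads/Thompson CV.pdf",
}
moved = find_moved(original_paths, current_paths)
print(moved)  # ['r1']
```

Records flagged this way are the ones at risk of disappearing on the next synchronize, so they could be re-indexed or imported beforehand.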