I have two groups in a DTPO database. They partially contain groups with identical names, but also different groups and items. One of the two groups resides only in the database; the other is an indexed directory in the Finder/file system of OSX. Both groups contain indexed items (folders and files) in the Finder/file system of OSX, as well as groups and items which reside only in the database.
I want to join these two groups with all of their content.
As a result, I want just one group/indexed directory containing all items of both current groups, but without any duplicates. All items (files and folders) should reside in the Finder/file system of OSX and be indexed in this single group in the DTPO database.
How can I achieve it?
Thanks in advance! Kind regards, Friedrich
IMO, the best way is to index the target (end-state) group, manually move things into it, confirm there are no duplicates (or delete those that exist), then select everything and use “Move to External Folder” in the contextual menu. If you’re looking for automation to make this complex task easier, you might not find it, though a possible helper is the “Merge” command, which operates on documents and on groups. Play with it on some test data first.
Hmm. There are approx. 500 groups/directories in each of the two groups and more than 5’000 items in each group.
Well, that tidbit would be hard to suss out from the OP –
So, you want some sort of automation to de-duplicate and merge 500x5000 items into a new indexed group? Seems like the main factors here are “massive problem” and “deduplication”. DEVONthink has no deduplication feature. It can point out what it “thinks” are duplicates, but in my experience that is trustworthy maybe 60% of the time. On the other hand, I wouldn’t want to trust any automation at this price point with a task involving deduplicating a few million records.
The suggestion of working out a manual process stands – if it’s important to continue with DEVONthink for your record management.
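For the files that already live in the Finder, an independent cross-check for exact duplicates is possible outside DEVONthink. The following is only a sketch under stated assumptions: it treats two files as duplicates only when their contents are byte-for-byte identical, it sees nothing that resides only in the database, and the function names are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root):
    """Group regular files under `root` by content digest.

    Returns {digest: [paths]} only for digests shared by two or more files.
    Assumption: "duplicate" means identical bytes, nothing fuzzier.
    """
    by_digest = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and not path.is_symlink():
            by_digest[file_digest(path)].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling `find_exact_duplicates` on the indexed folder returns a report and deletes nothing, so every group of matches can still be reviewed by hand.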
No, no. There are approx. 500 folders in each of the two groups, and 5’000 items in each of those two groups. That is ten items on average in each of the subfolders/groups.
So I think I could create a new folder in the OSX file system, index it as a new group in DTPO, move all items of the non-indexed group into the new group, select all the moved items and move them into the file system. Then there should be an (Automator?) approach to merge the existing indexed folder and the new indexed folder. Finally, I delete the new group in DTPO only.
Would it work?
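The folder-merge step of that plan could also be sketched outside Automator. The Python below is a hedged illustration, not a DEVONthink feature: `merge_tree` and `unique_name` are hypothetical helpers, the script only touches files in the Finder, and it assumes the target folder is not inside the source folder. It defaults to a dry run that changes nothing on disk.

```python
import filecmp
import shutil
from pathlib import Path


def unique_name(path):
    """Return `path`, or a numbered variant such as 'name (1).ext', that does not exist yet."""
    counter = 1
    candidate = path
    while candidate.exists():
        candidate = path.with_name(f"{path.stem} ({counter}){path.suffix}")
        counter += 1
    return candidate


def merge_tree(source, target, dry_run=True):
    """Merge the folder tree `source` into `target`.

    Returns a list of (action, source_path, destination_path) tuples.
    While dry_run is True the list is only a preview; nothing is moved.
    """
    source, target = Path(source), Path(target)
    actions = []
    # Collect the file list first so moving files does not disturb the walk.
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dest = target / src.relative_to(source)
        if not dest.exists():
            action = "move"
        elif filecmp.cmp(src, dest, shallow=False):
            # Same relative path and identical bytes: leave the copy in source.
            action = "skip-duplicate"
        else:
            # Same name but different content: keep both under a new name.
            dest = unique_name(dest)
            action = "move-renamed"
        actions.append((action, src, dest))
        if not dry_run and action != "skip-duplicate":
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dest))
    return actions
```

Reviewing the returned list from the dry run before calling it again with `dry_run=False` keeps the risky part manual. Skipped duplicates stay in the source folder, and DEVONthink would still need to refresh its index of the target folder afterwards.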
I see that you want to avoid duplicates in the final file structure, but is it implicit that you have duplicates now?
The thought of automating this process somehow sounds attractive, but I get real nervous when automating processes that touch thousands of documents at once. I’d do this manually myself.
I think it is easier to do all the work inside the database. If feasible, I would move all of the indexed documents into the database. Next, as you suggested, I would create a new folder in the file system and index it (it is empty at first). Let’s call it “New Parent”. Then proceed to move documents and groups within the database into that New Parent folder. When that is done, I would select everything inside New Parent and “move externally”. The result is that New Parent and all of its child folders and documents are indexed.
I think I did it as suggested. But now, when I click on any of the groups in the new parent group, its item count changes to zero, and in fact the group contains no items. What’s wrong? How can I restore all the items which were formerly stored only in the database?
I’ve got no backup of the database, but a lot of Time Machine backups.