DT3 Smart rule to filter duplicates, acting on ALL duplicate occurrences

I have a smart rule to filter duplicates. The rule acts on ALL duplicate occurrences. Shouldn’t it leave one occurrence, presumably the most recent one, untouched?

I created this rule by duplicating the Filter Duplicates rule that DT3 comes with. I then changed the action from “Move to Trash” to “Move to [folder]” and added “Change name to ‘Duplicate - [Name]’.” (The original “filter duplicates” rule does leave one occurrence.)

Even when taking out the “Change name” action, all occurrences are acted upon.

The cancel action remains as the last action.

The matching rules are as the “filter duplicates” rule came with:

  • kind:any document
  • item is: duplicated.

Update: I duplicated the original “filter duplicates” rule again, and the only change I made was replacing “move to trash” with “move to [folder].” The rule still acted on ALL occurrences, whereas the original “filter duplicates” rule leaves one occurrence untouched.

Is the event trigger set to On Import on your duplicated rule?

This is totally unexpected and has caused a real problem for me. I expected one original to remain, but every instance of the duplicate was moved to the trash. Worse, undo is grayed out. I might have to manually move over 1500 files to their original locations, unless I can revert somehow.

I might have to manually move over 1500 files to their original locations

Might I ask how you end up with 1500 files getting duplicated?

The point being: I ultimately see non-relevant duplicates as an error in the process before the file enters DT. Building an automated duplicate-removal process sounds like a workaround for an erroneous import process.

Instead of cleaning things up, would it be possible to prevent the duplicates from being created?

Thanks for the reply. I believe, but am not certain, that I will recover from the mass trashing by using one of the automated backups. Thank you, DT, for making backups automatically.

Moral of the story is to review the scripts before running them, to see what they do. I foolishly assumed that I could run the script on one group, and that one copy of each duplicate would remain. Instead, globally, every file that had a duplicate was trashed. These were some of the most important files. That’s why I duplicated them to multiple groups.

My duplicates were desired, except in several select groups. I create duplicates intentionally to incorporate files from reference groups into individual case file groups, one to many. That’s how I got so many duplicates.

You might want to look into using replicants instead of duplicates, as that one to many relationship is exactly what replicants are intended to do.

This is an issue with the Move action: unlike trashing/deleting, it is performed on all results. The next release will fix this.
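To make the difference concrete, here is a minimal Python sketch of the two behaviors described in this thread. It is not DEVONthink's implementation: the `Item` class, the `content_hash` field, and the choice to keep the most recently added item are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # Hypothetical stand-in for a database item.
    name: str
    content_hash: str  # assumption: duplicates share the same hash
    added: int         # assumption: larger value means more recently added


def group_duplicates(items):
    """Return only the groups that contain more than one item."""
    groups = defaultdict(list)
    for item in items:
        groups[item.content_hash].append(item)
    return [g for g in groups.values() if len(g) > 1]


def act_on_all(items):
    """Behavior reported for the Move action: every duplicate is selected."""
    return [item for g in group_duplicates(items) for item in g]


def act_on_all_but_one(items):
    """Behavior expected by the poster: one item per group is left alone.

    Keeping the most recent item is an assumption, not a documented rule.
    """
    selected = []
    for g in group_duplicates(items):
        keeper = max(g, key=lambda i: i.added)
        selected.extend(i for i in g if i is not keeper)
    return selected


items = [
    Item("statute.pdf", "h1", added=1),
    Item("statute copy.pdf", "h1", added=2),
    Item("unique.pdf", "h2", added=3),
]
moved_all = [i.name for i in act_on_all(items)]
moved_some = [i.name for i in act_on_all_but_one(items)]
```

With this sample data, `moved_all` contains both `statute` files, while `moved_some` contains only the older one; `unique.pdf` is never selected in either case.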

Thanks for the replicant suggestion. I do use them too, but copies are usually more helpful for me because I can leave the reference document (e.g., a statute or jury instruction) pristine, and mark up or cut up a copy in multiple individual case files.
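As a rough analogy only (this is not how DEVONthink stores anything), the replicant-versus-copy trade-off resembles sharing one object versus making an independent copy. The dictionary layout and group names below are hypothetical.

```python
import copy

# Hypothetical reference document kept in a reference group.
statute = {"title": "Statute", "notes": []}
reference_group = [statute]

# Replicant-like: the very same object also appears in a case group.
case_a = [statute]

# Duplicate-like: an independent copy placed in another case group.
case_b = [copy.deepcopy(statute)]

# Marking up the replicant-like entry changes the reference document too.
case_a[0]["notes"].append("markup for case A")

# Marking up the copy leaves the reference document untouched.
case_b[0]["notes"].append("markup for case B")

reference_notes = reference_group[0]["notes"]
copy_notes = case_b[0]["notes"]
```

Here `reference_notes` ends up containing the case A markup but not the case B markup, which mirrors why a copy keeps the reference document pristine.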

Would someone kindly post the original “filter duplicates” smart rule here? I deleted mine after I experienced unintended effects from it, but now I wish to examine it in the “edit” window in order to understand why it did what it did. I could download the whole DEVONthink application again, but I’d rather not.

Thanks in advance.

Thank you.

On DT3.0.4 (macOS 10.15.4), I am seeing an inconsistency in the results of the ‘Duplicates’ Smart Folder (which is the same as the Smart Rule except it doesn’t trigger an action).

For some reason, there are some instances of duplicates where both items appear in the Smart Folder results (and each shows the other as its duplicate), and other instances where only one of the duplicates (but which one???) appears in the results. Is there some nuance about the way certain duplicates are identified that is causing this behavior? What is the result list of the Duplicates Smart Folder supposed to show?

Screenshots:

  1. This list shows 6 items that have duplicates not listed in these results, but 2 items that themselves are duplicates of one another and are both listed here:

  2. This shows that each of these is recognized as the duplicate of the other:

EDIT:

To add one more strange thing happening with duplicates, I just found that in this same database, there are yet other duplicates that are being identified (text changed to blue) but which are not appearing in the Duplicates smart folder:

To be sure, here are the criteria for the Duplicates smart folder:

TIA!

Does it work after selecting a valid “Search in” scope?

Unfortunately, no. I assume you’re referring to the issue of what appears in the results of the Duplicate Items saved-search folder. Apart from that, have you seen the other behavior I showed in my screenshots before, with some duplicate items not appearing even once in the Duplicate Items saved-search results? Any other troubleshooting suggestions?

To the extent it could be relevant, this is in a database that is indexed items only except for the database’s Inbox.

Are some of the duplicated items excluded from searching (see the Info inspector/popover)? Or maybe the enclosing groups of the items?

Nope, I just confirmed that none of my groups are excluded from searching and none of the files identified as dupes but not appearing in the Duplicates saved-search results are excluded from searching.

FWIW, I upgraded to DT3.5 today and the problem persists. Also, I tried turning off the “strict recognition of duplicates” option in Preferences to see whether that made any difference; it didn’t.

Could you send the files to cgrunenberg - at - devon-technologies.com? Then I’ll check this over here. Thanks!

Since we’re on the topic of duplicates: they are still showing up for me too. I had thought the update took care of it, but that was not the case. Oddly enough, it seems to be the latest iteration of the folder that I want to keep. The one without the -1 attached has nothing in it; it’s just a duplicated folder structure, with maybe a duplicated file in there occasionally, whereas the “original indexed folder”-1 has all the info that I want to keep. That makes more work for me, since I either need to rename it or be extra cautious that I don’t accidentally delete the wrong batch.