I have a reference library of about 5000 engineering specifications and articles. These documents were either scanned and OCR’d or downloaded from the web. Copies of the same document often have different file names. DT frequently detects the duplicates, but often it does not.

Is there any way to tell DT that two files are in fact duplicates? Any suggestions on how to handle this issue?


No, this isn’t possible. Duplicates are detected automatically, on the fly, using the indexed data and other metadata. A future release will include an option for stricter duplicate recognition.
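To illustrate why a content-based comparison can miss a scanned copy of a downloaded file, here is a minimal Python sketch. It is not DT's actual algorithm, which isn't documented in this thread; it only assumes a simple word-overlap score, and the function names and sample strings are made up for the example.

```python
import re


def word_set(text: str) -> set[str]:
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct words the two texts have in common (0.0 to 1.0)."""
    wa, wb = word_set(a), word_set(b)
    if not wa and not wb:
        return 1.0  # two empty texts are treated as identical
    return len(wa & wb) / len(wa | wb)


# Hypothetical text of the same page from three sources.
downloaded = "Specification for steel pipe flanges, revision 3."
clean_scan = "Specification for steel pipe flanges, revision 3"
noisy_scan = "Specificatlon for stee1 pipe flanges, revislon 3"  # OCR misreads

print(jaccard(downloaded, clean_scan))  # same words, punctuation ignored
print(jaccard(downloaded, noisy_scan))  # OCR errors lower the score
```

In this toy comparison, a few OCR misreads are enough to pull the score well below a perfect match, which may be one reason a scanned copy and a downloaded copy aren't flagged as duplicates.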

See Also should show if there’s a tight relationship between the files, even if they’re not marked as duplicates. The score would likely be a full green bar.

If See Also yields some files that are actually duplicates, what action can I take?


This blog post covers dealing with duplicates: … evonthink/

You can also right-click items in See Also and use the various context menu items.


I have a related issue that I would like some feedback on.

I have a few years’ worth of bank statements stored in a database. Bank statements are all very similar in layout. DT is labelling files as duplicates when they are not. I think it is because they are so similar.

The issue is that the genuine duplicates are lurking among the non-duplicates that DT falsely labels as duplicates, which makes them too hard to find.

Will a future release of DT provide users with a way to more tightly control how DT defines duplicates? Or is there something in the current release that I have not noticed?

Thanks for any help you can provide.

A future version will have a stricter duplicate detection option. Thanks for your patience and understanding.
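In the meantime, one possible workaround outside DT is a strict byte-for-byte check on files exported to a folder. The Python sketch below is only an illustration under that assumption: it is not a DT feature, the function names are made up, and it will only catch copies that are exactly identical (for example, the same statement downloaded twice), not rescans of the same page.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicates(folder: Path) -> list[list[Path]]:
    """Group files under `folder` whose contents are byte-identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    # Keep only hashes shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `exact_duplicates(Path("exported_statements"))` (a hypothetical folder name) would return one list per set of identical files. Statements that merely share a layout have different contents and so never end up in the same group.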

Thank you for your answer.