No duplicate recognition

#1

I’m seeing PDF files that are definitely duplicates not being designated as such. They have exactly the same creation and modification dates but are not seen as dupes: neither in the inspector nor by a color change of the filenames.

#2

Is the word count of the files identical? See e.g. navigation bar above preview pane.

#3

OK, I’m confused here. In this case I downloaded a PDF file from the web and did nothing to it. Yet the word counts of the copy already in DT and the one being imported differ by 6 words. Also, with the same 2 files in DT2 (one already imported, one about to be), both showed in blue as dupes in the Inbox.

Also, in DT3, if I try to import a file I know is a dupe, it appears in the Inbox and then “poof”, it disappears. Is this a new feature?

#4

Did you import the PDF documents on the same macOS version? Depending on the version, the results of macOS’ PDFKit might vary. Finally, is the smart rule “Filter Duplicates” performed on import?
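If you can get the text of both copies into plain text (for example by copying it out of each document), a small script can show which words differ. This is only a rough Python sketch: it assumes the text has already been extracted, and splitting on whitespace is my assumption, not necessarily how DEVONthink counts words. The function names are made up for illustration.

```python
from collections import Counter


def word_count(text):
    """Count whitespace-separated words (an assumed definition of 'word')."""
    return len(text.split())


def word_diff(text_a, text_b):
    """Return (only_in_a, only_in_b) as Counters of surplus words.

    Counter subtraction keeps only positive counts, so each result
    lists the words one text has more often than the other.
    """
    words_a = Counter(text_a.split())
    words_b = Counter(text_b.split())
    return words_a - words_b, words_b - words_a
```

Calling `word_diff(text_of_copy_1, text_of_copy_2)` on the two extracted texts would list the words behind a difference in word count.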

#5

I did recently update the OS, but I believe this was happening before that. I’ll monitor my files and see if there’s really an issue here.

The smart rule was performed on import. I removed it because in the past, with DT2, I found that PDFs saved from the web from different accounts on the same site were designated as dupes. Unless something has changed in DT3, I don’t want to have a file unnecessarily trashed.

#6

Hi, the same is happening to me right now (macOS 10.14.5):

always the same scanner,
always the same document,
3 duplex scans with page 1 front,
3 duplex scans with side 2 front,
3 duplex scans with page 2 flipped 180°

results in 9 scans, 9 different file sizes and 9 different word counts.

How come?

#7

If you scanned each page multiple times, the quality might vary slightly, and this could affect the OCR engine. But if you scanned each page once and duplicated the scans in the Finder, then the results should be identical.
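To see which of the two cases applies, you could check whether the files are identical byte for byte. Here is a minimal Python sketch. It only tests whether two files are the same on disk, which is not how DEVONthink itself detects duplicates, and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large scans are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True if both files have identical content on disk."""
    # Different sizes can never be identical, so skip hashing in that case.
    if Path(path_a).stat().st_size != Path(path_b).stat().st_size:
        return False
    return file_digest(path_a) == file_digest(path_b)
```

For example, `same_bytes("scan_a.pdf", "scan_b.pdf")` (placeholder file names) should be true for a scan and its Finder copy, and false for two separate scans of the same page.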