Input on script to delete duplicate records based on the same name

Hi,

Once again I’m hoping for some input on a script that I’ve been trying to run via a smart rule.

The script checks for records with the same name and is intended to delete all but one of the duplicates. Actually, these are not strictly speaking duplicates but for my intents and purposes I have no preference as to which one of the files with the same name remains after running the script.

The script does correctly delete duplicates but then stops processing through the records that are in the smart rule scope due to an error. There is a log entry:

Here’s the script:

on performSmartRule(theRecords)
	tell application id "DNtp"
		set recordNames to {}
		
		repeat with theRecord in theRecords
			
			set recordName to name without extension of theRecord
			
			if recordNames contains recordName then
				delete record theRecord
				log "deleted duplicate" & name without extension of theRecord
			else
				set recordNames to recordNames & recordName
			end if
			
		end repeat
		
		
	end tell
end performSmartRule

I have reproduced this after removing all special characters from file names, so it seems not to be related to incompatible characters.

Any input much appreciated.

Two thoughts

  • run the code in Script Editor to see where exactly it fails
  • use recordName in your log call
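For the first point, one way to exercise a smart rule handler from Script Editor is to pass it the current selection. This is only a sketch: it assumes the handler from the script above is pasted into the same Script Editor document, and it uses the application's `selection` property as test input. The handler really does delete records, so run it on a throwaway copy of the data.

```applescript
-- Test harness for Script Editor only; not needed inside the smart rule itself.
-- Assumes the performSmartRule handler is defined in this same document.
tell application id "DNtp"
	-- Use whatever is currently selected in the frontmost window as test input
	set testRecords to selection
end tell

-- Call the handler directly; Script Editor highlights the failing line on error
performSmartRule(testRecords)
```

Run this way, the error message and the offending line show up in Script Editor instead of being cut off in the log.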

Also, posting log messages as images makes reading them a lot harder than necessary, and this one seems to be incomplete, too. They’re just text, so copy/paste is more useful imo.

Thanks @chrillek, I’ve made that change. And I’ll keep your point regarding logs in mind in the future.

So it seems that the issue was due to running the script on Markdown items created by converting RTFD items exported from MarginNote. The images in those Markdown documents were encoded strangely for whatever reason. That was apparent from the start, but I didn’t think it would affect the script output.

If I first convert the RTFD items to plaintext in order to remove the images, and then convert the plain text items to Markdown, everything works fine.

This is more than good enough for my purposes, for now. In case it’s relevant, this is what the image encoding in the Markdown items looks like after the conversion:

![]( ...etc etc.)  

I doubt that. You were only ever using the name property, not looking at the content of the MD file. Also, what is that strange image encoding? The one you posted is ok in principle (data URL), it just increases file size tremendously.

Well, if this image encoding is removed there are no more issues with the script so it does seem to be at least related in some way…

What I’m doing is to use the function “Export to DevonThink” in MarginNote, which creates RTF files (RTFD if images are contained). And those files I converted to Markdown in DevonThink.

I’m thinking it’s simpler to sort your records by recordName. Then the comparison will just be on two adjacent recordNames.

You do realize you can have duplicate names with wildly different items in DEVONthink, yeah?

I do. However, in this case I have no incentive to keep more than one of these “quasi-duplicates”.

For context: MarginNote basically has its own replication mechanism, and when exporting map contents all instances are created as individual items. Each has its own x-callback link added to the top of the contents, so the notes don’t get recognized as duplicates based on the content hash.

Going via the record name seems like a sensible option to get rid of the “quasi-duplicates”. Record names correspond to the titles of map nodes in MarginNote, therefore they are the same for the replicated map nodes I exported to DevonThink.

Sounds sensible, will keep it in mind as an option for future use cases - thanks. For now, this works for what I need it to do.

Do you also realize the delete command doesn’t put the items in the database’s Trash?

I can’t find an export to DT in MarginNote on the iPad. Is that function limited to the Mac version?

Yes, it is. This is how it looks on macOS:

Yes, I’m intentionally deleting them straight away. Remember the RTF items I’m acting on with the script are just an export of MarginNote contents, so there’s always a backup.

These two lines (the delete record call and the log call right after it) probably caused the issue. First you delete the record, then you try to use a property of the now missing record.

Ah yes, that makes sense!

So could I just switch the order here?

That’s one option. Or use the value of recordName for logging.
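For reference, a sketch of the second option, assuming the rest of the script stays as posted. The name is read once while the record still exists, and that cached value is reused for the log line, so nothing touches the record after it has been deleted. Appending with `set end of` is a stylistic choice here, not part of the fix.

```applescript
on performSmartRule(theRecords)
	tell application id "DNtp"
		set recordNames to {}
		
		repeat with theRecord in theRecords
			-- Read the name once, while the record still exists
			set recordName to name without extension of theRecord
			
			if recordNames contains recordName then
				delete record theRecord
				-- Log the cached name; the record itself is gone at this point
				log "deleted duplicate " & recordName
			else
				-- First time this name is seen: remember it and keep the record
				set end of recordNames to recordName
			end if
		end repeat
	end tell
end performSmartRule
```

Switching the order of the two original lines would work as well; caching the name just avoids a second property lookup per deleted record.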
