Adding and verifying checksums to detect inapparent data loss

Checksumming is not without pitfalls. I only run it on locked files, which makes it less likely that files I change routinely are constantly flagged. I also exclude files with a nocheck tag; this lets me exempt files that I change only occasionally but that my locking rule would nonetheless lock. How useful a checksum routine is probably depends on the kind of data a user produces and how they work with it.

I’ve updated my script a little along the way. Here is the current version, which includes some improvements and skips rtfd files: they are file bundles rather than single files, so the method I use for all other files can’t create a checksum for them:

property pTag : "Checksum"

on performSmartRule(theRecords)
	tell application id "DNtp"
		try
			set theCount to 0
			show progress indicator "Processing Checksums" cancel button 1 steps count of theRecords
			repeat with theRecord in theRecords
				if type of theRecord is not rtfd then
					if cancelled progress then error number -128
					step progress indicator (name of theRecord) as string
					-- if available get saved Checksum from record
					set md to custom meta data of theRecord
					try
						set OldCheck to mdsha1 of md
					on error
						set OldCheck to ""
					end try
					-- get the current Checksum of the record
					set thePath to path of theRecord as string
					set CheckSum to do shell script "/usr/bin/openssl sha1 " & quoted form of thePath
					-- a SHA-1 digest is 40 hex characters at the end of the output; taking the tail also works if the path itself contains "= "
					set CheckSum to text -40 thru -1 of CheckSum
					-- set the Checksum if none previously set - otherwise compare previous and current, warn and add "Checksum" tag if discrepancy
					if OldCheck is equal to "" then
						add custom meta data CheckSum for "SHA1" to theRecord
					else if OldCheck is not equal to CheckSum then
						set theDialog to "Record " & name of theRecord & "
Has Changed! Reset Checksum?"
						set AskUser to display alert "Checksum Error!" message theDialog as critical buttons {"Fail", "Reset"} default button "Fail" giving up after 30
						if button returned of AskUser is "Reset" then
							add custom meta data CheckSum for "SHA1" to theRecord
							-- tags routine adapted from suavito, posted May 2020 https://discourse.devontechnologies.com/t/applescript-to-delete-tags/9583/17
							-- remove "Checksum" tag if Checksum is reset
							set theNewList to {}
							set theList to tags of theRecord
							repeat with n from 1 to count of theList
								set theNewItem to item n of theList
								if theNewItem is not pTag then set theNewList to theNewList & theNewItem
							end repeat
							set tags of theRecord to theNewList
						else
							set theCount to theCount + 1
							set tags of theRecord to tags of theRecord & pTag
						end if
					end if
				end if
			end repeat
			display notification (theCount as string) & " Records Failed" with title "Processed Checksums"
			if theCount > 0 then log message "Checksum Verification" info "Found " & theCount & " Errors!"
		on error error_message number error_number
			hide progress indicator
			if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
		end try
		hide progress indicator
	end tell
end performSmartRule

It is run by the following smart rule:

As originally posted, it requires a custom single-line text metadata field called “SHA1” with identifier “sha1”.
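
For anyone who wants to see outside DEVONthink what the checksum step works with, here is a rough shell sketch of the same idea. openssl prints a line like "SHA1(/path/to/file)= <digest>" and only the digest at the end is kept, then compared with the stored value. This assumes openssl is on the PATH; the function names sha1_of and check_file and the temporary file are made up for illustration and are not part of the script above.

```shell
#!/bin/sh
# sha1_of FILE: print only the 40-character hex digest of FILE.
# openssl prints a line such as "SHA1(/path/to/file)= <digest>";
# "${out##*= }" drops everything up to and including the last "= ".
sha1_of() {
    out=$(openssl sha1 "$1") || return 1
    printf '%s\n' "${out##*= }"
}

# check_file FILE STORED: mirror the three outcomes of the smart rule.
# Prints "new" (nothing stored yet), "ok" (match) or "changed" (mismatch).
check_file() {
    current=$(sha1_of "$1") || return 1
    if [ -z "$2" ]; then
        echo "new"
    elif [ "$2" = "$current" ]; then
        echo "ok"
    else
        echo "changed"
    fi
}

# Demo on a throwaway file (hypothetical data, not a DEVONthink record).
tmp=$(mktemp)
printf 'hello\n' > "$tmp"
stored=$(sha1_of "$tmp")
check_file "$tmp" ""          # nothing stored yet
check_file "$tmp" "$stored"   # file unchanged since the digest was stored
printf 'bit rot\n' >> "$tmp"
check_file "$tmp" "$stored"   # content differs from the stored digest
rm -f "$tmp"
```

In the smart rule, the "new" case corresponds to writing the SHA1 metadata field for the first time, and the "changed" case to the alert that offers Fail or Reset.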