Is there a way to test for invalid item links within a document?

I just discovered by chance today that at least two of my receipt PDFs are no longer in my database. I examined my backups and see that they went missing sometime after March 19, 2022.

I don’t know how that happened, but more importantly I now wonder if I have deleted/lost any others.

I periodically (monthly, quarterly, etc.) generate a sheet in which each row contains the details for a specific receipt, including a DEVONthink item link, for all the receipts associated with that time period.

Normally, I can click on a link and immediately bring up that receipt in its own window. When I click on the links for those two receipts, nothing happens, as you would expect, because they are no longer in the database.

I was wondering if there is a way to check such links programmatically to determine if they are still valid. Specifically, I’m after a test that could be run against a document containing item links to see which are invalid, in the sense that there is no document at the end of the link.

That could then be used to generate a list of missing documents that I could (hopefully) retrieve manually from my backups.

Yes. You could use a script to iterate over the rows in your table, get the DT URL from each row, extract the UUID from it, and then try get record with uuid (or getRecordWithUuid in JavaScript). If that fails, you know that the record is gone. It won’t tell you when, why, or where the record went, though.
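
To make the steps above a bit more concrete, here is a minimal JavaScript sketch of the "extract the UUID, then look it up" part. It is only an illustration under a few assumptions: the names extractUuids and findMissing are made up for this example, the lookup callback is hypothetical and stands in for the DEVONthink call, and the regex assumes item links of the form x-devonthink-item:// followed by a standard 8-4-4-4-12 hexadecimal UUID.

```javascript
// Collect the unique UUIDs of all DEVONthink item links found in `text`.
// Assumption: links look like x-devonthink-item://<8-4-4-4-12 hex UUID>.
function extractUuids(text) {
  const pattern = /x-devonthink-item:\/\/([0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})/g;
  const uuids = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    // Keep each UUID once, in order of first appearance.
    if (!uuids.includes(match[1])) uuids.push(match[1]);
  }
  return uuids;
}

// Return the UUIDs for which `lookup` returns a falsy value (no record found).
// `lookup` is a hypothetical callback supplied by the caller.
function findMissing(text, lookup) {
  return extractUuids(text).filter((uuid) => !lookup(uuid));
}
```

In a JXA script, the callback could wrap the getRecordWithUuid call mentioned above, for example `(uuid) => app.getRecordWithUuid(uuid)`, assuming `app` is your DEVONthink application object and that the call returns a falsy value for a missing record. That last point is worth verifying in your own setup before relying on it.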

This script takes the selected records’ text (or source) and checks the links. If it finds an invalid link, it opens the record in a new window and (currently) stops, i.e. you need to run it again on all selected records. That’s not ideal, but it’s easy to change it to, e.g., set a red label or something similar instead.

Tested with Markdown, RTF records and sheets.

-- Find invalid item links

use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

tell application id "DNtp"
	try
		set theRecords to selected records
		if theRecords = {} then error "Nothing selected"
		
		repeat with thisRecord in theRecords
			
			set thisRecord_Type to (type of thisRecord) as string
			if thisRecord_Type is in {"markdown", "«constant ****mkdn»"} then
				set thisRecord_Text to plain text of thisRecord
			else
				set thisRecord_Text to source of thisRecord
			end if
			
			set theItemLinkResults to my regexFind(thisRecord_Text, "x-devonthink(-item|-smartgroup|-smartrule)?://\\w{8}-\\w{4}-\\w{4}-\\w{4}-\\w{12}")
			
			repeat with thisItemLink in theItemLinkResults
				set thisLinkedRecord to (get record with uuid thisItemLink)
				if thisLinkedRecord = missing value then
					open window for record thisRecord with force
					activate
					error "Found invalid link"
				end if
			end repeat
			
		end repeat
		
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
		return
	end try
end tell

on regexFind(theText, thePattern)
	try
		set theString to current application's NSString's stringWithString:theText
		set {theExpr, theError} to current application's NSRegularExpression's regularExpressionWithPattern:(thePattern) options:0 |error|:(reference)
		set theMatches to theExpr's matchesInString:theString options:0 range:{0, theString's |length|()}
		set theResults to {}
		repeat with thisMatch in theMatches
			set thisMatchRange to (thisMatch's rangeAtIndex:0)
			set thisMatchString to (theString's substringWithRange:thisMatchRange) as text
			set end of theResults to thisMatchString
		end repeat
		return theResults
	on error error_message number error_number
		activate
		display alert "Error: Handler \"regexFind\"" message error_message as warning
		error number -128
	end try
end regexFind

Edit: I overlooked that you’re using sheets and wrote the script with Markdown or RTF records in mind. It’s obviously not very useful when used with a sheet, but you could write the missing links into a new record instead of opening the record.
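
For the sheet case, a rough JavaScript sketch of the "write the missing links into a new record" idea could look like the following. It only builds the report text. Everything here is an assumption for illustration: the function name missingRowsReport is made up, the sheet is assumed to be available as tab-separated text with one receipt per line, only the first item link in each row is checked, and lookup is a hypothetical callback that returns a falsy value when no record exists for a UUID.

```javascript
// Build a plain-text report listing every sheet row whose item link no longer resolves.
// Assumptions: `sheetText` is tab-separated, one row per line, and
// `lookup(uuid)` is a caller-supplied check (hypothetical) that is falsy for missing records.
function missingRowsReport(sheetText, lookup) {
  // Non-global regex: matches only the first item link in a row.
  const linkPattern = /x-devonthink-item:\/\/([0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})/;
  const lines = [];
  sheetText.split(/\r?\n/).forEach((row, index) => {
    const match = linkPattern.exec(row);
    if (match && !lookup(match[1])) {
      // Keep the whole row so the receipt details help when searching the backups.
      lines.push("Row " + (index + 1) + ": " + row.replace(/\t/g, " | "));
    }
  });
  return lines.join("\n");
}
```

The returned text could then be used as the content of a new record, or simply pasted into a note. Keeping the full row rather than just the link means the date and other receipt details are at hand when looking through the backups.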

Absolutely invaluable: many thanks @pete31.

Stephen

Once again, I am lost for words with the strength of this community. I go to bed at 2 AM and when I wake up, less than 6 hours later, I have the AS and JS nugget for checking links, and even an AppleScript waiting for me!

Many thanks @chrillek and @pete31 for taking the time to help. I will check this out after I finish my coffee.

An upcoming release might support this, as it’s much faster to just check the already indexed item links. This would also support all kinds of documents, including item links in e.g. custom metadata.
