Date Highlighting in PDFs

You can use the toolbar search; DEVONthink automatically jumps to the first match in each document.

Example queries:

[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9][0-9]

→ “10.10.2020”

[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]

→ “2020-10-10”

You can chain queries with a |:

[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9][0-9]|[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]

→ “10.10.2020” and “2020-10-10”
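To check what these patterns would catch before running them on a whole database, here's a rough Python sketch. It assumes the queries behave like ordinary regular expressions, which I haven't verified for DEVONthink's own search. In Python's `re` an unescaped `.` matches any character, so the sketch includes a strict variant with the dots escaped; the names `LOOSE`, `STRICT` and `find_dates` are made up for this example.

```python
import re

# The chained query from the post, taken as-is.
# In Python's re the unescaped "." matches ANY character, not just a dot.
LOOSE = re.compile(
    r"[0-9][0-9].[0-9][0-9].[0-9][0-9][0-9][0-9]"
    r"|[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"
)

# Assumption: a stricter variant where "\." only matches a literal dot.
STRICT = re.compile(
    r"[0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9]"
    r"|[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"
)

def find_dates(text, pattern=STRICT):
    """Return every substring of text that matches the date pattern."""
    return pattern.findall(text)
```

With the strict pattern, `find_dates("Invoice 10.10.2020, due 2020-10-10")` returns both dates, while something like `10/10/2020` is only picked up by the loose one.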

Since DEVONthink 3.6 it’s also possible to do this kind of search in Inspector > Search with the option Enable Wildcards and Operators.

If you add your keywords to the query they’ll be highlighted.

In these cases I use an AppleScript to change the creation date to the selected text. It creates a temp record in order to let DEVONthink parse the date for me. However, sometimes this won’t work due to bad document quality. When it fails, copy the selected text and paste it somewhere. You’ll see that there’s actually no date in the OCR layer where you see one in the document.

-- Set creation date to selected text (via temp record)

tell application id "DNtp"
	try
		set theRecords to selected records
		if theRecords = {} then error "Nothing selected"
		if (count theRecords) > 1 then error "Please select a date in a single record"
		set theRecord to item 1 of theRecords
		
		try
			set selectedText to selected text of window 1 & "" as string
		on error
			error "No text selected"
		end try
		
		set theTempRecord to create record with {name:"Temp - " & selectedText, type:text, plain text:selectedText} in incoming group
		set theDates to all document dates of theTempRecord
		-- Delete the temp record right away so it isn't left behind when no date is found
		delete record theTempRecord
		
		if theDates ≠ {} then
			set theDate to item 1 of theDates
			set creation date of theRecord to theDate
			display notification "Creation date via temp record"
		else
			error "No date found"
		end if
		
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
		return
	end try
end tell

It’s a good idea to also manually scan the text as DEVONthink can only find what’s in the OCR layer.
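To illustrate why a date you can clearly see in the scan may still not be found, here's a small Python sketch. It is not what DEVONthink does internally, just a stand-in parser that assumes the two formats from the example queries, with `10.10.2020` read as day.month.year. The names `FORMATS` and `parse_selected_date` are made up for this example.

```python
from datetime import datetime

# Assumption: only the two formats from the example queries,
# with "10.10.2020" read as day.month.year.
FORMATS = ("%d.%m.%Y", "%Y-%m-%d")

def parse_selected_date(selected_text):
    """Return a datetime if the text is a date in a known format, else None."""
    cleaned = selected_text.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            # Not this format, try the next one.
            continue
    return None
```

`parse_selected_date("10.10.2020")` gives a date, but `parse_selected_date("1O.1O.2020")` (letter O instead of zero, a typical OCR slip) gives `None`, even though both look the same on the page.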

That’s tedious, I know. However, you just have to verify the date. When I first posted in this forum, there was no automatic document date extraction built in.

Here’s my first post

When I finally had cobbled together my own date extraction script (after years …) DEVONthink introduced the document date feature. Countless hours wasted …

However, that’s true :smile:
