OCR on PDF files attached to e-mails?

Thanks.

I have no experience at all with scripts. What steps should I follow to adjust the script so that it adds only PDF documents?

And would it be possible to make a map inside the database called “pdf attachments” and have the script put all PDFs in that map?

E.g. like this…

-- Import attachments of selected emails or formatted notes

tell application id "DNtp"
	set theSelection to the selection
	set tmpFolder to path to temporary items
	set tmpPath to POSIX path of tmpFolder
	
	repeat with theRecord in theSelection
		if (type of theRecord is unknown and path of theRecord ends with ".eml") or (type of theRecord is formatted note) then
			set theRTF to convert record theRecord to rich
			
			try
				if type of theRTF is rtfd then
					set thePath to path of theRTF
					set theGroup to parent 1 of theRecord
					
					tell application "Finder"
						set filelist to every file in ((POSIX file thePath) as alias)
						repeat with theFile in filelist
							set theAttachment to POSIX path of (theFile as string)
							
							if theAttachment ends with ".pdf" then
								-- Importing skips files inside the database package,
								-- therefore let's move them to a temporary folder first
								set theAttachment to move ((POSIX file theAttachment) as alias) to tmpFolder with replacing
								set theAttachment to POSIX path of (theAttachment as string)
								tell application id "DNtp" to import theAttachment to theGroup
							end if
						end repeat
					end tell
				end if
			end try
			
			delete record theRTF
		end if
	end repeat
end tell

A map? Do you mean a group?

Yes, sorry, a group.

I would like to call this group PDF BIJLAGEN.

You could replace this…

tell application id "DNtp" to import theAttachment to theGroup

…with…

tell application id "DNtp"
	set theGroup to create location "/PDF BIJLAGEN" in (database of theRecord)
	import theAttachment to theGroup
end tell

As I said, I have no experience with scripts…

So, like this?

-- Import attachments of selected emails or formatted notes

tell application id "DNtp"
	set theSelection to the selection
	set tmpFolder to path to temporary items
	set tmpPath to POSIX path of tmpFolder
	
	repeat with theRecord in theSelection
		if (type of theRecord is unknown and path of theRecord ends with ".eml") or (type of theRecord is formatted note) then
			set theRTF to convert record theRecord to rich
			
			try
				if type of theRTF is rtfd then
					set thePath to path of theRTF
					set theGroup to parent 1 of theRecord
					
					tell application "Finder"
						set filelist to every file in ((POSIX file thePath) as alias)
						repeat with theFile in filelist
							set theAttachment to POSIX path of (theFile as string)
							if theAttachment ends with ".pdf" then
								-- Importing skips files inside the database package,
								-- therefore let's move them to a temporary folder first
								set theAttachment to move ((POSIX file theAttachment) as alias) to tmpFolder with replacing
								set theAttachment to POSIX path of (theAttachment as string)
								set theGroup to create location "/PDF BIJLAGEN" in (database of theRecord)
								import theAttachment to theGroup
							end if
						end repeat
					end tell
				end if
			end try
			
			delete record theRTF
		end if
	end repeat
end tell

It doesn't seem to be working…

Where do I put this script, and how should I run it?

You didn’t insert the first/last line of the replacement snippet.

-- Import attachments of selected emails or formatted notes

tell application id "DNtp"
	set theSelection to the selection
	set tmpFolder to path to temporary items
	set tmpPath to POSIX path of tmpFolder
	
	repeat with theRecord in theSelection
		if (type of theRecord is unknown and path of theRecord ends with ".eml") or (type of theRecord is formatted note) then
			set theRTF to convert record theRecord to rich
			
			try
				if type of theRTF is rtfd then
					set thePath to path of theRTF
					set theGroup to parent 1 of theRecord
					
					tell application "Finder"
						set filelist to every file in ((POSIX file thePath) as alias)
						repeat with theFile in filelist
							set theAttachment to POSIX path of (theFile as string)
							if theAttachment ends with ".pdf" then
								-- Importing skips files inside the database package,
								-- therefore let's move them to a temporary folder first
								set theAttachment to move ((POSIX file theAttachment) as alias) to tmpFolder with replacing
								set theAttachment to POSIX path of (theAttachment as string)
								tell application id "DNtp"
									set theGroup to create location "/PDF BIJLAGEN" in (database of theRecord)
									import theAttachment to theGroup
								end tell
							end if
						end repeat
					end tell
				end if
			end try
			
			delete record theRTF
		end if
	end repeat
end tell

I have created a test database, TEST MAIL, with a group PDF BIJLAGEN.

I have no idea how to run the script now (and is the script above correct now?).

Select a bunch of emails, then execute the script (e.g. first in the Script Editor.app).

Nothing happens… What am I doing wrong?

Which edition do you use? Anything logged to Windows > Log?
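One possible reason for an empty log is that the script's `try … end try` block discards any error silently. The following is only a sketch of how an error could be made visible, assuming DEVONthink's `log message` command and its `info` parameter behave as described in the scripting dictionary. The label "Import PDF attachments" is an arbitrary name chosen for illustration, and the attachment handling is left out.

```applescript
tell application id "DNtp"
	set theSelection to the selection
	repeat with theRecord in theSelection
		try
			-- Same conversion step as in the script above
			set theRTF to convert record theRecord to rich
			-- ... attachment handling would go here ...
			delete record theRTF
		on error errMsg number errNum
			-- Write the error to the Log window instead of discarding it
			log message "Import PDF attachments" info (errMsg & " (" & errNum & ")")
		end try
	end repeat
end tell
```

If the script fails for a selected record, the error text and number should then appear in the log rather than the script ending without any feedback.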

The latest version, 3.5.1.

Windows > Log:

When I run the script from the Script Editor app:

Does it work after enabling full disk access in system preferences?

I have now enabled full disk access, but it still doesn't work.
The next thing I tried was running the script directly from the Script Editor, and now it seems to work.

  • The log does not show any activity.
  • The script only works when run from the Script Editor.
  • If I select the e-mails and run the script 2 or 3 times, it does not skip the PDFs that are already in the group PDF BIJLAGEN.
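The script as posted imports every PDF on every run, so repeated runs create duplicates. Below is a minimal sketch of one way to skip PDFs that are already in the group, not a tested solution. The handler name `importIfMissing` and its parameters are hypothetical, and it assumes that an imported record is named after its file, with or without the `.pdf` extension.

```applescript
-- Hypothetical helper: import a PDF into /PDF BIJLAGEN only if no record
-- with the same name is already there.
-- theAttachment: POSIX path of the PDF (text); theDatabase: a DEVONthink database.
on importIfMissing(theAttachment, theDatabase)
	-- Derive the file name from the path
	set oldDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "/"
	set theFileName to last text item of theAttachment
	set AppleScript's text item delimiters to oldDelimiters
	
	-- Also compute the name without the ".pdf" extension
	set theBaseName to theFileName
	if (length of theFileName) > 4 then set theBaseName to text 1 thru -5 of theFileName
	
	tell application id "DNtp"
		set theGroup to create location "/PDF BIJLAGEN" in theDatabase
		-- Assumption: the imported record is named after the file
		set theMatches to (children of theGroup whose name is theFileName or name is theBaseName)
		if (count of theMatches) is 0 then
			import theAttachment to theGroup
		end if
	end tell
end importIfMissing
```

To use it, the import line in the script would become `my importIfMissing(theAttachment, theDatabase)`, with `theDatabase` set to `database of theRecord` before the Finder block. The check compares names only, so two different PDFs with the same file name would be treated as the same document.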

Which version of DEVONthink and of macOS do you use and what’s selected? Over here (DEVONthink 3.5.1, macOS 10.15.5, 1 email selected containing a PDF) it works in both cases.

I also have:
DEVONthink 3.5.1
macOS 10.15.1

I selected 3 e-mails, and it only works when run from the Script Editor.

Should I reboot the Mac after granting full disk access?

Updating to 10.15.5 might be a good idea, 10.15.1 is quite old and buggy.

Sorry, I made a typo: I also have 10.15.5.

Full disk access is definitely required but restarting the app should be sufficient.

I have rebooted but no luck.