Help with importing emails with attached .pdf files to DTPO

I use DEVONthink as my main place to store information. Every day I do one of three things with my emails:

  • Keep emails that need attention in my Mail.app
  • Move them over to my DT database
  • Delete them

I have found two sets of scripts: the “Add to DEVONthink Pro Office” command in Mail.app’s Message menu, and a set of scripts in the menu bar in Finder, which ends with four mail scripts.

What I need help with is:

When I receive an email with an attached .pdf file, I want to store the text of the email, and I also want DEVONthink to save the .pdf file and convert it to a searchable .pdf.

Previously I could use the script “Add messages and attachments to DEVONthink”. It created a new group containing two items: the email itself and the .pdf on its own.

That script no longer works, so I use “Add to DEVONthink Pro Office” from the Message menu in Mail instead, which imports the email with the .pdf embedded in it.

However, I then cannot view the .pdf on my iPad or iPhone, and I do not know how to OCR it either.

I am sure this will be fixed in the new version of DTTG, but I would still like to be able to convert .pdf files that were imported inside an email.

I hope to learn how to write my own scripts this summer, but until then: any tips?

These are actually commands added by the plugin for Apple Mail.

Most PDF documents are already searchable; usually only scanned documents need OCR.

Anyway, here’s a simple script to import the attachments of already imported and selected emails:


tell application id "DNtp"
	set theSelection to the selection
	set tmpFolder to path to temporary items
	set tmpPath to POSIX path of tmpFolder
	
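	repeat with theRecord in theSelection
		-- Only handle imported emails (records of type unknown whose file is an .eml)
		if type of theRecord is unknown and path of theRecord ends with ".eml" then
			-- Convert the email to rich text; if the result is an RTFD package,
			-- the attachments are files inside that package
			set theRTF to convert record theRecord to rich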
			
			try
				if type of theRTF is rtfd then
					set thePath to path of theRTF
					set theGroup to parent 1 of theRecord
					
					tell application "Finder"
						set filelist to every file in ((POSIX file thePath) as alias)
						repeat with theFile in filelist
							set theAttachment to POSIX path of (theFile as string)
							
							if theAttachment does not end with ".rtf" then
								-- Importing skips files inside the database package,
								-- therefore let's move them to a temporary folder first
								set theAttachment to move ((POSIX file theAttachment) as alias) to tmpFolder with replacing
								set theAttachment to POSIX path of (theAttachment as string)
								tell application id "DNtp" to import theAttachment to theGroup
							end if
						end repeat
					end tell
				end if
			on error msg
				display dialog msg
			end try
			
			delete record theRTF
		end if
	end repeat
end tell

Thank you! Danke! Díky!

Dear Christian,

thank you for your great script. This is something I was really searching for, since Yosemite gives me errors whenever I try to import a message and/or an attachment with the built-in services. With your script I can synchronize my eMail accounts and store the (readable) attachments in the same place (and create replicants wherever I need them in my database).

Could you give me some hints on how to change your script to do the following?
I would like each newly created attachment to have these attributes:

  1. the same creation-date as the original eMail
  2. the same “from”, “to”, “URL” as the original eMail
  3. filename “‘Subject of the eMail’ - Attachment No. # - ‘Original name of the attachment’”, where # is the running number of the attachment within the eMail and ‘Subject of the eMail’ is, you guessed it, the subject of the eMail. ‘Original name of the attachment’ is the attachment’s filename, which already works.

Thank you in advance.

lxc

By the way, I used to import messages and attachments with the following script, which is one of yours that I modified. It no longer works in Yosemite, so I think your script acting within DEVONthink itself might be the better solution. Unfortunately I can’t figure out how to get a version of this one working “within DEVONthink”. Maybe with a little help from you?


-- Import selected Mail messages & attachments to DEVONthink Pro.
-- Created by Christian Grunenberg on Mon Mar 05 2012.
-- Copyright (c) 2012-2014. All rights reserved.
-- edited by lx, 2014 (Thanks to Christian Grunenberg)

-- this string is used when the message subject is empty
property pNoSubjectString : "(no subject)"

tell application "Mail"
	try
		tell application id "DNtp"
			if not (exists current database) then error "No database is in use."
		end tell
		set theSelection to the selection
		set theFolder to (POSIX path of (path to temporary items))
		if the length of theSelection is less than 1 then error "One or more messages must be selected."
		repeat with theMessage in theSelection
			my importMessage(theMessage, theFolder)
		end repeat
	on error error_message number error_number
		if error_number is not -128 then display alert "Mail" message error_message as warning
	end try
end tell

on importMessage(theMessage, theFolder)
	tell application "Mail"
		try
			tell theMessage
				set {theDateReceived, theDateSent, theSender, theSubject, theSource, theReadFlag} to {the date received, the date sent, the sender, the subject, the source, the read status}
				set {prevDelims, AppleScript's text item delimiters} to {AppleScript's text item delimiters, " "}
				set theSender to (extract address from sender of theMessage) -- use this to get the sender's email address in the subject
				--				set theSender to (extract name from sender of theMessage) -- use this to get the sender's name in the subject
				set theRecipient to address of to recipients of the theMessage -- use this to get the recipient's email address in the subject
				--				set theRecipient to name of to recipients of the theMessage -- use this to get the recipient's name in the subject
				-- the following was added due to possible problems with "/" in the filename (since it may be part of the subject) of the attachment
				set AppleScript's text item delimiters to "/"
				set ti to every text item of theSubject
				set AppleScript's text item delimiters to "_"
				set theSubject to ti as string
				-- end of problemsolving
				set AppleScript's text item delimiters to prevDelims
				set laufendeNummer to 0
			end tell
			if theSubject is equal to "" then set theSubject to pNoSubjectString
			set theAttachmentCount to count of mail attachments of theMessage
			tell application id "DNtp"
				set theGroup to incoming group
				--				if theAttachmentCount is greater than 0 then set theGroup to create record with {name:theSubject, type:group} in theGroup
				create record with {name:theSender & " > " & theRecipient & " - " & theSubject & ".eml", type:unknown, creation date:theDateSent, modification date:theDateReceived, URL:theSender, source:(theSource as string), unread:(not theReadFlag)} in theGroup
			end tell
			repeat with theAttachment in mail attachments of theMessage
				set laufendeNummer to laufendeNummer + 1
				set theFile to theFolder & theSender & " > " & theRecipient & " - " & theSubject & " - Anhang Nr. " & laufendeNummer & " - " & (name of theAttachment)
				tell theAttachment to save in theFile
				tell application id "DNtp"
					set theAttachmentRecord to import theFile to theGroup
					set unread of theAttachmentRecord to (not theReadFlag)
					set URL of theAttachmentRecord to theSender
					set creation date of theAttachmentRecord to theDateSent
					set modification date of theAttachmentRecord to theDateSent
					--					set modification date of theAttachmentRecord to theDateReceived
					--					set source to (theSource as string)
				end tell
			end repeat
		on error error_message number error_number
			if error_number is not -128 then display alert "Mail" message error_message as warning
		end try
	end tell
end importMessage

Your script is working over here on Yosemite 10.10.3. Messages are imported. What “does not work” on your computer?

I’m not sure what this means. However, your script is meant to be run from Mail, with a selection of messages; it is not meant to be run from DEVONthink.

Dear Korm,

thank you for your answer.

Almost every time, I get the following error:

Mail

Mail got an error: AppleEvent handler failed.

Because the error appears intermittently rather than on every run (I am running Yosemite 10.10.3), my script was not reliable.

I know I have to run the script above from within Mail. But since I had to do this manually, it was far from ideal.

I always wanted to be able to import a mailbox automatically, but I was not sure how to get the attachments out of the eMails in an indexable way.

I have now written a script, based on cgrunenberg’s script in post no. 2, which does the following:

  1. It takes all the eMails in the current selection in DEVONthink
  2. It extracts all attachments into the same group/folder in DEVONthink
  3. It names each attachment “Subject” - Att. #x - “Filename of the Attachment”,
    where x counts up from 1 to y
  4. It sets the modification date of the attachment to the modification date of the original eMail
  5. It sets the creation date of the attachment to the creation date of the original eMail
  6. It sets the “URL” of the attachment to a reference to the original eMail (so a single click always shows me where the attachment came from)
  7. Last but not least: it converts non-searchable PDF files to searchable PDF files!

This way the attachments and the eMails can be sorted by name or date, and each attachment ends up next to its original eMail.

AND the attachments are indexed!

Here is my code.

Thanks again,

lxc

P.S.: My script replaces every “/” in a record name with “_”, since “/” caused serious problems with the POSIX path.

tell application id "DNtp"
	set theSelection to the selection
	if theSelection is {} then error "Please select some contents."
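	-- Replace "/" in record names with "_", because "/" breaks POSIX paths
	set prevDelims to AppleScript's text item delimiters
	repeat with everyFile in theSelection
		set theCurrentName to the name of everyFile
		set AppleScript's text item delimiters to "/"
		set text_item to every text item of theCurrentName
		set AppleScript's text item delimiters to "_"
		set theCurrentName to text_item as string
		set the name of everyFile to theCurrentName
	end repeat
	-- Restore the delimiters so later string operations are not affected
	set AppleScript's text item delimiters to prevDelims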
	
	set tmpFolder to path to temporary items
	set tmpPath to POSIX path of tmpFolder
	
	
	repeat with theRecord in theSelection
		--
		set theCreationDate to creation date of theRecord
		set theModificationDate to modification date of theRecord
		set theURL to reference URL of theRecord
		set runningNumber to 0
		--
		if type of theRecord is unknown and path of theRecord ends with ".eml" then
			set theRTF to convert record theRecord to rich
			try
				if type of theRTF is rtfd then
					set thePath to path of theRTF
					set theGroup to parent 1 of theRecord
					
					tell application "Finder"
						set filelist to every file in ((POSIX file thePath) as alias)
						repeat with theFile in filelist
							set theAttachment to POSIX path of (theFile as string)
							if theAttachment does not end with ".rtf" then
								-- Count only real attachments, not the .rtf body inside the package
								set runningNumber to runningNumber + 1
								-- Importing skips files inside the database package,
								-- therefore let's move them to a temporary folder first
								set theAttachment to move ((POSIX file theAttachment) as alias) to tmpFolder with replacing
								set theAttachment to POSIX path of (theAttachment as string)
								--								
								tell application id "DNtp"
									-- import the File into the database, in the group the original eMail is in
									set importedFile to import theAttachment to theGroup
									-- count the words of the file (pdf)
									set theWordCount to word count of importedFile
									if theWordCount is 0 then
										-- images have a word count of 0, so we have to check if it's a pdf
										if theAttachment ends with ".pdf" then
											try
												-- timeout of 2 hours since there might be large files to be converted
												with timeout of 7200 seconds
													-- we set a new name for the converted file
													-- and let DEVONthink convert the file
													set convertedFile to convert image record importedFile to theGroup
													-- We want creation date, modification date to be the same as the original eMail
													set creation date of convertedFile to theCreationDate
													set modification date of convertedFile to theModificationDate
													-- we set a reference to the original eMail
													set URL of convertedFile to theURL
													-- Naming-scheme of the new file: 'Subject' - 'attachment #' - 'Original name of the attachment'
													set name of convertedFile to (name of theRecord) & " - Anh. #" & runningNumber & " - " & (name of importedFile)
													-- we delete the original, not OCRed pdf
													delete record importedFile
												end timeout
											end try
										else
											-- we rename and modify dates of files with word count of 0 that are not pdfs
											set creation date of importedFile to theCreationDate
											set modification date of importedFile to theModificationDate
											-- we set a reference to the original eMail
											set URL of importedFile to theURL
											-- Naming-scheme of the new file: 'Subject' - 'attachment #' - 'Original name of the attachment'
											set name of importedFile to (name of theRecord) & " - Anh. #" & runningNumber & " - " & (name of importedFile)
										end if
									else
										-- we rename and modify dates of files with word count bigger than 0
										set creation date of importedFile to theCreationDate
										set modification date of importedFile to theModificationDate
										set URL of importedFile to theURL
										set name of importedFile to (name of theRecord) & " - Anh. #" & runningNumber & " - " & (name of importedFile)
									end if
									if theAttachment contains "single card.png" then
										delete record importedFile
									end if
									
								end tell
								--
								
							end if
						end repeat
					end tell
				end if
			on error msg
				display dialog msg
			end try
			
			delete record theRTF
		end if
	end repeat
end tell
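
A side note on the script above: the “/” replacement and the naming scheme can be pulled out into two small handlers that need neither Mail nor DEVONthink, so they can be tried on their own in Script Editor. This is only a minimal sketch. The handler names `sanitizeName` and `attachmentName` are made up for illustration, and the " - Att. #" separator follows the naming scheme described in the list, not the " - Anh. #" string the script itself uses.

```applescript
-- Hypothetical helper: replace every "/" in a name with "_"
-- and put the text item delimiters back afterwards.
on sanitizeName(theName)
	set prevDelims to AppleScript's text item delimiters
	try
		set AppleScript's text item delimiters to "/"
		set theParts to every text item of theName
		set AppleScript's text item delimiters to "_"
		set cleanName to theParts as text
	on error errMsg number errNum
		-- Restore the delimiters even if something goes wrong
		set AppleScript's text item delimiters to prevDelims
		error errMsg number errNum
	end try
	set AppleScript's text item delimiters to prevDelims
	return cleanName
end sanitizeName

-- Hypothetical helper: build 'Subject' - Att. #n - 'Original name'
on attachmentName(theSubject, runningNumber, originalName)
	return sanitizeName(theSubject) & " - Att. #" & runningNumber & " - " & sanitizeName(originalName)
end attachmentName

attachmentName("Invoice 05/2015", 1, "scan.pdf")
-- result: "Invoice 05_2015 - Att. #1 - scan.pdf"
```

Inside a `tell application` block the handlers would have to be called with `my`, for example `my attachmentName(name of theRecord, runningNumber, name of importedFile)`.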