How to search email attachments?

I’ve successfully imported all my email and associated attachments from Apple Mail into DEVONthink. If an email has attachments, it gets its own group, which contains all its attachments as separate files.

What I’d now like to do is implement a quick search that does the following: find all JPG attachments to emails from John Doe in the last year.

Would an AppleScript be needed to implement this? If so, what might it look like?

Thanks!

Email attachments are not yet indexed/searchable. Therefore the only possibility is to add the attachments as separate items to the database (e.g. via the script “Add message(s) & attachments to DEVONthink”).

Yes, I’ve done that and the attachments are separate items. Correct me if I’m wrong, but the only thing that links an attachment item to its associated email is a common group.

I’m still uncertain how to implement the following query.
What are all the jpg attachments to emails from John Doe?

Thanks for the help!

That’s right.

This is not yet possible. But you could use this script instead:


-- this string is used when the message subject is empty
property pNoSubjectString : "(no subject)"

tell application "Mail"
	try
		tell application id "com.devon-technologies.thinkpro2"
			if not (exists current database) then error "No database is in use."
		end tell
		set theSelection to the selection
		set theFolder to (POSIX path of (path to temporary items))
		if the length of theSelection is less than 1 then error "One or more messages must be selected."
		repeat with theMessage in theSelection
			my importMessage(theMessage, theFolder)
		end repeat
	on error error_message number error_number
		if error_number is not -128 then display alert "Mail" message error_message as warning
	end try
end tell

on importMessage(theMessage, theFolder)
	tell application "Mail"
		try
			tell theMessage
				set {theDateReceived, theDateSent, theSender, theSubject, theSource, theReadFlag} to {the date received, the date sent, the sender, subject, the source, the read status}
			end tell
			if theSubject is equal to "" then set theSubject to pNoSubjectString
			set theAttachmentCount to count of mail attachments of theMessage
			tell application id "com.devon-technologies.thinkpro2"
				set theGroup to incoming group
				if theAttachmentCount is greater than 0 then set theGroup to create record with {name:theSubject, type:group} in theGroup
				set theRecord to create record with {name:theSubject & ".eml", type:unknown, creation date:theDateSent, modification date:theDateReceived, URL:theSender, source:(theSource as string)} in theGroup
				set unread of theRecord to (not theReadFlag)
			end tell
			repeat with theAttachment in mail attachments of theMessage
				set theFile to theFolder & (name of theAttachment)
				tell theAttachment to save in theFile
				tell application id "com.devon-technologies.thinkpro2"
					set theAttachmentRecord to import theFile to theGroup
					set unread of theAttachmentRecord to (not theReadFlag)
					set URL of theAttachmentRecord to theSender
				end tell
			end repeat
		on error error_message number error_number
			if error_number is not -128 then display alert "Mail" message error_message as warning
		end try
	end tell
end importMessage

This script saves the sender’s address in the URL of the attachments. Afterwards you could create a smart group like…

URL matches name
Kind is not email

…and this should return all attachments from the desired person.
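If a script is preferred over a smart group, the same lookup might be sketched roughly as follows. This is only a sketch under assumptions: the address in pSenderAddress is hypothetical, the attachments must have been imported with a script that stores the sender in their URL, and the file path is assumed to end in .jpg or .jpeg. It does not filter by date, because the import script above sets the email’s dates only on the .eml record, not on the attachments.

```applescript
-- Sketch: collect JPEG attachments whose URL contains the sender's address.
-- Assumes the attachments were imported with a script that stores the
-- sender in each attachment's URL.
property pSenderAddress : "john.doe@example.com" -- hypothetical address

tell application id "com.devon-technologies.thinkpro2"
	if not (exists current database) then error "No database is in use."
	-- records whose URL mentions the sender
	set theCandidates to (contents of current database whose URL contains pSenderAddress)
	set theMatches to {}
	repeat with i from 1 to (count of theCandidates)
		set theCandidate to item i of theCandidates
		-- check the file path, since the record name may be shown without its extension
		set thePath to path of theCandidate
		if thePath ends with ".jpg" or thePath ends with ".jpeg" then set end of theMatches to theCandidate
	end repeat
	display dialog ((count of theMatches) as string) & " JPEG attachment(s) found for " & pSenderAddress
end tell
```

The list in theMatches could then be replicated to a group or revealed one by one, depending on what the search is for.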

Hi Christian,
thanks for the script.

I tried it out but found some issues:

  1. The URL of the mail (.eml record) should contain the sender of the mail. This property is correctly retrieved from the message (checked in the Events/Replies pane of AppleScript Editor), but DT seems to get confused when creating the record. The URL contains a string built up of: mail-to + subject + in-reply-to + URL of sender website.
    Note that mail-to is different from the theSender string (the former being a “pure” address, the latter being a description + the address).

  2. I tried storing inside the URL of each attachment a DT link to the mail (instead of the sender as per your script).
    I added the instruction


set MessageDTPOLink to ("x-devonthink-item://" & uuid of theRecord) as string
 

after the .eml record is created, but again I get a strange string (it doesn’t even contain the “x-devonthink-item://” part) composed of a number + URL of sender website. This time it seems (checking in the Events/Replies pane of AppleScript Editor) that I’m not even able to get the right uuid of the .eml record.

Please note that issue #1 is also present when I run the script “Add message(s) and Attachments to DEVONthink” that comes with DTPO.

I’m running the latest version of DTPO on 10.7.5.
Am I the only one having problems?
If not, any suggestions to correct the script?

That’s actually a feature of DEVONthink Pro Office, which uses such URLs for all archived emails so that you can easily reply to them.

It’s recommended to use the “reference URL” property instead of creating the link on your own, e.g. to ensure that invalid characters are replaced with percent escapes.
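To illustrate the “reference URL” suggestion, here is a minimal sketch of a helper that stores the email’s item link in an attachment’s URL field. The handler name linkAttachmentToEmail is made up for this example; “reference URL” and “URL” are the record properties mentioned above.

```applescript
-- Hypothetical helper: store a link back to the email in an attachment's URL field.
on linkAttachmentToEmail(theAttachmentRecord, theEmailRecord)
	tell application id "com.devon-technologies.thinkpro2"
		-- "reference URL" is the item link (x-devonthink-item://...) generated by DEVONthink,
		-- so there is no need to assemble it from the uuid by hand
		set URL of theAttachmentRecord to (reference URL of theEmailRecord)
	end tell
end linkAttachmentToEmail
```

In the import script above it could be called as “my linkAttachmentToEmail(theAttachmentRecord, theRecord)” in place of “set URL of theAttachmentRecord to theSender”. Note that the sender is then no longer stored in the attachment’s URL, so a smart group that matches on the URL would not find the attachment anymore.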

Everything’s clear now!

Thanks again for your support.

Here’s something else I found very useful. It’s a slight modification to the standard mail import script that tags the email as “withAttach” and each attachment as “isAttach”. This way, for example, I can search for only emails with attachments.



-- Import selected Mail messages & attachments to DEVONthink Pro.
-- Created by Christian Grunenberg on Mon Mar 05 2012.
-- Copyright (c) 2012-2014. All rights reserved.

-- this string is used when the message subject is empty
property pNoSubjectString : "(no subject)"

tell application "Mail"
	try
		tell application id "DNtp"
			if not (exists current database) then error "No database is in use."
		end tell
		set theSelection to the selection
		set theFolder to (POSIX path of (path to temporary items))
		if the length of theSelection is less than 1 then error "One or more messages must be selected."
		repeat with theMessage in theSelection
			my importMessage(theMessage, theFolder)
		end repeat
	on error error_message number error_number
		if error_number is not -128 then display alert "Mail" message error_message as warning
	end try
end tell

on importMessage(theMessage, theFolder)
	tell application "Mail"
		try
			tell theMessage
				set {theDateReceived, theDateSent, theSender, theSubject, theSource, theReadFlag} to {the date received, the date sent, the sender, subject, the source, the read status}
			end tell
			if theSubject is equal to "" then set theSubject to pNoSubjectString
			set theAttachmentCount to count of mail attachments of theMessage
			tell application id "DNtp"
				set theGroup to incoming group
				if theAttachmentCount is greater than 0 then set theGroup to create record with {name:theSubject, type:group} in theGroup
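				set theRecord to create record with {name:theSubject & ".eml", type:unknown, creation date:theDateSent, modification date:theDateReceived, URL:theSender, source:(theSource as string), unread:(not theReadFlag)} in theGroup
				-- tag the email only when it actually has attachments
				if theAttachmentCount is greater than 0 then set tags of theRecord to "withAttach"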
			end tell
			repeat with theAttachment in mail attachments of theMessage
				set theFile to theFolder & (name of theAttachment)
				tell theAttachment to save in theFile
				tell application id "DNtp"
					set theAttachmentRecord to import theFile to theGroup
					set unread of theAttachmentRecord to (not theReadFlag)
					set URL of theAttachmentRecord to theSender
					set tags of theAttachmentRecord to "isAttach"
				end tell
			end repeat
		on error error_message number error_number
			if error_number is not -128 then display alert "Mail" message error_message as warning
		end try
	end tell
end importMessage



Thanks for sharing. :smiley:

Back in 2013 cgrunenberg ‘administrator’ stated:

“Email attachments are not yet indexed/searchable.”

Has this moved on yet?

I have accumulated quite a vast email archive and I really want to use DEVONthink Pro Office to search the archive and its attachments… there’s some great material in there but finding it is now getting quite hard.

Any news?

Paul

Sorry, but no - attachments are opaque to indexing. They can and should be imported separately to be indexed.

Hi,

Thanks for the speedy reply. I think perhaps I should rephrase my question.

Is indexing email attachments on the roadmap, and if so is there an ETA?

If it’s not planned, or can’t happen for whatever reason, please let us know and I’ll start looking at the workarounds discussed earlier in this thread.

I don’t want to invest in a work-around for 18GB if you’ve got this in the wings, but if it’s never going to happen I’ll have to find a plan.

The replies in 2013 stated ‘not yet’ which made me hold out, but now DTTG is up and running I want to get this sorted.

Many thanks,

We never say “never”, but in this case, it’s not likely to happen any time soon. Part of the issue is that the contents of the attachment are not part of the email itself. The email has a reference to a file in its contents, but the contents belong to the attachment. Attachments should be imported separately so they can be indexed too.

Does anybody have a good way to keep the connection between separately imported attachments and the email they belong to? Often the text in the mail body refers to the attachment and I don’t want to lose this info.

Thank you, Bernd

You can keep the email and attachments in the same group. This is the easiest way to solve the problem.

You could also select the email and choose Data > Copy Item Link to get the URL of the file.
Then select the attachment, choose Tools > Show Info and paste that into the URL section of the Info pane. This would establish a connection between the attachment(s) and the email.

Thank you for the quick reply!

You’re welcome. :smiley:

I cannot find this script anywhere in my DTP list of scripts, in “other scripts”, or in Mail. Can you tell me where to find it, please?

I guess you have to upgrade to DTPO.

I have DTPO.