For me, it was nice to be able to separate a mail from its attachment. That way the attachment was searchable within DEVONthink and directly accessible, but still linked to the original mail and not taking up twice the storage.
The new function neither links the attachment to the mail nor removes it from the mail, so the attachment ends up duplicated in the database with no relation to the mail. This adds up quickly if you regularly deal with attachments of 2-5 MB.
ChatGPT helped me modify the initial script to work with DT4. Here is the modified version.
It expects the Python script in the same location as the AppleScript.
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"
property ca : a reference to current application
property pythonCmd : "/usr/bin/env python3"
property replacedTagName : "attachments-extracted"
tell application "System Events"
set scriptPath to path of (path to me)
set parentFolder to POSIX path of (container of file scriptPath)
end tell
set pythonScriptPath to parentFolder & "/replace-attachments.py"
tell application "Finder"
set replaceCmd to pythonCmd & " " & quoted form of pythonScriptPath & " "
end tell
tell application id "DNtp"
set theSelection to the selection
set tmpFolder to path to temporary items
repeat with theRecord in theSelection
repeat 1 times
-- display dialog "Processing: " & (name of theRecord)
set recordPath to path of theRecord
-- display dialog "Path: " & recordPath & return & "Type: " & (type of theRecord as rich text) & return & "Tags: " & (tags of theRecord as rich text)
if (type of theRecord is email or type of theRecord is unknown) and recordPath ends with ".eml" and (tags of theRecord does not contain replacedTagName) then
try
set foundAttachmentsJSON to do shell script replaceCmd & (quoted form of recordPath)
on error errMsg
display dialog "Error from the Python script:" & return & errMsg
exit repeat
end try
if foundAttachmentsJSON is equal to "" then
display dialog "The Python script found no attachments."
exit repeat
end if
set foundAttachments to my fromJSON(foundAttachmentsJSON)
-- display dialog "Attachments found: " & (foundAttachments as rich text)
set recordReferenceURL to reference URL of theRecord
set recordSubject to name of theRecord
set recordModificationDate to modification date of theRecord
set recordCreationDate to creation date of theRecord
set recordAdditionDate to addition date of theRecord
set recordGroup to missing value
set extractedAttachments to {}
set rtfRecord to convert record theRecord to rich
-- display dialog "RTF conversion type: " & (type of rtfRecord as rich text)
if type of rtfRecord is RTFD then
set rtfPath to path of rtfRecord
tell rich text of rtfRecord
tell application "Finder"
set rtfAttachmentList to every file in ((POSIX file rtfPath) as alias)
-- display dialog "Number of files in RTF: " & (count of rtfAttachmentList)
repeat with rtfAttachment in rtfAttachmentList
set rtfAttachmentName to name of rtfAttachment as string
-- display dialog "File in RTF: " & rtfAttachmentName
-- display dialog "Comparing:" & return & "RTF file: " & rtfAttachmentName & return & "JSON attachments: " & (foundAttachments as text) & return & "RTF (lowercase): " & my lowercaseText(rtfAttachmentName)
set nameFound to false
repeat with itemName in foundAttachments
if my normalizeText(rtfAttachmentName) = my normalizeText(itemName) then
set nameFound to true
exit repeat
end if
end repeat
if nameFound then
-- display dialog "MATCH: " & rtfAttachmentName
-- from here on: move, import, etc.
end if
if my lowercaseText(rtfAttachmentName) is in (my lowercaseList(foundAttachments)) then
-- display dialog "MATCH: " & rtfAttachmentName
set rtfAttachment to move (rtfAttachment as alias) to tmpFolder with replacing
tell application id "DNtp"
if recordGroup is missing value then
set recordGroup to create record with {name:recordSubject, type:group, creation date:recordCreationDate, modification date:recordModificationDate, addition date:recordAdditionDate} in (parent 1 of theRecord)
end if
set movedPath to POSIX path of (rtfAttachment as alias)
-- display dialog "Importing file: " & movedPath
set importedItem to import path movedPath to recordGroup
set URL of importedItem to recordReferenceURL
set modification date of importedItem to recordModificationDate
set creation date of importedItem to recordCreationDate
set end of extractedAttachments to {rtfAttachmentName, ((reference URL of importedItem) as string)}
-- log message "Imported: " & rtfAttachmentName info "Attachment extraction" record importedItem
end tell
end if
end repeat
end tell
if (count of extractedAttachments) > 0 then
set extractedAttachmentsJSON to my toJSON(extractedAttachments)
tell application id "DNtp"
move record theRecord to recordGroup
do shell script replaceCmd & "-r " & quoted form of extractedAttachmentsJSON & " " & quoted form of recordPath
set tags of theRecord to (tags of theRecord) & {replacedTagName}
-- log message "Attachments replaced in: " & recordSubject info "Attachment extraction" record theRecord
end tell
end if
end tell
delete record rtfRecord
else
display dialog "RTF conversion did not produce an RTFD."
end if
end if
end repeat
end repeat
end tell
on normalizeText(t)
-- Strips leading/trailing whitespace and converts to lowercase
set cleaned to do shell script "/bin/echo " & quoted form of t & " | tr '[:upper:]' '[:lower:]' | sed 's/^ *//;s/ *$//'"
return cleaned
end normalizeText
on fromJSON(strJSON)
set {x, e} to ca's NSJSONSerialization's JSONObjectWithData:((ca's NSString's stringWithString:strJSON)'s dataUsingEncoding:(ca's NSUTF8StringEncoding)) options:0 |error|:(reference)
if x is missing value then error e's localizedDescription() as text
if e ≠ missing value then error e
if x's isKindOfClass:(ca's NSDictionary) then
return x as record
else
return x as list
end if
end fromJSON
on toJSON(theData)
set theJSONData to ca's NSJSONSerialization's dataWithJSONObject:theData options:0 |error|:(missing value)
set JSONstr to (ca's NSString's alloc()'s initWithData:theJSONData encoding:(ca's NSUTF8StringEncoding)) as text
return JSONstr
end toJSON
on lowercaseText(t)
return (do shell script "/bin/echo " & quoted form of t & " | tr '[:upper:]' '[:lower:]'")
end lowercaseText
on lowercaseList(theList)
set outList to {}
repeat with i in theList
set end of outList to my lowercaseText(i)
end repeat
return outList
end lowercaseList
The script does more than just import something. It takes an already imported .eml, scans it for attachments, imports them into the database, deletes them from the .eml file, places a link to each file in the .eml, and puts the URL of the mail in the URL field of the file to link the two together.
Actually, I don't have a clue whether this is the most efficient way of doing this. Probably not. I just took the old script from mdbraber for DT3 and made it work again with DT4.
If this functionality makes it directly into DT, I would be happy to use it.
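For readers who want the core idea without the AppleScript scaffolding, the "remove the attachment and leave a link" step can be sketched in a few lines of Python. This is not mdbraber's replace-attachments.py, which is not shown in this thread. The function name replace_attachments, the shape of the links mapping, and the placeholder wording are all assumptions made for illustration. Only the standard library email package is used.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def replace_attachments(raw_eml: bytes, links: dict[str, str]) -> bytes:
    """Replace each attachment named in `links` with a short text stub.

    `links` maps attachment filename -> item link (hypothetical shape,
    e.g. an x-devonthink-item:// URL). Other parts are left untouched.
    """
    msg = message_from_bytes(raw_eml, policy=policy.default)
    for part in msg.walk():
        if part.is_multipart():
            continue
        filename = part.get_filename()
        if filename is None or filename not in links:
            continue
        # set_content() clears the old payload and Content-* headers,
        # then stores the stub as a plain-text part.
        part.set_content(
            f"Attachment '{filename}' was moved to: {links[filename]}\n"
        )
    return msg.as_bytes()


# Demo with a synthetic message (no real mail involved).
demo = EmailMessage()
demo["Subject"] = "Invoice"
demo.set_content("See attached.")
demo.add_attachment(b"%PDF-1.4 dummy", maintype="application",
                    subtype="pdf", filename="invoice.pdf")

stripped = replace_attachments(
    demo.as_bytes(), {"invoice.pdf": "x-devonthink-item://EXAMPLE-UUID"}
)
remaining = [
    p.get_filename()
    for p in message_from_bytes(stripped, policy=policy.default).walk()
    if p.get_filename()
]
print(remaining)
```

In the AppleScript above, the equivalent of the links mapping is the extractedAttachments list of name and reference URL pairs that is passed to the Python script as JSON.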
Well, it is so convoluted that the intention and algorithm are difficult to recognize. E.g.: telling Finder to set a string to a value? The repeat 1 loop? Converting an email to RTFD so that a Python script can access the attachments saved in the conversion process? Lots of smoke for a tiny fire. And some comments would greatly help in understanding the code.
Why not something like (in symbolic code)
repeat for r in selected records
if (type of r is email or (type of r is unknown and extension of r is '.eml'))
import attachments record r to target someGroup
post process the attachments in whatever way
set tag of record to "has been processed"
end if
end repeat
That takes, of course, the fun out of it (convert EML to RTFD, read RTFD as JSON, parse JSON, lowercase whatever …). But right now, it looks like a very, very convoluted way, requiring many tools (Python and several of its modules), to solve a not so complicated problem.
I fully understand you. I would also prefer a more integrated solution. The script is not mine, and I am not a very skilled programmer; AppleScript in particular is like witchcraft to me. It reads so easily, but actually getting it to work feels so random, in my opinion. And I would never have been able to debug this without ChatGPT or a similar tool, at least not with acceptable time and effort.
I kept it alive because it scratches an itch for me, but if the DT developers integrate a function to separate and remove attachments from a mail and link them together, I would be absolutely happy to simplify this. Just importing and not deleting from the original mail doesn't solve the problem for me.
Regarding the "repeat 1" loop: it was already in the original script, and the comment explaining it got lost at some point.
I don't know what it is good for.
I've tried getting it to run, but I always stumble on permission issues: Finder asks me to grant it higher privileges with Touch ID, which I do, but apparently that isn't enough, since I get the error:
move alias "Macintosh HD:Users:me:Datenbanken:EmailsDevonThinkDB.dtBase2:Files.noindex:rtfd:e:Einladung.rtfd:bild.JPG" to alias "Macintosh HD:private:var:folders:6b:98zkkkjs1tl4xwid31r84woh0000gn:T:TemporaryItems:" with replacing
--> current application
--> error "Die Aktion konnte nicht abgeschlossen werden, da du nicht die erforderlichen Zugriffsrechte hast." number -5000
I gave DT4 full disk access, but I suspect there is no link between the two, since this is in the Finder part.
If you have any idea how to get it to work, that would be great!
Just for the record, since the question came up in the conversation, here are my reasons for wanting to use this script instead of the built-in functions (and I'd much rather have the built-in functions do it than fiddle with several scripts):
Separate attachments are processed with OCR and are therefore searchable
With this script, there is a backlink between the file and the email (not with the built-in function)
In addition, the email and the attachment are grouped together (not with the built-in function)
When the attachment is actually removed from the .eml, deduplication can save space (e.g. forwarded mails with the same attachments would otherwise take up much more space)
Search works better in separate attachments, even indexed ones (it shows where the search text was found)
The file names of separate attachments can be found when searched for (currently, file names are not indexed when inside a message)
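To get a feel for the space argument, here is a rough Python sketch that estimates how many bytes are redundant when the same attachment sits inside several mails. It only illustrates the idea by hashing payloads. It says nothing about how DEVONthink detects duplicates, and the function name duplicate_savings and the input shape are hypothetical.

```python
import hashlib
from collections import defaultdict


def duplicate_savings(attachments):
    """Return the bytes saved if each distinct payload were stored once.

    `attachments` is an iterable of (mail_name, payload_bytes) pairs,
    an input shape chosen purely for this example.
    """
    sizes_by_digest = defaultdict(list)
    for _mail_name, payload in attachments:
        digest = hashlib.sha256(payload).hexdigest()
        sizes_by_digest[digest].append(len(payload))
    # Every copy beyond the first of an identical payload is redundant.
    return sum(sum(sizes[1:]) for sizes in sizes_by_digest.values())


# Synthetic data: one attachment forwarded twice, plus an unrelated one.
sample = [
    ("original.eml", b"A" * 1000),
    ("forward-1.eml", b"A" * 1000),
    ("forward-2.eml", b"A" * 1000),
    ("other.eml", b"B" * 500),
]
print(duplicate_savings(sample))
```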
Hi @smiling, did you run this in Script Editor? If so, you may have to grant Script Editor full disk access while testing this script.
Thank you @AWD! That was it! I had thought of it, but since Script Editor is in a subfolder of the Applications folder, I thought that maybe it wasn't possible to give it full disk access. Now it works perfectly!
By the way, for anyone coming across this post, if you get an error
tell application "DEVONthink"
import path "/private/var/folders/6b/(…).pdf" to parent id 83694 of database id 2
--> missing value
Ergebnis:
error "„URL of missing value“ kann nicht als „\"x-devonthink-item://%(…)\"“ gesetzt werden." number -10006 from URL of missing value
then it's because DT doesn't have full disk access.
Now I'll try to see if I can make a workflow that works for me.
The Python script has a threshold for the minimum size of an attachment. By default it is set to 150 kB. This may be the issue. If not: what was the file type of the attachment in the cases where it fails?
Thank you @AWD, I hadn't realised there was a threshold. Since the threshold only applies to images, it wasn't the reason.
The reason was that by default, the Python script only takes "real" attachments, not inline attachments. And in the cases where it didn't work, the attachments were inline attachments. Adding 'inline' on line 105 resolved the issue:
if not part.get_content_disposition() in ['attachment', 'inline']:
Why would you want to extract and save an inline attachment? It is inline so that you can see (in general) an image directly in the e-mail. Often, these are company logos and such.
Apparently, some senders also send PDFs and other files as inline attachments. I indeed don't need company logos to be extracted, but that's taken care of by the image filter, which filters out images below a certain size.
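The two filters discussed here (content disposition and a minimum size for images) can be sketched together in Python. This is an illustration only, not the code from the script: the names wanted_parts and MIN_IMAGE_BYTES are hypothetical, the 150 kB figure is taken from the post above, and how the real script measures size is not shown in this thread.

```python
from email.message import EmailMessage

# Assumed reading of "150 kB"; the real script may count differently.
MIN_IMAGE_BYTES = 150 * 1000


def wanted_parts(msg, dispositions=("attachment", "inline"),
                 min_image_bytes=MIN_IMAGE_BYTES):
    """Yield (filename, payload) for the parts worth extracting."""
    for part in msg.walk():
        if part.is_multipart():
            continue
        # Body parts have no disposition (None), so they are skipped here.
        if part.get_content_disposition() not in dispositions:
            continue
        payload = part.get_payload(decode=True) or b""
        # Small images are usually logos or signatures, so skip them.
        if (part.get_content_maintype() == "image"
                and len(payload) < min_image_bytes):
            continue
        yield part.get_filename(), payload


# Synthetic message: a tiny inline logo and an inline PDF.
demo = EmailMessage()
demo.set_content("Report attached.")
demo.add_attachment(b"x" * 200, maintype="image", subtype="png",
                    filename="logo.png", disposition="inline")
demo.add_attachment(b"%PDF-1.4 dummy", maintype="application",
                    subtype="pdf", filename="report.pdf",
                    disposition="inline")

names = [name for name, _ in wanted_parts(demo)]
print(names)
```

Passing dispositions=("attachment",) reproduces the stricter default behaviour described above, in which inline PDFs are ignored.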
Weird. The idea of an inline attachment was (I think) that the client could display it directly. Which is difficult to imagine with a PDF. But e-mail is weird anyway.
There used to be a Mail plugin which would make Mail attachments work like "dumb" attachments, but that was many years ago, and Apple likely blows the world up or whatever these days if you ask about Mail plugins.