I get many documents from email attachments or from web downloads. I import these documents into DEVONthink and rename them to reflect the content.
So for example if I get an invoice from the Coca-Cola company in an email, it might be called inv18654-20241011.pdf. After importing I rename it to “Coca-Cola Invoice 2024-10-11”.
Before I rename the document, I write the old filename into the comment field, so that I have (via this old name) a link between DEVONthink and my email program.
However, this is not ideal, because the comment field gets utilised for other purposes as well.
Is there another document metadata field for this information (preferably a non-DEVONthink-proprietary file metadata field)?
Or what other ideas do people have for where to store the “provenance” of a document?
If you’re talking about PDFs: they have a title. If you’re talking about other formats: they may or may not have “metadata” at all. E.g., MultiMarkdown provides for metadata, standard MD does not. Word has metadata, and so does Pages. But for the latter two (and also for PDF), it makes very little sense to store the “name” of the document in there. This “name” is just an arbitrary sequence of bytes used by the local file system. It is not really “metadata”, in that it does not convey any information about the document.
Personally, I couldn’t care less how Deutsche Telekom or Coca Cola or my banks name their files before they send them to me or offer them for downloading. DKB, for example, names some of their information “2f7d5192-a6a3-4f2a-a6a6-db12adfe92fa.pdf”. What would that even tell me?
Am I going to search for these files on their servers? Hardly. Are they even using “files” to store this information? I hope they don’t.
If you insist, you can just define a DT custom meta data field to hold the “file” name and update that when you import the “file”.
I am only talking about PDFs here, and I want to keep the original filename “somewhere”, but not prominently.
It tells me the link to the original in my mail program.
Sometimes I see an old email and ask myself “Did I import this document or not?”[1]. Then it is good to be able to check whether the document is in DEVONthink, and searching for the original (often, though not always, unique) filename is a good and quick indicator.
Thanks, this might be a good option. I am just curious what others do if they have the same need to record where a document comes from.
[1] I do have a tag in my mail program for “imported-in-DEVONthink”, but because I have to set this tag manually, things go wrong.
By “document”, do you mean “PDF attachment”? Otherwise, DT never imports the same email twice. And for a document, well, there’s the “Duplicate” functionality. More reliable, imo, than a file name that can easily be changed.
I don’t rely on file names at all. And having them in a custom metadata field probably does not help with your issue.
Well, I have PDF attachments in emails, and sometimes I want to check whether I have put a given attachment into DEVONthink. The easiest way is to search for the filename, because I can quickly copy that name from my email client and it is often reasonably unique (sometimes I need to add a topic word, which I can usually remember without opening the attachment).
For example, I want to check whether a particular receipt from my ISP is in DEVONthink. The email has an attachment “Invoice_0123452024.pdf”, so I search for Invoice_0123452024 in DEVONthink and – voilà! – there it is, provided I have put the original filename in the comment field.
This works well, but a solution that does not use the comment field would be better. Possibly I did not make that clear enough before, but that is what my initial post is about.
Custom metadata is your best option if you don’t want to utilize the Finder comments. In fact, why don’t you set up a smart rule that adds the filename to a custom attribute upon importing?
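A minimal sketch of what such a smart rule script could look like, assuming the rule is triggered on import and runs before the record is renamed. The identifier "originalfilename" is a hypothetical name for illustration; it would have to match a custom metadata field you define yourself in DEVONthink's settings.

```applescript
-- Sketch only: copies the record's current name into a custom metadata field.
-- "originalfilename" is a hypothetical identifier, not a built-in field.
on performSmartRule(theRecords)
	tell application id "DNtp"
		repeat with theRecord in theRecords
			-- at import time the name is still the one the sender gave the file
			add custom meta data (name of theRecord) for "originalfilename" to theRecord
		end repeat
	end tell
end performSmartRule
```

Attached as an "Execute Script" action, this leaves the Finder comment untouched, and the stored name should then be findable through DEVONthink's search.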
Or what other ideas do people have for where to store the “provenance” of a document?
You could store a link (to the message in Apple Mail) in the “URL” field and optionally add the original filename into the “Alias” field of the “Info” pane.
Here is a SmartRule/Script combo that automates my proposed procedure:
Attach it to a special group in the inbox to avoid processing every import.
Now when a file is dragged over from Apple Mail and dropped into the group this should happen:
the original name is stored in the alias field
the domain of the sender’s E-mail address is added to the name
the message id of the mail is stored into the URL field
the file/record is moved to the inbox
Now when you want to check for the existence of a file/attachment, you can do a specific alias search, as you did with the name. Or you can just use the script by John Gruber to copy the message id of your E-mail to the clipboard and paste that into the search box of DT.
The script is not elaborate, but it will hopefully show the basic procedure!
on performSmartRule(theRecords)
	tell application id "DNtp"
		repeat with theRecord in theRecords
			-- reset per record so a message ID from an earlier record is never reused
			set theMessageId to missing value
			-- brute force hack to get the (Spotlight) metadata of the already imported file
			-- (quoted form keeps paths containing spaces or quotes intact)
			set theMetaData to paragraphs of (do shell script ¬
				"mdimport -d3 -n -t " & quoted form of (path of theRecord) & " | awk -F'[;<>\"]' '/kMDItemOriginMessageID|kMDItemOriginSenderHandle/ {gsub(/.*:PR:.*= \"?/,\"\"); gsub(/.*@/,\"\"); print $1}'")
			-- quick check of type/length; stop if the metadata is missing or malformed
			try
				set theOriginId to item 1 of theMetaData as number
				set theOriginDomain to characters 1 thru -1 of item 2 of theMetaData as text
			on error
				return
			end try
			-- get message ID string from Apple Mail (only if the selected message matches)
			tell application "Mail"
				set theSelection to selection
				if theSelection is not {} and id of item 1 of theSelection = theOriginId then
					set theMessageId to message id of item 1 of theSelection
				end if
			end tell
			-- store the original name as alias, then append the sender's domain to the name
			set aliases of theRecord to name of theRecord
			set name of theRecord to name of theRecord & " @" & theOriginDomain
			-- set URL only if a matching message ID was found
			if theMessageId is not missing value then
				set URL of theRecord to "message://" & "%3c" & theMessageId & "%3e"
			end if
		end repeat
	end tell
end performSmartRule
Since I don’t use wiki links with aliases I don’t know …
You can also check the option “Exclude from Wiki Linking” to further safeguard this approach.
But keeping the original filename might not be necessary at all. If your attachments have the URL field populated, a search for this message URL might be sufficient to see whether the attachments have already been imported to DT.
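Such a check could be scripted roughly as follows. This is only a sketch, assuming the records carry a URL in the same message://%3c…%3e form that the smart rule script above writes, and assuming that "lookup records with URL" searches the current database when no database is given.

```applescript
-- Sketch only: is the message selected in Apple Mail already linked from a DEVONthink record?
-- Assumes the record URL has the form message://%3c<message id>%3e.
tell application "Mail"
	set theSelection to selection
	if theSelection is {} then return
	set theMessageId to message id of item 1 of theSelection
end tell

set theURL to "message://" & "%3c" & theMessageId & "%3e"

tell application id "DNtp"
	-- assumption: without an "in" parameter only the current database is searched
	set theMatches to lookup records with URL theURL
	if theMatches is {} then
		display dialog "No record with this message URL found."
	else
		display dialog "Already imported: " & (name of item 1 of theMatches)
	end if
end tell
```

Run from Script Editor or the script menu with the message selected in Mail. If several databases are in use, the lookup would have to be repeated for each of them.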