I just stumbled upon a problem with DevonThink Pro Office 2.0.3:
If I load PDF files from the web with Safari and then save them on the disk (with “save” or “save as …”), they usually contain information about the document (title, author, etc.) and also the weblink where they were found.
My guess is that “Weitere Informationen” (“More Info”) shows up empty (as “–”) for the file in DT because it’s under …/Files.noindex, which isn’t indexed by Spotlight. If you copy the file somewhere outside that folder (e.g. Desktop), the metadata should reappear.
Also, if you select the document in DT and run Tools > Show Info… (Shift-Command-I), the metadata should show up under Additional Information, and likewise in the Tools > Show Properties… (Option-Shift-Command-P) panel.
Notice that “Erstellt” (“Created”) and “Geändert” (“Modified”) in your images are identical for both the original and imported files, which is a clue that DT didn’t modify the latter when importing.
When I drag the file from DTPro to the Finder or export it, the Finder metadata is visible again, just as it was before,
and most of the metadata is also visible in DT Pro (see screenshot - I overlooked it, sorry!)
But in DevonThink I found no way to display or search for the source URL, which I consider important for scientific work where you have to cite your references correctly (with URL and date of last access).
I’m not sure if there’s a way to import/save PDFs into DT so the URL field will be populated. An example comparison:
When saving a Web Archive from Safari to DT’s global inbox the URL will (conveniently) be added to the URL field, but not (unfortunately) when saving a PDF. Both have kMDItemWhereFroms (“Where from”) metadata, but that’s not visible from within DT because it lacks Spotlight indexing. If you drag those documents out of DT then you can see “Where from” metadata in Finder Get Info windows.
I see the PDF in the Finder, and the information is there.
I import it into DTPro and the information is no longer visible (neither in DTPro nor in the Finder, if I look at the file in the database package).
I drag or export the same file to the Finder, and the information is visible again.
-> So where was it stored while the file existed only in the DTPro database, and can’t it be accessed from there?
(If it were accessible via AppleScript, one should be able to search all PDF files for this source URL information and, if it exists, write it to a place where DTPro can use it…)
What exactly does “DT Pro lacks Spotlight indexing” mean?
After re-reading your post several times, the following explanation came to my mind (but I don’t find it convincing yet):
The metadata seems to be (and stay) in the PDF file, but as long as it “lives” in the DevonThink database package, neither the Finder nor Spotlight accesses it?!
(And DevonThink unfortunately has no access to it either?)
But after reading about kMDItemWhereFroms, I tried the Terminal command mdls on the indexed and the imported PDF file, and the imported one does not contain all the metadata.
DevonThink Pro takes the meta information from the file during import and stores it somewhere else.
Why can’t it just leave it where it is? (maybe a stupid question from a non-programmer)
Sorry, I didn’t mean to imply that. I meant this seems tougher for both of us because of my inability to communicate in your native language. I’m slow with German… in and out.
That’s a side effect of it being stored under …/Files.noindex/…, which isn’t indexed by Spotlight.
Copy that file out of the db package and the results of mdls will be identical to those for the original (non-imported) file.
DT doesn’t modify the file when importing. While it is stored in DT, a subset of the metadata shows up in the Information and Document Properties windows, presumably only what can be obtained directly from the file (which excludes Spotlight metadata like kMDItemWhereFroms).
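To make that comparison less tedious than eyeballing two mdls dumps, here is a minimal Python sketch that asks mdls for kMDItemWhereFroms only and pulls out the quoted URLs. It assumes mdls prints a list attribute as quoted strings inside parentheses and prints "(null)" when Spotlight has nothing for the file; the function names and the sample string are made up for illustration.

```python
import re
import subprocess


def parse_mdls_list(output: str) -> list[str]:
    """Return every double-quoted string found in mdls output.

    Assumption: a list attribute is printed as quoted strings inside
    parentheses, and a missing attribute as "(null)" (no quotes at all).
    """
    return re.findall(r'"((?:[^"\\]|\\.)*)"', output)


def where_froms(path: str) -> list[str]:
    """Ask Spotlight (via mdls, macOS only) for a file's "Where from" URLs."""
    result = subprocess.run(
        ["mdls", "-name", "kMDItemWhereFroms", path],
        capture_output=True, text=True, check=True,
    )
    return parse_mdls_list(result.stdout)


# Hypothetical sample of what mdls might print for a downloaded PDF.
sample = (
    'kMDItemWhereFroms = (\n'
    '    "http://example.com/paper.pdf",\n'
    '    "http://example.com/"\n'
    ')\n'
)
print(parse_mdls_list(sample))
```

Running where_froms() on the copy inside the db package and on a copy dragged to the Desktop should show the difference: an empty list for the former, the URLs for the latter, if the explanation above is right.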
Another question might be:
Why doesn’t DT make use of certain Spotlight metadata for DT metadata during importing, e.g. grab kMDItemWhereFroms (if available) for populating the document URL field?
I’m mostly trying to describe why certain metadata is/isn’t available/visible in certain contexts. I don’t fully understand how DT and Spotlight handle metadata so some explanations might be inaccurate/incomplete.
Is it Safari adding it or a Spotlight/system process? I’m guessing the latter, and that if Spotlight is disabled then the com.apple.metadata:kMDItemFinderComment extended attribute won’t be added to a file when its Spotlight Comments are updated.
Using com.apple.metadata:kMDItemWhereFroms? It can contain multiple (two?) URLs; the first would be what I’d want to populate a DT document’s URL field, if that’s what 2.0.4 will support.
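For anyone who wants to experiment outside DT, here is a rough Python sketch of reading that extended attribute directly, without going through Spotlight. It assumes the attribute value is a property list holding an array of URL strings and that the macOS xattr tool can dump it as hex with -px; the function names are invented for this example and have nothing to do with how DT itself works.

```python
import plistlib
import subprocess
from typing import Optional

ATTR = "com.apple.metadata:kMDItemWhereFroms"


def decode_where_froms(raw: bytes) -> list[str]:
    """Decode the attribute value, assumed to be a plist array of strings."""
    value = plistlib.loads(raw)
    return [str(item) for item in value] if isinstance(value, list) else []


def read_where_froms(path: str) -> list[str]:
    """Read the attribute with the macOS `xattr` tool (hex dump) and decode it."""
    result = subprocess.run(
        ["xattr", "-px", ATTR, path],
        capture_output=True, text=True, check=True,
    )
    # bytes.fromhex skips the spaces and newlines in the hex dump.
    return decode_where_froms(bytes.fromhex(result.stdout))


def first_source_url(urls: list[str]) -> Optional[str]:
    """Pick the first URL, the one suggested above for the URL field."""
    return urls[0] if urls else None


# Self-check with a hand-made value instead of a real file.
raw = plistlib.dumps(
    ["http://example.com/paper.pdf", "http://example.com/"],
    fmt=plistlib.FMT_BINARY,
)
print(first_source_url(decode_where_froms(raw)))
```

Since the attribute travels with the file rather than living in the Spotlight index, this should work on a file that sits under …/Files.noindex as well, though I haven’t verified that against a DT database.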