Is the URL field of a document DEVONthink-proprietary metadata?

Sometimes I create PDFs from websites outside of DEVONthink and then import them into DEVONthink. Afterwards, I set the URL in DEVONthink accordingly.

Is it possible to set the URL field of the PDF before I import it (i.e. outside of DEVONthink) - maybe in a script? Or is the URL field of a document DEVONthink-proprietary metadata?

Where? There is no “URL” in the standardized PDF metadata structure.

I was fearing that. However, some PDFs, I think, when I import them as files into DEVONthink, do have DEVONthink’s URL field set – or am I mistaken?

I suppose that those are PDFs you created with the Sorter or some other DT function. A smart group should tell you more (search for “URL is not” (empty string) and “Kind is PS/PDF”). Here, I see for example a URL like
https://banking.ing-diba.de/app/sPI7b/c?x=sPIf_I9kUy0fMSE8cDFKgnjIb-KS6oCcR5-Oh1znlte5wX_EQJQ37lVVOL5SYRh39Oi4npai54OW4Isecx8DaVbIntFTKH-S44lSli1Qzi5XFVrZ-bbA0K-N39uIM5dEc
and that is also useless, as the URL was probably generated on the fly and is no longer valid.
Even weirder is
blob:https://apollo.arbeitsagentur.de/40b11bfb-9c67-4c01-87d0-15fa2abb1443
All these most likely come from me using the Sorter when the PDF was displayed in the browser, or printing the PDF to DT from the browser.

No, I’m pretty sure the PDFs were created outside DEVONthink.

Here is an example of a PDF that was created outside DEVONthink and that stores the source URL somewhere in the file in a way that DEVONthink then reads it and puts it in its URL field:

FLYCE122-FLY12-CE-Instruction-Manual-EN.pdf (2.1 MB)

I “created” this document by downloading the URL https://cycliq.com/files/manuals/FLYCE122-FLY12-CE-Instruction-Manual-EN.pdf in Vivaldi.

Where does this document store the URL, so that the URL shows up magically in DEVONthink – and how can I add this field myself outside of DEVONthink, via a script or similar?

Well, not all document URLs are like this. See the example above.

It’s probably just the extended attribute added by e.g. Safari to downloads. Therefore over here the URL of the downloaded PDF is just the above link.


As @cgrunenberg said: In an extended attribute in the file system. Easy to see: If you save the PDF (the first link in your post above) to DT (I did so with Firefox, but Safari should behave similarly), the URL will show https://discourse.devontechnologies…, not https://cycliq.com.
Apple’s file system supports a lot of extended attributes, and those vanish into thin air as soon as you move the file somewhere else (I doubt that most of those attributes even survive iCloud sync to an iOS device).

You can drag the PDF to your desktop and then run
mdls ~/Desktop/FLYCE…
in Terminal. You’ll see something like

kMDItemWhereFroms                  = (
    "https://devontech-discourse.s3.dualstack.us-east-1.amazonaws.com/uploads/original/3X/7/8/783fb45f290619ffd49f669914dd07f9830e87fb.pdf"
)

and with your original PDF, you should see the original URL.
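To answer the "how can I add this myself" part: a minimal sketch in Python, assuming the attribute behind kMDItemWhereFroms is the extended attribute com.apple.metadata:kMDItemWhereFroms and that its value is a binary property list holding an array of URL strings (that is how browser downloads appear to store it). The function names and the example path are made up for illustration, and whether DEVONthink picks the value up on import is something to verify on your own machine.

```python
import plistlib
import subprocess

# Assumed name of the extended attribute that mdls reports as kMDItemWhereFroms.
WHERE_FROMS = "com.apple.metadata:kMDItemWhereFroms"


def where_froms_payload(urls):
    """Encode a list of URL strings as a binary property list."""
    return plistlib.dumps(list(urls), fmt=plistlib.FMT_BINARY)


def set_where_froms(path, urls):
    """Write the attribute with the macOS `xattr` tool (macOS only).

    -w writes the attribute, -x tells xattr the value is given as hex.
    """
    payload = where_froms_payload(urls)
    subprocess.run(
        ["xattr", "-wx", WHERE_FROMS, payload.hex(), path],
        check=True,
    )
```

Usage would be something like set_where_froms("/path/to/file.pdf", ["https://cycliq.com/files/manuals/FLYCE122-FLY12-CE-Instruction-Manual-EN.pdf"]), followed by mdls on the file to check the result before importing it.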


For more on extended attributes (xattrs), Howard Oakley has written a lot about them at The Eclectic Light Company (see “Extended attributes (xattrs)”). He also makes some software to work with them.

Some of them do persist in iCloud, as far as I recall, but not on other file systems.
