Fetching DOI-based metadata for my large PDF library

I’ve read every thread that’s related to this topic but haven’t been able to implement the right steps to solve my problem.

Can someone direct me to a script that will fetch metadata into DT3 based on the DOI? All of the files I’m looking to re-label contain DOIs. I’m looking for something similar to what the “Google Books Metadata” script does – from whatever source.

Alternatively, if I do have to deal with code/GitHub, I’d really appreciate clear instructions on what to do.

Many of the discussions on this forum seem to assume some degree of proficiency in coding. I downloaded Python and followed a few sets of instructions that caused systemic issues on my Mac mini (M1, Ventura). Perhaps this is a shortcoming on my part. I’ve since restored the Mac’s factory settings.

Thanks in advance.

DEVONthink 3 includes such a script for smart rules, see action Execute Script > External > Download Bibliographic Metadata.

I appreciate it. I didn’t know there were additional scripts that could be accessed like this.

I tried running the action on a number of PDFs without any results, however. I also tried manually entering the DOI in its dedicated field to see whether the remaining fields would update.

I chose “if kind is PDF” as the action’s predicate – to make sure the condition is met.

What am I doing wrong?

A screenshot of the rule would be useful.

Screenshot 2023-03-31 at 6.30.47 PM
Here you go

Whatever shortcomings there may be on your part, I can assure you there are also shortcomings with respect to Python on Mac

A sample PDF you expect to work would be nice.

10.2307@162225.pdf (798.1 KB)
Here’s a sample

You know what? I tried running the script on new journal articles, where the DOI is mentioned inside the PDF and it worked.

I guess the outstanding problem is how to make this script (or another?) work for articles whose DOI information is available in their title or metadata but not in the PDF.

Such a sample would be useful too, thanks.

Here’s a document where DT3’s script worked
What Does It Mean To Be an Arab Leftist Today- copy.pdf (60.7 KB)

And this is an example of a document where it doesn’t
An Assessment of the Trade and Restructuring Effects of the Gulf Co-operation Council (International Journal Middle East Studies, vol. 21, issue 1) (1989) copy.pdf (277.3 KB)

I don’t think the DOI being mentioned is a factor after all.

Note there is not going to be a 100% solution. You are reliant on the site’s records and if they have no match for a particular file, there will be no results. Just something to keep in mind.
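
To make the "no match, no results" point concrete, here is a minimal sketch of a DOI lookup in Python. The thread doesn't say which service DT3's script queries; Crossref's public REST API is used here purely as an assumption, and the helper names (crossref_url, summarize, fetch_metadata) are made up for illustration. The key behaviour is that an unknown DOI comes back as a 404, in which case there is simply nothing to fill in.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Optional

# Assumption: Crossref is the registry being asked; other registries exist.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the lookup URL, percent-encoding the DOI but keeping its '/'."""
    return CROSSREF_WORKS + urllib.parse.quote(doi, safe="/")


def summarize(record: dict) -> dict:
    """Pick a few bibliographic fields out of a Crossref 'message' object."""
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in record.get("author", [])
    ]
    date_parts = record.get("issued", {}).get("date-parts", [[None]])
    return {
        "title": (record.get("title") or [None])[0],
        "journal": (record.get("container-title") or [None])[0],
        "year": date_parts[0][0] if date_parts and date_parts[0] else None,
        "authors": authors,
    }


def fetch_metadata(doi: str) -> Optional[dict]:
    """Return summarized metadata, or None when the registry has no record."""
    try:
        with urllib.request.urlopen(crossref_url(doi), timeout=30) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no match for this DOI, so no metadata to apply
            return None
        raise
    return summarize(payload["message"])
```

Calling fetch_metadata("10.2307/162225") would need a network connection, and whether that particular DOI resolves depends entirely on the registry's records.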

Fair enough. I’ll take whatever I can get, in terms of automation.

I finally managed to run the automation from this thread too (after installing a bunch of things I don’t understand!) but it doesn’t seem to be an improvement on the script that comes with DT3.

There’s no DOI in the second document as far as I can tell, therefore the script can’t work.

Not in the file itself, but it’s in the file’s title in my database. I attempted two solutions: a) manually inputting the DOI in the custom metadata field and b) printing the DOI number in the PDF. Neither worked.

The first document was originally released with a DOI, whereas the second was assigned a DOI by the journal retrospectively. I notice the same pattern with other academic journals: older articles tend not to be identified in DT3.

Yet, Zotero and Mendeley are able to identify such older articles without any problems. This leads me to wonder whether the bottleneck might be DT3’s script.

You guys are doing a fantastic job and I don’t want this to come across as a complaint or a demand. But it’d serve us academics well if this can be addressed somehow. :slight_smile:

Did you add the DOI to the filename? Just wondering as custom metadata might make more sense.

The second PDF doesn’t have a known DOI, and Zotero does not record one for it. Zotero is likely querying JSTOR, since it records a stable URL for the article.

Strange. I actually found both of these files through their DOIs. I ran the numbers through Zotero, downloaded the files that have been identified there and then exported them.

I did both: I put the number in the file name, and then the DOI custom metadata field. Figured there’s probably a script out there that’ll automatically transfer the file name to that particular field.
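
For the "transfer the file name to that field" idea, the extraction half could look roughly like the Python sketch below. It is only an illustration: the function name doi_from_filename is made up, the "@" handling is an assumption based on the sample name "10.2307@162225.pdf" (where "@" appears to stand in for the "/" that a file name can't contain), and writing the result into DT3's DOI custom metadata field is not shown.

```python
import re
from typing import Optional

# A DOI starts with "10.", then a 4-9 digit registrant code, a separator,
# and a suffix. The separator is "/" in a real DOI; "@" is accepted here
# because the sample file name uses it in place of the slash (an assumption).
DOI_PATTERN = re.compile(r"(10\.\d{4,9})[/@](\S+)")


def doi_from_filename(name: str) -> Optional[str]:
    """Return the DOI embedded in a file name, or None if none is found."""
    # Drop a trailing ".pdf" so the extension is not read as part of the DOI.
    stem = re.sub(r"\.pdf$", "", name, flags=re.IGNORECASE)
    match = DOI_PATTERN.search(stem)
    if match is None:
        return None
    prefix, suffix = match.groups()
    # Rebuild the DOI with the standard "/" separator.
    return f"{prefix}/{suffix}"


print(doi_from_filename("10.2307@162225.pdf"))
```

The suffix match is deliberately loose (anything up to the next space), so a name that wraps the DOI in brackets or ends it with punctuation would need extra trimming.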

I imported the PDF directly into Zotero. That’s where my results came from.