Automate Indexing of a Zotero *Collection*?

All my PDFs are indexed/linked in both DT3 and my Zotero library. The latter is where they are initially imported, then relocated and renamed by Zotfile, and then indexed in DT3, so any change to a PDF shows up in both apps.

My question is: Is there a way for an automated process in DT3 to Index a Zotero library Collection?

If you index a Finder folder, the changes made in that folder are generally updated automatically.

Right, but as far as I know Zotero “Collections” don’t have their own local folder in the way that the entire Zotero PDF “Library” does.

(I could be 100% wrong, but I think “Collections” are just links to the “original” items in the “Library.”)

Someone who actively uses Zotero would have to chime in on this, as I could easily be 101% wrong about it.

A few people have posted about how they use Zotero (see, e.g., here and here, and also the pages here). AFAIK no one has done what you ask, but I have ideas about how it could be done.

First, I assume you are not using linked attachments, or else you wouldn’t be able to index the Zotero storage folder at all. My own setting in Zotero Preferences → Advanced → Files and Folders looks like this:

In the “data directory location”, Zotero creates a folder named storage, which (as you probably noticed if you’re already indexing it) is a pile of subfolders having names like X34Q57FX.


Unfortunately, at this file system level, Zotero does not distinguish or provide any indication at all of which record a folder is associated with: it’s all one homogeneous single-level pile of folders with attachments (PDFs for most people) within them. It turns out the X34Q57FX folder names are keys that only have meaning to Zotero. This, in turn, means that from the standpoint of indexing in DEVONthink, it’s impossible to provide DEVONthink with any clue about any given folder’s membership in a Zotero collection.
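To make the layout concrete, here is a minimal Python sketch that walks a storage folder and reports which PDFs sit under which key. The function name `list_storage_attachments` is made up for illustration, and the only assumption about Zotero is the layout described above (one subfolder per key, attachments inside).

```python
from pathlib import Path


def list_storage_attachments(storage_dir):
    """Map each storage key (subfolder name) to the PDF file names inside it.

    Hypothetical helper: it only reads the file system and knows nothing
    about Zotero records or collections.
    """
    result = {}
    for folder in sorted(Path(storage_dir).iterdir()):
        if not folder.is_dir():
            continue  # ignore stray files at the top level
        pdfs = sorted(
            p.name for p in folder.iterdir() if p.suffix.lower() == ".pdf"
        )
        result[folder.name] = pdfs
    return result
```

The output is just keys and file names, which illustrates the problem: nothing at this level says which collection a folder belongs to.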

The following are some ideas for possible workarounds. I haven’t implemented any of these myself and don’t know of anyone who has, but based on what I know and have done so far, these seem like they would work:

  1. Use an external (to DEVONthink) process to annotate either the folders themselves, or the files inside the folders (or both). For example, the external process could add Finder-level tags to the PDF files inside the various X34Q57FX folders, and different tags could represent different Zotero collection names. This would allow you to index all of storage in DEVONthink as usual, and then use a smart group to provide a view on a subset of items by searching for particular tags.
  2. Provide a way for DEVONthink to ask (somehow) Zotero about the collection membership of any given item indexed in the storage folder. Then you could (e.g.) create a smart rule that triggers on new additions imported in the indexed folder, runs some code to ask about the collection membership of that item, and (e.g.) sets a custom metadata field storing the collection name. This would put the Zotero collection name in a field on every Zotero PDF file in DEVONthink, and let you easily do things in DEVONthink (like create smart groups) based on the field value.
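As a rough illustration of the tagging half of option #1, the Python sketch below sets Finder tags by calling the macOS `xattr` command-line tool. It assumes macOS, and it assumes Finder tags live in the `com.apple.metadata:_kMDItemUserTags` extended attribute as a binary property list of strings. The function names are hypothetical, and this is untested against a live Zotero storage folder.

```python
import plistlib
import subprocess

# Extended attribute in which macOS keeps Finder tags (assumption stated above).
TAG_ATTR = "com.apple.metadata:_kMDItemUserTags"


def build_tag_command(pdf_path, tags):
    """Return the argv for `xattr` that would set the given Finder tags."""
    # Finder tags are a binary plist array of strings; -x tells xattr the
    # value is passed as hexadecimal.
    payload = plistlib.dumps(list(tags), fmt=plistlib.FMT_BINARY)
    return ["xattr", "-wx", TAG_ATTR, payload.hex(), str(pdf_path)]


def tag_file(pdf_path, tags):
    """Apply the tags (macOS only). This REPLACES any existing Finder tags."""
    subprocess.run(build_tag_command(pdf_path, tags), check=True)
```

Calling `tag_file(path, ["Read for"])` would overwrite whatever tags the file already has, so a real script would probably want to read and merge the existing tags first.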

Option #1 would still require a way to ask Zotero about the collection membership of any given item found in storage, so both options require a method to do that. A potential problem with option #1 is that I don’t know whether Zotero’s handling of the storage folder contents will preserve attributes such as tags applied to the folders and/or files, or whether such changes would cause problems in Zotero, so there’s a chance option #1 would be more fragile than option #2. But that’s just speculation.
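One possible way to ask the question, sketched here in Python, is to read Zotero’s local SQLite database directly. The table and column names (`items.key`, `itemAttachments.parentItemID`, `collectionItems`, `collections.collectionName`) are my understanding of Zotero’s schema and should be checked against your own `zotero.sqlite`. The function name is hypothetical. Zotero locks the database while it is running, so it is safer to work on a copy.

```python
import sqlite3

# Assumed schema: an attachment is a row in `items` whose key matches the
# storage folder name; its parent item (or the attachment itself, if it has
# no parent) is what appears in `collectionItems`.
QUERY = """
SELECT c.collectionName
FROM items AS att
JOIN itemAttachments AS ia ON ia.itemID = att.itemID
JOIN collectionItems AS ci
     ON ci.itemID = COALESCE(ia.parentItemID, ia.itemID)
JOIN collections AS c ON c.collectionID = ci.collectionID
WHERE att.key = ?
ORDER BY c.collectionName
"""


def collections_for_storage_key(conn, storage_key):
    """Return the collection names for the attachment stored under storage_key."""
    rows = conn.execute(QUERY, (storage_key,)).fetchall()
    return [name for (name,) in rows]
```

With a connection opened on a copy of the database, `collections_for_storage_key(conn, "X34Q57FX")` would return a list of collection names, which either option could then turn into a tag or a metadata value.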

As to methods to ask Zotero dynamically, I have actually been working on a general-purpose command-line tool to do that, and even have it working internally. I’m using it right now to get the BibTeX citekeys and store them in a custom metadata field. The tool is not ready for public use, but I hope to get it out there soon-ish. Alternatively, if you use Alfred, it might be possible to use ZotHero as part of a workflow that gets the necessary info.

Thanks for your suggestions, mhucka! Given my lack of technical knowledge, it sounds like Option 1 might be my best bet if I can figure out how to automate the tagging in Finder.

If it matters, all my PDFs are indeed both indexed in DT and linked to citation entries as file attachments in Zotero. I do this (see below) by using the relative base directory and having Zotfile move all the PDFs into a single local folder outside of Zotero’s quirky subfolder system. I also have this single PDF folder synced via cloud across my devices. That way, changes to a PDF (on any device, in any app) are always reflected everywhere (i.e. locally, in DT, and in Zotero). My DT dbs are also all synced across devices with WebDAV.

Everything would be perfect for my needs if I could just automate a process in which DT pulls in the indexed PDFs that are grouped into a Zotero collection. That’s because I’m trying to have all my PDFs indexed in one db and only a smaller selection of them (the Zotero collection) in a separate db. Giving all the PDFs in that smaller selection the same Finder tag would do the trick (albeit, in my case, manually).

The big folder of indexed PDFs in one db is for my work and searching in general; the smaller selection of PDFs (aka the Zotero collection) in the other db is for a specific project.

FWIW, here’s what the setup looks like using the example of the PDF below:

This is the local folder containing all my PDFs. Every PDF, like this one, gets put there via Zotfile after I import the citation and file to Zotero. The local folder is also synced across my devices via a cloud service (which may not be necessary):

The file is linked as attachment to its respective Zotero citation:

Zotero’s relative base directory allows me to access the file across separate computers.

The entire local folder with all my PDFs is indexed in DT.

This is my messy (and probably inefficient) way of manipulating/using a single file across all my devices/apps.

What I would love is for the small selection of PDFs in my Zotero collection named “Read for…” to get indexed into that DT db named “Upstate.” (Sorry! This got super long.)

I’m surprised it’s possible to move files out of Zotero’s folders and have things still work in Zotero.

Anyway, I just tested adding a tag in the Finder for a file that’s indexed, and then removing it, and DEVONthink detected the change both times, so the approach seems feasible.

Is it possible to implement this with the PDFs being stored in Dropbox?

Can you clarify what “this” refers to, exactly? Tagging? Moving things out of Zotero? Indexing in DEVONthink?

I’m coming late to this and have not tested it yet, but I did find an interesting workflow that may do what you want:

1.5.1 Zotfile Preferences

Steps for setting up file organisation if you’re using a cloud to link files into zotero

  1. Go to Zotero → Tools → Zotfile Preferences
  2. In general settings, you can tell zotfile where to store your documents. For this, in “Location of Files”, choose Custom Location, and choose the folder you selected in 1.2 (in my case, “ZoteroAttachments”)
  3. You can choose to have zotfile sub-categorise your files in this folder, using these wildcards. Many people choose to categorise by Author (%a) or year (%y).
  4. I prefer organising my files by subject, so I pick %c, meaning collection. You can also combine different wildcards, like so: “/%c/%a” (Screenshot 07). The wildcards will create folders with the entirety of the field referred to. In my case, it will create a folder with the name of the collection in Storage (/ZoteroAttachments/CollectionName)

If I’m reading step 4 correctly, your PDFs will be stored in folders named after the collection you have assigned them to, and those folders could then be indexed in DT.
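To check what the %c layout produces before indexing anything, a small Python sketch like the one below could list the PDFs under each collection folder. The function name `pdfs_by_collection` is hypothetical, and it assumes the first folder level under the Zotfile custom location is the collection name, as in the “/%c/%a” pattern quoted above.

```python
from pathlib import Path


def pdfs_by_collection(attachments_root):
    """Group PDFs by the first-level folder under the Zotfile custom location.

    Assumes that first level is the collection name (the %c wildcard).
    Paths are returned relative to the root, with forward slashes.
    """
    root = Path(attachments_root)
    grouped = {}
    for pdf in sorted(root.rglob("*.pdf")):
        rel = pdf.relative_to(root)
        if len(rel.parts) < 2:
            continue  # PDF sits directly in the root, so no collection folder
        grouped.setdefault(rel.parts[0], []).append(rel.as_posix())
    return grouped
```

Each key in the result corresponds to a folder that could be indexed into its own DT database.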

That did it! Thank you so much rpallred!

And the article you linked to has a bunch of other good stuff on Zotero integration that will work with DT3 too.
