BibDesk keywords => macOS tags ≠> DEVONthink tags

kidwellj · January 31, 2016, 8:44am

I’m trying to troubleshoot a puzzling issue with DEVONthink indexing tagged PDF files. I index a large batch of PDF files which have been named and organised using BibDesk. These files are saved to a common location in a Dropbox share (this is the standard BibDesk routine), but they come from several different BibTeX bibliography files managed by BibDesk, each of which holds a range of data about every citation and associates it with a specific PDF. I use an AppleScript to save the keywords assigned to these books, articles, etc. to the PDF files as macOS metadata (“BibDesk-MavericksTags” at github.com/derickfay/BibDesk-MavericksTags.git). The script runs the following line to encode tags:


do shell script "xattr -w com.apple.metadata:_kMDItemUserTags '" & plistTagString
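For readers unfamiliar with this attribute, here is a minimal Python sketch of the same idea. It assumes that Finder tags live in the `com.apple.metadata:_kMDItemUserTags` extended attribute as a property-list array of strings, and that `xattr -wx` accepts the value as hex. The function names are hypothetical and this is not the code from BibDesk-MavericksTags; it only builds the command rather than running it.

```python
import plistlib


def build_user_tags_plist(tags):
    """Serialize tag names as a binary property list (an array of strings)."""
    return plistlib.dumps(list(tags), fmt=plistlib.FMT_BINARY)


def xattr_command(path, tags):
    """Return an argv list for writing the tags with `xattr -wx`.

    Assumption: the -x flag tells xattr that the value is given as hex,
    which avoids shell-quoting problems with binary data.
    """
    payload = build_user_tags_plist(tags)
    return [
        "xattr", "-wx",
        "com.apple.metadata:_kMDItemUserTags",
        payload.hex(),
        path,
    ]


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(xattr_command("wolfe.pdf", ["animal_studies"]))
```

On a Mac the resulting list could be passed to `subprocess.run`; whether DEVONthink then picks up the tags is exactly the open question in this thread.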

When I view the files in the macOS Finder, the tags show up in the listing. However, when I index or import the files into DEVONthink, no tags appear. I have found a workaround, which is to open each PDF’s Get Info dialogue in the Finder and add a new (unique) tag. Once I update in DEVONthink, it immediately recognises all of the tags. I also noticed, strangely, that when I add a tag in DEVONthink, it gets placed in a slightly different container:

jkidwell$ diff wolfe_before.txt wolfe_after.txt
37a38,41
> kMDItemOMUserTags              = (
>     "animal_studies"
> )
> kMDItemOMUserTagTime           = 2016-01-30 21:58:33 +0000
44d47
<     "animal_studies",
47a51,53
> kOMUserTags                    = (
>     "animal_studies"
> )

(In the above, I deleted the tag in DEVONthink and then added it back.)
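To compare only the tag-related keys between two such dumps, something like the following Python sketch may help. It assumes the text was produced by `mdls` in the layout shown above (a `key = (` line, one value per line, then `)`), and the function name `tag_attributes` is hypothetical.

```python
import re


def tag_attributes(mdls_text):
    """Map each key containing 'UserTags' in an mdls dump to its values."""
    result = {}
    current = None  # key whose parenthesised list is currently being read
    for line in mdls_text.splitlines():
        match = re.match(r"^(\w+)\s*=\s*(.*)$", line)
        if match:
            key, value = match.groups()
            current = None
            if "UserTags" not in key:
                continue
            value = value.strip()
            if value == "(":
                result[key] = []
                current = key
            elif value == "(null)":
                result[key] = []
            else:
                result[key] = [value.strip('"')]
        elif current is not None:
            item = line.strip().rstrip(",").strip('"')
            if item == ")":
                current = None
            elif item:
                result[current].append(item)
    return result
```

Calling `tag_attributes` on the "before" and "after" dumps and comparing the two dictionaries shows at a glance which container (`kMDItemUserTags`, `kMDItemOMUserTags`, `kOMUserTags`) each tag ended up in.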

At first, I thought that the BibDesk-MavericksTags script might have been embedding tags in a strange way (as above), but comparing mdls output from before and after my Get Info fix in the Finder shows that nothing in the metadata has changed apart from the addition of the new tag:

Before:

After:

jkidwell$ diff rabinow_before.txt rabinow_after2.txt
49c49,50
<     ethnography
---
>     ethnography,
>     ethnography2

This leaves me believing that perhaps there is a database in macOS that needs to be rebuilt in order for these tags to function in DEVONthink. I’d be very happy for any suggestions here, particularly on whether I should modify the BibDesk script slightly to store the tag metadata in a different field.

Oh, and yes, I’ve waited a very long time (a week) after adding tags to make sure macOS has finished reading them, and I have also unchecked “exclude groups from tagging”.

Thanks for the help!

blove · April 18, 2016, 4:55pm

I’m sorry that I cannot provide any help, but I would like to learn more about your workflow. I am attempting to use DTPO, BibDesk and LaTeX to write. I’ve used DTPO for personal paper management, but have not invested in it heavily for professional writing. I’m now at the point of collecting so many research articles, web snippets, book snippets, etc. that I need to put the time into figuring out a workflow that depends on DTPO.

Could you share an overview of your workflow?

Thanks much.

nat · May 17, 2016, 8:49pm

I have used BibDesk for almost a decade (my master BibTeX file has over 20,000 entries). I have just started experimenting with DEVONthink in the past few days. One of my priorities was to figure out how to integrate BibDesk with DEVONthink, and I explained how I have done that in a separate thread titled How to index individual BibDesk entries in DEVONthink.

Workflow will be different for each person, and of course should follow whatever may be standard in your field of work. This is my workflow, approximately:

As a general-purpose journal and note card file and GTD-style project management center, I used to use Journler, but since development of that program has been abandoned for as many years as I’ve been using it, I am now experimenting with DEVONthink as a replacement, and it looks very promising. It was fairly easy to export my management system from Journler and import it into DEVONthink, with a few intermediate modifications of all the files using BBEdit.

I do my long-form writing (anything longer than a blog post) in Scrivener, using MultiMarkdown-flavored Markdown (or Pandoc-flavored Markdown, which has more features). I use Scrivener’s Compile command to export to a single Markdown file. Then I use Pandoc to convert the Markdown file to whatever format is needed for the project: usually LaTeX, or Microsoft Word format for import into Adobe InDesign.

Citations are handled with cite keys from BibDesk. I have one master BibTeX file that contains most of the references I’ve ever cited (and more), with all the citations for particular projects organized into corresponding folders in BibDesk (although for some projects I also save the references in separate BibTeX files).

Any references that I can acquire from library catalogs are imported into BibDesk through BibDesk’s Search Groups; most other references are scraped from the web using Zotero and automatically shunted to BibDesk via Zot2Bib.

I read and annotate PDF files using Skim. Although it is of course possible to link PDF files to BibDesk entries, I don’t do so, because when I started using BibDesk ten years ago this made BibDesk’s search unacceptably slow on my computer. Instead I just file away the PDF files in the Finder and put all the relevant quotes from each reference in the Abstract field of the corresponding entry in BibDesk.