I find myself wanting Devonthink to extract the DOIs inside a PDF file automatically and convert them to Devonthink metadata, mainly the PDF title and the PDF authors. The Highlights app picks those up, but the only thing it exports to Devonthink is an HTML file with the same title as the PDF file. This doesn’t solve my wish to have the ‘Title’ field in Devonthink populated with the actual title of the article. Furthermore, inside the Devonthink database the ‘Name’ field has nothing to do with the file name … and can be populated. Yet there is no way to show the actual file name as one of the columns, other than seeing it as a component of the ‘Path’. I’ve added all the columns, and because there is no horizontal scrollbar I have to widen the window beyond the screen to see them all… none of them is the actual file name (other than as a component of ‘Path’).
So I find myself wondering how anyone has managed to get a simple workflow for basic PDF file management… such as keeping Title and Authors in the Devonthink metadata, without doing this manually every time a new PDF is added to the database…
I’d use a dedicated bibliography manager for the data extraction, then a script or folder action to import the data into DT. Also, last time I checked there were a few Alfred workflows for DOI-based searching.
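For what it's worth, here is a rough Python sketch of the "script" half of that idea. It assumes the PDF text has already been extracted by some other tool, and it only shows how DOIs could be pulled out of that text and how a Crossref-style JSON record could be reduced to a title and an author string. The function names are made up for illustration, the DOI pattern follows Crossref's published recommendation for modern DOIs, the actual network lookup is left out, and writing the result into DEVONthink is not shown.

```python
import re
from urllib.parse import quote

# Pattern based on Crossref's recommendation for matching modern DOIs.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def find_dois(text):
    """Return the unique DOIs found in text, in order of first appearance."""
    seen = []
    for match in DOI_PATTERN.finditer(text):
        # Drop trailing sentence punctuation that the pattern may have swallowed.
        doi = match.group(0).rstrip(".,;)")
        if doi not in seen:
            seen.append(doi)
    return seen


def crossref_url(doi):
    """Build the Crossref REST API URL for one DOI (no request is made here)."""
    return "https://api.crossref.org/works/" + quote(doi, safe="/")


def title_and_authors(record):
    """Reduce a Crossref-style 'message' dict to (title, authors)."""
    titles = record.get("title") or [""]
    names = []
    for person in record.get("author", []):
        parts = (person.get("given"), person.get("family"))
        full = " ".join(p for p in parts if p)
        if full:
            names.append(full)
    return titles[0], "; ".join(names)


if __name__ == "__main__":
    # Made-up sample text and record, for illustration only.
    sample = "Published online. doi:10.1234/example.5678. All rights reserved."
    for doi in find_dois(sample):
        print(doi, "->", crossref_url(doi))
    record = {"title": ["A Sample Title"],
              "author": [{"given": "Ada", "family": "Lovelace"}]}
    print(title_and_authors(record))
```

The two strings returned by `title_and_authors` are what a folder action or import script would then hand over to DT.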
I’m able to get Highlights to generate an HTML file.
The best I’ve done so far is to take the HTML file generated by the Highlights app and merge it with the PDF file into one document. This creates an entry with a clearly searchable title, but it doesn’t populate the Title field in the DT database.
The other thing I want to do is to do something like:
Read 1st line of PDF, dump it into the Title field.
Read 2nd line of PDF, dump it into the Authors field.
If there was a way to open the PDF, read the first line, and dump it into the Title field via a script… then I think this would work. The kind of stuff I find myself reading the most is papers from OpticsInfoBase, and those articles always have the title as the first line of the content and the authors as the 2nd line in the PDF.
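A minimal Python sketch of that first-line/second-line idea, assuming the PDF's text has already been extracted to a plain string by some other tool (that step, and setting the fields in DT, are not shown). The function name is hypothetical.

```python
def first_two_lines(text):
    """Return (title, authors) from the first two non-blank lines of text."""
    # Collapse runs of whitespace inside each line, then drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    lines = [line for line in lines if line]
    title = lines[0] if len(lines) > 0 else ""
    authors = lines[1] if len(lines) > 1 else ""
    return title, authors


if __name__ == "__main__":
    # Made-up sample text, for illustration only.
    sample = "\n  Sample Paper   Title \nA. Author, B. Writer\nAbstract: ..."
    print(first_two_lines(sample))
```

This only holds up as long as the layout does: a title that wraps onto two lines, or a journal header printed above it, would end up in the wrong field.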
These two words lead to problems. “Always” makes it easier to script.
Also, just as an FYI - dealing with PDFs under-the-hood is a challenge in itself. It’s not that it can’t be done, but that you will find many custom tools for getting at the underlying data (which also means they’re not going to be installed on your Mac by default).
It should be possible to edit the Title field for PDFs in DEVONthink via Tools Menu > Show Properties. Which of the 7 property metadata fields are editable varies (from none to most) based on document type, but Title is one of the editable fields for PDFs.
True. In both View as Icons and View as List, the Properties panel neither shows the metadata for a PDF, nor allows that data to be edited. The other Views seem to be OK. Didn’t test in the case of non-PDF documents.