DT3 JSON Parser

Bernardo_V · July 8, 2019, 6:34pm

Theoretically, could I take any JSON string and teach DT3 how to parse it via AppleScript? At least, that is what seems to be happening in the smart rule script ‘download pdf metadata’.

I am curious (i.) if it could be made to parse other JSON strings not necessarily formatted like the one downloaded from the DOI link and (ii.) if it could be made to parse BibTeX entries.

BLUEFROG · July 8, 2019, 6:51pm

It’s not a matter of “teaching it to parse” anything. The JSON either has elements or it doesn’t. If you have valid JSON data returned, you should be able to access elements from it (though that doesn’t always mean it will be a simple scrape).

Here is a simple example from an API I have never used…

tell application id "DNtp"
	set r to download JSON from "https://www.metaweather.com/api/location/44418/"
	title of r
end tell
--> "London"

Bernardo_V · July 8, 2019, 8:33pm

Thanks, Jim @BLUEFROG. I assume that for the download JSON feature to work you would need an API?

Is there any other way to get bibliographical metadata into DT3? Can it be made to read/parse/whatnot bibtex entries?

rkaplan · July 8, 2019, 8:43pm

Generalizing this a bit - is there any way, through either a script or a built-in feature, to import CSV data into DT3? Just about any bibliographic program (and many many other forms of data) can be exported by CSV, so this would be a really nice script or feature to have.

BLUEFROG · July 8, 2019, 10:09pm

I assume that for the download JSON feature to work you would need an API?

That would be a reasonable assumption to make since it’s utilizing a URL as a parameter.

is there any way, through either a script or a built-in feature, to import CSV data into DT3?

You can import CSV files into DEVONthink already. They import as sheets.

Can it be made to read/parse/whatnot bibtex entries?

No, it doesn’t natively parse the data. This would be up to the individual to code.
However, depending on the requirements, this could be a trivial thing to do… or a very difficult one.

I would actually say this is a perfect instance when you should ignore what you think should happen or what you think a computer should do. For example, you should ignore column headings in scripting. They function as context for a person, not so much for a script.

Bernardo_V · July 8, 2019, 10:53pm

Let us say that I have an entry with the following:

@book{2000nicomachean,
  title={Nicomachean Ethics},
  author={Aristotle},
  translator={Terrence Irwin},
  isbn={9781603845687},
  series={},
  url={https://books.google.com.br/books?id=-Mf5XV8q6CgC},
  year={2000},
  publisher={Hackett Publishing Company},
  place={Indianapolis}
}

What I need is a way to get these values into the proper variables.
Hence:

set the_author to author (from the BibTeX entry)
set the_title to title (from the BibTeX entry)
and so on

Can you think of a way to do this with AppleScript? I thought of using RegEx at first, but I still lack the proper skills to do it.

BLUEFROG · July 9, 2019, 5:58am

That’s neither CSV nor JSON data.
Is that actual output?

cgrunenberg · July 9, 2019, 7:52am

DEVONthink can actually import *.bib files which are displayed as read-only sheets (supports both table & form view). Therefore a clumsy workaround might be to import such a file and retrieve the necessary data via AppleScript from the sheet and to remove the item again from the database.

cgrunenberg · July 9, 2019, 7:53am

That’s definitely doable. For online resources I’d suggest using the “download JSON from” command. Otherwise it’s a little bit more complicated and requires the Foundation framework; see e.g. Parsing JSON files - AppleScript | Mac OS X - MacScripter

rkaplan · July 9, 2019, 9:51am

The BibTeX import is very interesting. Is there a way for it not to be read-only? Among other reasons, I cannot edit a column to be a URL column and thus activate hyperlinks.

And again, sorry to repeat this over and over, but we really need a “wrap text” feature because it is impossible to read a long text field such as an abstract. Since this is intended for bibliographic entries, that truly is an essential feature.

cgrunenberg · July 9, 2019, 9:53am

No, DEVONthink doesn’t support editing or exporting of *.bib files.

rkaplan · July 9, 2019, 10:05am

Can it simply be converted to a regular Sheet?

cgrunenberg · July 9, 2019, 10:18am

Select all rows of the sheet (Cmd-A doesn’t seem to work due to a bug), copy them to the clipboard and press Cmd-N. This should create a new sheet.

rkaplan · July 9, 2019, 10:42am

Thanks that works.

Also FWIW I realized that it is possible to open a link in a “non URL” column in a read-only BibTeX sheet using Alfred. If it is installed, it recognizes the text as a link and presents the option to open it.

Bernardo_V · July 9, 2019, 1:27pm

I am not sure how to reply. Indeed, it is neither CSV nor JSON; it is LaTeX. All BibTeX files that you’ll come across on sites like JSTOR, Google Scholar, Google Books and so on are just plain text files written in this way. It is remarkably similar to JSON, which makes me think it shouldn’t be too hard to do the same thing with it as what it is already doing with the JSON metadata files.

Could you perhaps outline the steps that would be involved in this? It would help me understand how to attempt it.

jongilizwe · July 9, 2019, 1:58pm

Having recently gone through this process moving BibTeX into FileMaker, I’d advise you not to reinvent the wheel. There are lots of BibTeX parsers in different languages out there, with mature codebases (10+ years), but by far the easiest route is to use BibDesk and create a CSV export template, then parse that.

BLUEFROG · July 9, 2019, 2:19pm

It is remarkably similar to JSON, which makes me think it shouldn’t be too hard to do the same thing with it as what it is already doing with the JSON metadata files.

True, and here’s a caution:
Don’t let that distract you from the fact that it’s not JSON data. Parsing JSON - or any format - is a specific thing. So a parser isn’t trained (unless you’re writing your own ecumenical parsing engine). It just parses what it is written to parse.
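
To make “written to parse” a bit more concrete, here is a minimal AppleScript sketch of a purpose-built extractor for the flat key={value} fields in the BibTeX entry posted earlier in this thread. The handler name bibtex_field is hypothetical, and the sketch assumes no spaces around the equals sign and no nested braces inside a value, neither of which real BibTeX guarantees.

```applescript
-- Hypothetical helper: return the value of one key={value} field, or "" if absent.
-- Assumptions: no spaces around "=", no nested braces inside the value.
-- Caveat: asking for "title" would also match a "booktitle={...}" field.
on bibtex_field(fieldName, entryText)
	set oldDelims to AppleScript's text item delimiters
	set theValue to ""
	try
		-- Split on "name={"; the second piece starts with the value.
		set AppleScript's text item delimiters to (fieldName & "={")
		set theParts to text items of entryText
		if (count of theParts) > 1 then
			-- The value runs up to the next closing brace.
			set AppleScript's text item delimiters to "}"
			set theValue to text item 1 of (item 2 of theParts)
		end if
	end try
	-- Always restore the delimiters, even after an error.
	set AppleScript's text item delimiters to oldDelims
	return theValue
end bibtex_field

set theEntry to "@book{2000nicomachean,
  title={Nicomachean Ethics},
  author={Aristotle},
  year={2000}
}"
set the_title to bibtex_field("title", theEntry)
--> "Nicomachean Ethics"
set the_author to bibtex_field("author", theEntry)
--> "Aristotle"
```

It only understands this one layout. Hand it JSON, CSV, or BibTeX written as title = "..." and it returns an empty string, which is the point being made above.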

BLUEFROG · July 9, 2019, 2:21pm

Depending on what you’re trying to accomplish, this may or may not be the case. Also, parsing CSV can be less intuitive than it would appear since it isn’t constructed in a manner that conforms to what we see and how we think.
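
One common instance of this (not necessarily the one meant above) is a quoted field that contains a comma. The sketch below uses a made-up sample line and splits it the naive way, with text item delimiters.

```applescript
-- Made-up sample line: the first field is quoted because it contains a comma.
set theLine to "\"Ethics, Nicomachean\",Aristotle,2000"

-- Naive approach: split on every comma.
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to ","
set theFields to text items of theLine
set AppleScript's text item delimiters to oldDelims

-- A person sees three columns, but the naive split yields four items,
-- because the comma inside the quotes was treated as a separator too.
set fieldCount to count of theFields
--> 4
```

Importing the file as a sheet and reading its cells, as discussed elsewhere in this thread, leaves the splitting to DEVONthink's own importer instead of hand-written code.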

cgrunenberg · July 9, 2019, 2:23pm

For either a *.bib file or a CSV file exported e.g. by BibDesk, the necessary steps would look like this:

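tell application id "DNtp"
	-- Pick a BibTeX or CSV file and import it; both arrive as a sheet.
	set theFile to choose file of type {"bib", "csv"}
	set theItem to import (POSIX path of theFile) to current group
	set theColumns to columns of theItem
	set theCells to cells of theItem
	-- Find the "author" column by name (0 means it was not found).
	set theAuthorColumn to my list_position("author", theColumns)
	if theAuthorColumn > 0 then
		repeat with theRow in theCells
			-- Skip rows too short to have a cell in that column.
			if (count of theRow) ≥ theAuthorColumn then
				set theAuthor to item theAuthorColumn of theRow
				display dialog theAuthor
			end if
		end repeat
	end if
	-- Remove the temporary sheet from the database again.
	delete record theItem
end tell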

on list_position(this_item, this_list)
	repeat with i from 1 to the count of this_list
		if item i of this_list is this_item then return i
	end repeat
	return 0
end list_position

jongilizwe · July 9, 2019, 6:35pm

True. I haven’t experimented with CSV import into DT3 as I’m happy with the combo of BibDesk on Mac and FileMaker Go on iOS for the time being.

My point though was that it will almost certainly be easier to use an existing parser to get the data into a form DT3 can read natively than to try to modify a JSON parser to work with BibTeX.
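
As a rough sketch of that last step: once some parser has produced the field values, writing them out as CSV is mostly a matter of quoting. The handler names csv_quote and csv_row are hypothetical. The quoting follows the usual CSV convention (wrap each value in double quotes and double any embedded quotes), and it is an assumption that DEVONthink's CSV import reads such quoted fields the way a spreadsheet would.

```applescript
-- Hypothetical helper: wrap a value in double quotes, doubling any embedded quotes.
on csv_quote(theText)
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "\""
	set theParts to text items of theText
	set AppleScript's text item delimiters to "\"\""
	set theEscaped to theParts as text
	set AppleScript's text item delimiters to oldDelims
	return "\"" & theEscaped & "\""
end csv_quote

-- Hypothetical helper: join a list of values into one comma-separated line.
on csv_row(theValues)
	set theQuoted to {}
	repeat with aValue in theValues
		set end of theQuoted to csv_quote(aValue as text)
	end repeat
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to ","
	set theLine to theQuoted as text
	set AppleScript's text item delimiters to oldDelims
	return theLine
end csv_row

-- Header row plus one data row, using values from the entry posted in this thread.
set theHeader to csv_row({"author", "title", "year"})
set theData to csv_row({"Aristotle", "Nicomachean Ethics", "2000"})
set theCSV to theHeader & linefeed & theData
```

Quoting every field is more than strictly necessary, but it means values containing commas do not need special handling.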