File format for import

Harry_B · July 16, 2023, 12:59pm

I am looking for a way to import files I can generate myself (with data coming from another system), such as JSON files, where I can pass along as much information as possible, like

creation date/time
tags
geo location
and these notes also contain media files, like references to images and videos.

I would like to create these files myself and then throw all of them into DT at once.

Any idea how this could be done, which file format would work for this purpose and is this documented somewhere?

chrillek · July 16, 2023, 1:15pm

This question is so general that I feel tempted to reply with “yes, it’s possible”. More so since DT imports whatever you want it to import, JSON included. Another question entirely is whether you can put those data to use in DT, i.e. whether they are searchable.

If you’re thinking about semi-structured data like JSON, why not use a NoSQL database for it?

Harry_B · July 16, 2023, 3:14pm

The reason I ask is that I don’t see any obvious options in the import menu, and I can’t tell which data structures are expected for things like the note creation timestamp, geolocation and so on.

Am I missing some part of the documentation somewhere?

chrillek · July 16, 2023, 3:18pm

Markdown allows for metadata in the front matter (cf. the MultiMarkdown documentation).

BLUEFROG · July 16, 2023, 3:34pm

Why are you wanting to import JSON files? That’s not a format for general use.

Harry_B · July 16, 2023, 4:11pm

Maybe I didn’t explain it well enough. I want to create RTF notes automatically, but I also want to bring along the additional properties and media I mentioned above.
The content of the note is not JSON; it will contain just plain text, preferably formatted with headlines and text blocks.

Like this example:
{
  "title": "my title",
  "content": "here comes content, maybe with formats?",
  "geo-location": "52° 30' 58.32 N 13° 22' 39.72 E",
  "tags": ["tag1", "tag2"],
  "date created": "2021-03-14Z12:34:14",
  "date updated": "2023-07-16Z18:14:34"
}

BLUEFROG · July 16, 2023, 4:26pm

Using JSON as a vector for creating rich text files is non-standard, and there’s no direct support for it. (It’s actually the first time I can recall anyone even mentioning it.)

It would be possible via scripting but again, there’s nothing built-in for this.

However, it also feels like killing a fly with a shotgun. I’d suggest you read the Help > Tutorials > Using Templates tutorial, as creating and using templates is a fairly simple affair.

chrillek · July 16, 2023, 4:28pm

To use that in DT as you described, you’ll have to write a script. It must extract the metadata, create an RTF record, and set its content and metadata. Feasible, but maybe not practical.

Harry_B · July 16, 2023, 4:33pm

I would have to create several thousand notes this way, so I was thinking of something along the lines of an ENEX import, which finishes within a very short amount of time.
Doing this via scripting sounds like a very slow and cumbersome approach.

BLUEFROG · July 16, 2023, 4:43pm

Why? Please clarify what you’re actually doing and what “other system” you’re referring to in your initial post.

chrillek · July 16, 2023, 4:46pm

I already said that MultiMarkdown allows you to include metadata. Why not use that then?
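
As a rough illustration of the MultiMarkdown suggestion, here is a minimal Python sketch that renders one entry shaped like the JSON example above as a Markdown file with a metadata header. The key names `Tags` and `Location` and the function name `to_multimarkdown` are assumptions made for illustration; MultiMarkdown accepts arbitrary keys, and nothing in this thread confirms which keys, if any, DEVONthink maps onto its own record properties.

```python
def to_multimarkdown(post: dict) -> str:
    """Render one post as Markdown with a MultiMarkdown metadata header."""
    meta = {
        "Title": post["title"],
        "Date": post["date created"],
        # "Tags" and "Location" are hypothetical key names.
        "Tags": ", ".join(post.get("tags", [])),
        "Location": post.get("geo-location", ""),
    }
    lines = []
    for key, value in meta.items():
        # Collapse whitespace so every value stays on a single line.
        value = " ".join(str(value).split())
        if value:
            lines.append(f"{key}: {value}")
    # The metadata block starts on the first line of the file and is
    # terminated by a blank line; the note body follows.
    header = "\n".join(lines)
    return f"{header}\n\n# {post['title']}\n\n{post['content']}\n"


post = {
    "title": "my title",
    "content": "here comes content, maybe with formats?",
    "geo-location": "52° 30' 58.32 N 13° 22' 39.72 E",
    "tags": ["tag1", "tag2"],
    "date created": "2021-03-14Z12:34:14",
}
markdown = to_multimarkdown(post)
print(markdown)
```

Writing one such string per note to a `.md` file would give a folder that can be imported in one go; how much of the header ends up searchable is something to check with a small trial.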

Harry_B · July 16, 2023, 5:33pm

I am looking for a way to archive data from an old forum in a way I can still access its content properly.
I have a huge JSON file containing one entry per post, and each entry has the above-mentioned metadata and also contains photos.

I can develop a parser to cut the big JSON structure into small pieces and, of course, also transform the format.
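
The splitting step could look roughly like the Python sketch below, which groups posts by thread and writes one small JSON file per thread. The `thread_id` field and the file naming are assumptions; the real export may identify threads differently.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def group_by_thread(posts: list[dict]) -> dict[str, list[dict]]:
    """Group posts by thread and order each thread chronologically."""
    threads = defaultdict(list)
    for post in posts:
        # "thread_id" is a hypothetical field name.
        threads[str(post["thread_id"])].append(post)
    for entries in threads.values():
        # "YYYY-MM-DDZhh:mm:ss" strings sort chronologically as plain text.
        entries.sort(key=lambda p: p["date created"])
    return dict(threads)


def write_threads(threads: dict[str, list[dict]], out_dir: Path) -> list[Path]:
    """Write one JSON file per thread and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for thread_id, entries in threads.items():
        path = out_dir / f"thread-{thread_id}.json"
        path.write_text(
            json.dumps(entries, ensure_ascii=False, indent=2), encoding="utf-8"
        )
        written.append(path)
    return written


# Tiny stand-in for the big export file.
posts = [
    {"thread_id": 1, "title": "reply", "date created": "2021-03-15Z08:00:00"},
    {"thread_id": 2, "title": "other", "date created": "2021-03-14Z09:00:00"},
    {"thread_id": 1, "title": "first", "date created": "2021-03-14Z12:34:14"},
]
threads = group_by_thread(posts)
paths = write_threads(threads, Path(tempfile.mkdtemp()))
print([p.name for p in paths])
```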

BLUEFROG · July 17, 2023, 12:45am

What would be the resulting file format in DEVONthink?

Harry_B · July 17, 2023, 7:32am

I would like to create one RTF note per thread. The note can contain media files attached though.

rmschne · July 17, 2023, 7:57am

Python has extensive capabilities for parsing JSON data. Perhaps you, or someone you hire, could write some code to do this. At first glance it does not seem too difficult, since you know exactly what you want. Just an idea.

Harry_B · July 17, 2023, 8:05am

I am not struggling with parsing the data I have; what I don’t know is which data format is required to import that data in bulk.

rmschne · July 17, 2023, 8:14am

You can import pretty much any data into DEVONthink. But DEVONthink does not have unlimited capabilities to present (Preview) a file’s content. It also (probably) does not have unlimited capabilities to index files that the app does not know how to Preview. DEVONthink has built-in capabilities, and the app also relies on macOS services. The DEVONthink documentation and other experts can comment on this, of course.

If you can parse the data, that’s a great first step. Your approach of creating RTF files seems OK, as you say you have attachments, and RTF can handle attachments and is previewable in DEVONthink. So, do some trials and see what happens. Parse a few records, some with attachments, into RTF and then import.
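
For such a trial, a minimal Python sketch of the text part might look like this. It writes a bare-bones RTF document with a bold title and a body, escaping the characters RTF treats specially. The font choice and the function names are assumptions, and attachments are deliberately left out.

```python
def rtf_escape(text: str) -> str:
    """Escape text for use inside an RTF document."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)      # RTF control characters
        elif ch == "\n":
            out.append("\\par\n")      # paragraph break
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit, where N is a
            # signed 16-bit decimal and "?" is the fallback character.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append(f"\\u{unit}?")
    return "".join(out)


def to_rtf(title: str, body: str) -> str:
    """Build a minimal RTF document: bold 18 pt title, 12 pt body."""
    return (
        "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Helvetica;}}\n"
        f"\\f0\\b\\fs36 {rtf_escape(title)}\\b0\\par\n"
        f"\\fs24 {rtf_escape(body)}\\par\n"
        "}\n"
    )


rtf = to_rtf("my title", "Grüße from {the forum}\nsecond paragraph")
print(rtf)
```

Saving the result with a `.rtf` extension and importing a handful of files should show quickly whether the formatting survives.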

Harry_B · July 17, 2023, 8:32am

After looking into the topic in more detail and following the discussion here, I think the easiest way is to convert the data to ENEX and import that. That would probably get the bulk import going without much hassle.
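
A conversion along those lines could be sketched in Python as follows, assuming entries shaped like the JSON example earlier in the thread. Several things here are assumptions: that the source timestamps are UTC, that the geolocation is always in the degrees/minutes/seconds form shown, and that the importer accepts XML-escaped note content rather than a CDATA section (the standard library's ElementTree cannot emit CDATA). Media resources are left out.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime
from html import escape

ENML_HEAD = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd">'
)
DMS = re.compile(r"(\d+)°\s*(\d+)\D\s*([\d.]+)\s*([NSEW])")


def enex_time(stamp: str) -> str:
    """Convert '2021-03-14Z12:34:14' to '20210314T123414Z' (assumes UTC)."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dZ%H:%M:%S")
    return parsed.strftime("%Y%m%dT%H%M%SZ")


def dms_to_decimal(text: str) -> dict:
    """Parse degrees/minutes/seconds pairs into signed decimal degrees."""
    coords = {}
    for deg, minutes, seconds, hemi in DMS.findall(text):
        value = int(deg) + int(minutes) / 60 + float(seconds) / 3600
        if hemi in "SW":
            value = -value
        coords["latitude" if hemi in "NS" else "longitude"] = value
    return coords


def build_enex(posts: list[dict]) -> ET.Element:
    """Build an <en-export> tree with one <note> per post."""
    root = ET.Element("en-export")
    for post in posts:
        note = ET.SubElement(root, "note")
        ET.SubElement(note, "title").text = post["title"]
        body = "".join(
            f"<div>{escape(line)}</div>" for line in post["content"].splitlines()
        )
        ET.SubElement(note, "content").text = (
            f"{ENML_HEAD}<en-note>{body}</en-note>"
        )
        ET.SubElement(note, "created").text = enex_time(post["date created"])
        ET.SubElement(note, "updated").text = enex_time(post["date updated"])
        for tag in post.get("tags", []):
            ET.SubElement(note, "tag").text = tag
        coords = dms_to_decimal(post.get("geo-location", ""))
        if len(coords) == 2:
            attrs = ET.SubElement(note, "note-attributes")
            for key in ("latitude", "longitude"):
                ET.SubElement(attrs, key).text = f"{coords[key]:.6f}"
    return root


def to_enex_string(root: ET.Element) -> str:
    """Serialize the tree with an XML declaration in front."""
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(
        root, encoding="unicode"
    )


sample = [{
    "title": "my title",
    "content": "here comes content, maybe with formats?",
    "geo-location": "52° 30' 58.32 N 13° 22' 39.72 E",
    "tags": ["tag1", "tag2"],
    "date created": "2021-03-14Z12:34:14",
    "date updated": "2023-07-16Z18:14:34",
}]
enex_root = build_enex(sample)
print(to_enex_string(enex_root))
```

It is worth importing a file with two or three notes first and checking that dates, tags and location arrive as expected before converting the whole archive.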

tja · July 23, 2023, 9:06am

ENEX is just XML, right?

JSON is a more modern file format, so I don’t really understand what this would gain you.

For using DT / DTTG you need other formats anyway, like Text, Markdown, RTF or PDF.
This means, you need to extract this content anyway!

Converting from JSON to ENEX is unnecessary, I believe.

I just did something similar with JSON exports from the Drafts app, to get Markdown files out of them. I needed “miller” (mlr) and “jq” to accomplish the conversion from JSON to Markdown in a bash script, but it worked great!

chrillek · July 23, 2023, 9:56am

And the same could easily be accomplished with JavaScript and osascript on the command line.