I have a bunch of plain-text news articles from LexisNexis. I have an R script that will let me read them in as data, so then I can export them - and all related metadata like author, etc - into whatever format I want.
I’m trying to figure out the best way to format these files to get them into DEVONthink, because I really want to take advantage of DEVONthink’s file tools to highlight and track whatever I think is interesting.
I’ve only had DEVONthink for a few months and I’m not sure where to get started here. I don’t know a lick of AppleScript or JavaScript, but I did see that smart rules can search for text inside a document. Could I format the text file with key:value pairs for easy metadata filling?
Any help, such as links to resources, would be appreciated. I have searched the forum some, and all signs really point to me learning AppleScript, but I also want to get started on this project while I learn it, so thank you all in advance.
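For what it's worth, the key:value idea could look something like the sketch below on the export side. It is in Python rather than R purely for illustration, and everything in it is an assumption: the field names (Title, Author, Date), the helper names `format_article` and `write_article`, and the layout of one "Key: value" line per field followed by a blank line and then the article body. Whether DEVONthink picks any of this up depends on the smart rule or script paired with it, which is not shown here.

```python
from pathlib import Path


def format_article(meta: dict, body: str) -> str:
    """Return the article as plain text with a 'Key: value' header block.

    Each metadata value is collapsed onto a single line so that a
    line-based text search can find it reliably.
    """
    header_lines = []
    for key, value in meta.items():
        one_line = " ".join(str(value).split())  # collapse newlines and runs of spaces
        header_lines.append(f"{key}: {one_line}")
    header = "\n".join(header_lines)
    return f"{header}\n\n{body.strip()}\n"


def write_article(folder: Path, filename: str, meta: dict, body: str) -> Path:
    """Write one article to `folder` and return the path of the new file."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / filename
    path.write_text(format_article(meta, body), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical record; real field names would come from the export script.
    example_meta = {
        "Title": "Example headline",
        "Author": "Jane Doe",
        "Date": "2021-03-04",
    }
    print(format_article(example_meta, "First paragraph of the article."))
```

Keeping every field on one line, with the same label each time, is what would make a pattern such as "Author: *" predictable to search for later.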
How exactly should it come from there? You use source to set the “Author” field, but you do nothing (at least not that I can see) to set the company field. DT might provide some AI, but I doubt that it goes this far.
That’s fair, I had the wrong field selected. I fixed it, but nothing shows in the company field after running the rule. I’m curious whether I’m getting the concept right, because it’s not working and I’m not sure what I’m doing wrong, other than posting confusingly.
Thank you for looking, @chrillek. Unfortunately it’s single-line text. I think that’s just a string. I feel like I need to RTFM but I did. I feel like I need to R_a_different_FM. Anyone got a suggestion?
I did that and confirmed it’s “1” when I run defaults read com.devon-technologies.think3 IndexRawMarkdownSource.
It’s still reading the second date in the file as the document date. Is this expected behavior? While I’m here: I don’t need to be fancy about Markdown, so is there a better format for getting a batch of text files with metadata into DEVONthink?
I am following this thread with interest because (as some here will know) I’ve had fun trying to extract dates from documents to feed into custom metadata.
In the current case would Scan Text > Date: * followed by use of the Document String placeholder achieve what @grantfan needs?
Edit: Ah - but then some manipulation would be required (by script?) to turn that into a date format: bother!
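If a script does turn out to be necessary, the manipulation itself is small. Here is a rough sketch in Python, assuming the captured string looks like "Date: March 4, 2021" or one of a couple of similar layouts. The function name `parse_captured_date` and the list of candidate formats are made up for illustration, and the real LexisNexis date layout may well differ. This would run outside DEVONthink, for example at export time; doing the same thing inside a smart rule would need AppleScript or JavaScript, which is not shown.

```python
from datetime import date, datetime
from typing import Optional

# Assumed layouts; extend this tuple to match what the files actually contain.
# Note that %B (full month name) depends on the current locale.
CANDIDATE_FORMATS = ("%B %d, %Y", "%d %B %Y", "%Y-%m-%d")


def parse_captured_date(captured: str) -> Optional[date]:
    """Turn a captured string such as 'Date: March 4, 2021' into a date.

    Returns None when none of the candidate formats match.
    """
    text = captured.strip()
    # Drop a leading 'Date:' label if the capture included it.
    if text.lower().startswith("date:"):
        text = text[len("date:"):].strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # try the next layout
    return None


if __name__ == "__main__":
    parsed = parse_captured_date("Date: March 4, 2021")
    print(parsed.isoformat() if parsed else "no match")
```

Returning None instead of raising leaves it to the caller to decide what to do with a document whose date could not be read, such as flagging it for a manual check.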