Make an Annotation with Links, Notes, Tags v2

Damn. Wish I’d known this before using Highlights to annotate a doctoral thesis and a full-length book for a project I’m working on. Something like this capability should be baked into Devonthink. Thanks, Frederiko & Korm, for a lovely piece of scripting.

Actually, I think you might enhance your workflow by using a Highlights-created text as a reference file, as you’re deciding which parts of your annotated PDFs you want to tag w/ this script. Just a thought… That’s how I plan to use both of them together!

That’s what I’ve done in retrospect :slight_smile:

Just wondering if there’s a way to de-select the default checked box for “Set annotation date”? I’m not sure I understand why it’s the default option… I guess that if one were to instantly create tags for a document the moment that it comes out it would be useful – is that the point? In my case, I often annotate documents that are sometimes a bit dated, so that feature doesn’t benefit me. Unless I’m missing some point, which is entirely possible!

Also, is there another, easier way to select multiple tags under the main Annotations tags? Like you @Frederiko, I also have a “followup” tag (as you have in your illustrated example) along w/ a few other more specific research-related tags, so…I would love to figure out how to include these various tags, apart from having to write out each tag name, separated by a semi-colon (IF possible!).

Would still love any kind of input from my last post, if that’s possible! :smiley: Many thanks…

Smart groups or advanced search, but maybe I am not understanding the questions properly.

In my workflow the dates of documents are crucial, and I want my annotations to have the same dates as the documents they refer to. That way a smart group for a particular time period also shows the annotations for the same time period.

It’s easy to change. Use “Show package contents” on the script file to open the directory showing the files in the script bundle. Find the file called “main.scpt”, open it in Script Editor, and change the line “adjustDate.default = 1” to “adjustDate.default = 0”.

You can also change the name of the sub group tags by editing these lines:

property groupTag1 : "Issues"
property groupTag2 : "People"

No, it’s not possible. Autocomplete only works on the first tag. The only reasonable alternative would be to have several tag fields, in the same way as sub group tags are handled.

For various reasons, including the type of documents that I work with and the fact that my annotations lacked sufficient context when I returned to them, I have moved to almost exclusively using the version of the script that clips a whole page or more from the PDF. I can then mark up the clipped page with the highlight and PDF annotation tools. That way I see the annotated passage in context and the full document is one click away. It’s definitely a more useful approach if you are working with documents, such as historical manuscripts or diagrams, where OCR is not very useful and where one or two lines of description do not fully convey the complexity of the annotation.


So basically, I was just wondering how best to run an advanced search – one that would be based on just select tags (to find files based on convergence of select People / Issues). Make sense?

I hear you, and I totally appreciate that. For me, I typically save (or convert) my files as PDF so that I can annotate them. So, the date of the original document (i.e., its publication date) doesn’t necessarily match the date the document was converted / saved as a PDF…and as wonderful as your script is, I don’t think it would be able to make that distinction, correct?

Anyway, thank you for your suggested fixes! That’s really helpful. I assume that there are particular reasons why the dates of documents are crucial to your workflow…

That’s also super useful… Is there any easy way to permanently keep the “None” tag from re-emerging?

Gotcha. That would be useful, but I assume that would mean overhauling the script to accommodate more tag fields, right…? I’m not suggesting that – just wondering.

I’m not sure which version of the script this refers to…and if it suggests I’ve been using the wrong version! I hope not. I think the version I’m using comes from this part of the post:

Put up Example page

Are you referring to some other script? If so, does this other script you’re referring to still comport with the ‘one thought - one notecard’ approach (as Bill de Ville described it)? I ask because it seems like you’re suggesting a move from a more granular tagging system (i.e., the one here) to one that includes using large chunks of text (in order to provide a fuller context). Is that correct?

i’ve just tried using this script and like it very much. i’m wondering if there’s a way to fix the way that the script captures quotes. i work as a reporter and have a fast-pace work environment, so it’s a bit of work to change every quote and apostrophe. thx.

Which script – this thread is so old and hoary that it’s not obvious what you’re referring to – sorry :confused:

Em … doesn’t it? My script and Frederiko’s each capture the selected text and insert it into the annotation file. Could you be specific on what you want?

I am guessing you are referring to the fact that the quotation is not always accurately quoted. This can be a problem especially with pdfs where the underlying text layer is malformed. The script needs to be able to comprehensively clean up underlying text but this can be pretty complex because there are so many possible permutations. Some day I might tackle it.

Perhaps if you post a screenshot of some of the original text and the generated quotation text I can see if there is an easy fix. No promises.


thx v. much. just so you know, it seems that whenever i use the script apostrophes, quotes, ellipses and sometimes m-dashes always seem to get mangled. but it happens w/ every pdf doc i use, and i read disparate articles w/ different formatting - in other words, the script is performing this way across the board w/ quotes from every pdf.

i’m sorry to trouble people about this, because i’m sure it takes much time and effort to create such a script. but the thing is, i heavily rely on that formatting - esp. quotes & apostrophes - so it takes a great deal of time to clean it up, and i’m often facing daily deadlines so the clean up impedes my delivery-deadline. i’m attaching a screenshot, as you requested. thx for any help & solution you can provide.

Sometimes it is not easy, and often it is not successful, to fix these things in a script.

Over here, I use TextSoap, which has a service that will accept the selected text, scrub it of non-ASCII characters, etc., and fix quotes.
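
As a rough illustration of the kind of scrubbing described above, here is a minimal Python sketch that swaps common typographic punctuation for plain ASCII. It is not part of either annotation script and it is not how TextSoap works; the function name `clean_quote` and the character list are assumptions, and text from a malformed PDF layer may well need a different mapping.

```python
# Hypothetical sketch: not part of the annotation scripts or of TextSoap.
# Assumption: the "mangled" characters are ordinary typographic punctuation
# that should become plain ASCII. Adjust the map to match what you actually see.
PUNCTUATION_MAP = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # horizontal ellipsis
    "\u2014": "--",   # em dash
    "\u2013": "-",    # en dash
    "\u00a0": " ",    # non-breaking space
}

# str.maketrans accepts a dict of single characters mapped to replacement strings.
_TABLE = str.maketrans(PUNCTUATION_MAP)


def clean_quote(text: str) -> str:
    """Return text with the mapped typographic characters replaced by ASCII."""
    return text.translate(_TABLE)
```

Anything not listed in the map passes through unchanged, so a new kind of breakage means adding another entry by hand.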

is that the approach you’d recommend? or maybe, since the errors seem fairly consistent, it’s possible to first see if they can be fixed in a script, and then fall back on that suggested app. a good approach? thx.

Unless there is a reasonable guarantee the errors or the source of the errors won’t change, this approach can lead to endless patching (which also can lead to things breaking later). Something to be mindful of.

I’ll second the implication of korm’s and BLUEFROG’s comments.

A lot of time and effort would likely be saved by using TextSoap or a similar utility to clean up non-ASCII characters, rather than by trying to do that by scripting.

Appearances are deceiving when it comes to programmatic character fixing. Besides – why ask someone to volunteer to write new code when there’s already software in the market that does what you want?

i see. thx. that’s why i asked. like i said, there seems to be a fairly consistent pattern to the errors, which is why i thought it might be relatively easy to fix at the script level. i tried the textsoap app. i might need to learn a bit more about how to make it work better as it seems a bit clunky; i just need to better understand how to train it to learn the particular, repeated errors and correct them as part of my profile. the point is, w/o a script fix, users would have to: copy & paste the text in text soap, run the app, then copy & paste the corrected text back into the tag - doc. is that right? thx again for the help.

as a relatively new devonthink user, & one who’s just started using this script in this forum, i’ve been wondering the very same thing. any good advice? thx.

My general rule of thumb for databases is to create new databases for broad subject areas. Let that database grow. Split the database when the subject matter material in the database seems to be diverging into other broad subject areas. Prune the database of irrelevant material (e.g., smart groups can help you get rid of aged material).

It’s hard (and valueless) for anyone to give more specific advice because none of us knows your data and what you are planning to do with the documents. I.e., no one can tell you to merge or split databases – you have to know your stuff and figure out the best course of action. Apart from a relatively small amount of time (in comparison to the hours you’ll be using the database), splitting/merging is no big deal.

i see. thx. that’s very helpful. i was thinking of creating sub-categories w/in the “issues” section of the script, just to create a bit more separation and order for subject areas. i understand you don’t know my data / documents. like the jprint714 user, i’m using both for a combination of small projects, and one big one that overlaps in some areas. so, i’m wondering if the creation of sub-categories approach is a good one, or if there’s another org strategy i ought to consider. thx.

Thanks, Korm and Frederiko, these scripts are really useful :smiley: :smiley: