New to DT3. So far so good. Couple of workflow questions

Hello.
So I’m test-driving DT3 and Evernote, and so far I like DT3 much better.
I’m very impressed by its OCR tech! I threw some (badly taken) phone shots of a music book and PDFs with screenshots of text at it, and it detected it all perfectly. It even recognised some of the complex Figured Bass symbols from the pic of the old book! So now all those docs will be searchable and annotatable :grinning:

There is just something I haven’t been able to figure out about webarchive annotations and searching…
– It seems it doesn’t have the powerful annotation tools like it has for PDFs… All I see is a way to highlight text via a contextual menu or via the bar, but no boxes or circles/ovals that I can then search for.
Where are they?

– If I highlight some text in a webarchive in different sections of the document, HOW can I see just the highlighted or “annotated” parts (even coloured text I add to the webarchive as notes myself)? I discovered DT3 can do that with PDFs, but I don’t see a way to do it with webarchives or regular text. The annotations tab for webarchives only shows images of the document, but none of my highlighting, annotations or added text.
Can you please point me in the right direction?

By the way, converting the webarchive to PDF is NOT an option for me, since most times they contain music, MIDI, video and I don’t want to lose that, of course.

Do you have any suggestions? What is this newbie missing?

Thank you for your help!

Fernando

DEVONthink doesn’t use proprietary file formats and therefore only PDF documents support these advanced annotation tools.

Both Tools > Inspectors > Documents > Annotations and Tools > Summarize Highlights currently support only highlighted text in rich text documents and PDF documents.

Thank you for your answer, cgrunenberg.

Pity about not being able to equally annotate webarchives with the same tools, and that Summarise Highlights support is only for rich text and PDFs.

Are there any plans to make those two points a feature in the near future?
It seems to be the last missing link for an awesome app like DT3. ESPECIALLY since we do lots of research on the internet these days, not having this option seems a bit limiting. I have hundreds of webarchives I wish I could treat exactly as PDFs (and as mentioned upthread, converting the webarchives to PDF won’t work for me).

In the meantime, does anyone have a workflow suggestion on how I could gather/summarize/see my webarchive annotations by topic? (I use Diigo for annotating a webpage and then convert it to webarchive, but this has its own set of issues too.)

Any ideas are appreciated. Thank you.

Webarchives are not technologically as transparent to inspection. Have you considered capturing as HTML instead?

Thank you, BLUEFROG.

I will certainly try it out with HTML instead, as you suggest. So could I do what I want with HTMLs?

I understand that HTML does NOT capture images, videos, music or MIDI, but it “loads” them from the internet every time I open the document, so it’s never really captured. Is this correct?

I’ll definitely try your suggestion, but it worries me that those files wouldn’t be available anymore if the website disappears or ceases to exist (the whole point of webarchives is to avoid this)…

So could I do what I want with HTMLs?

Possibly with some scripting. However, there’s nothing built-in at the moment.
HTML is just more immediately parseable to scripters.
Do note Development could also weigh in on this.
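To illustrate what "more immediately parseable" could mean in practice, here is a minimal Python sketch that pulls highlighted passages out of an HTML file. It is only a sketch under assumptions: it treats `<mark>` elements and `<span>` elements with a background style as highlights, which may not match how DEVONthink (or a tool like Diigo) actually stores them, and the names `HighlightCollector` and `extract_highlights` are made up for this example.

```python
from html.parser import HTMLParser


def _is_highlight(tag, attrs):
    # ASSUMPTION: highlights are <mark> elements or <span> elements whose
    # inline style sets a background. Adjust to the markup you really have.
    if tag == "mark":
        return True
    style = dict(attrs).get("style") or ""
    return tag == "span" and "background" in style.lower()


class HighlightCollector(HTMLParser):
    """Collect the text found inside highlight elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.highlights = []  # finished highlight strings
        self._tag = None      # tag name of the highlight being read
        self._depth = 0       # nesting depth of that tag name
        self._buffer = []     # text pieces of the current highlight

    def handle_starttag(self, tag, attrs):
        if self._tag is None:
            if _is_highlight(tag, attrs):
                self._tag, self._depth, self._buffer = tag, 1, []
        elif tag == self._tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._tag is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                # Collapse whitespace and store the finished highlight.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.highlights.append(text)
                self._tag = None

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)


def extract_highlights(html_text):
    """Return the highlighted passages of an HTML string, in document order."""
    collector = HighlightCollector()
    collector.feed(html_text)
    collector.close()
    return collector.highlights
```

Reading a saved HTML file and calling `extract_highlights` on its contents gives a plain list of strings that could be pasted into a note or grouped by topic. Nothing here talks to DEVONthink itself.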

I understand that HTML does NOT capture images, videos, music or MIDI, but it “loads” them from the internet every time I open the document, so it’s never really captured. Is this correct?

Actually, that could easily be true of webarchives as well, especially with the way content is delivered on the Internet nowadays.

(the whole point of webarchives is to avoid this)…

Webarchives are not a guarantee of this.
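One way to check what a particular webarchive really contains: a Safari-style `.webarchive` is a binary property list, so Python's standard `plistlib` can read it. The sketch below lists the main resource and each subresource with its MIME type and size. The key names (`WebMainResource`, `WebSubresources`, `WebResourceURL`, `WebResourceMIMEType`, `WebResourceData`) are the ones commonly seen in these files, but treat them as an assumption and verify against your own archives. The function names are hypothetical, and nested frames (`WebSubframeArchives`) are not walked.

```python
import plistlib


def load_webarchive(path):
    """Read a .webarchive file; plistlib detects the binary plist format itself."""
    with open(path, "rb") as fh:
        return plistlib.load(fh)


def list_embedded_resources(archive):
    """Return (url, mime_type, size_in_bytes) for every resource stored in the archive.

    ASSUMPTION: the archive uses the usual WebMainResource / WebSubresources layout.
    """
    resources = [archive.get("WebMainResource")] + list(archive.get("WebSubresources", []))
    rows = []
    for res in resources:
        if not res:
            continue  # no main resource present
        data = res.get("WebResourceData", b"")
        rows.append((
            res.get("WebResourceURL", ""),
            res.get("WebResourceMIMEType", ""),
            len(data),
        ))
    return rows
```

If a MIDI or audio URL from the page does not show up in that list, it was presumably not stored in the archive and would have to be fetched from the network again, which would fit the "Reload" behaviour described in this thread.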

Do you have a URL to test?

Sure. Even a very simple website like this, with MIDI, audio and images:

Once it’s inside DT3, there is no difference in how HTML and webarchive look. For the former, it loads the content each time; for the latter, it does NOT play the music examples inside DT3… It ONLY does so if I select from the contextual menu “Reload”. THEN it plays the music fine, ONLY one time… once I select another document, DT3 can’t play the file and I must reload again.

Thank you for telling me that Webarchives are not the panacea either… I thought it was a guarantee that you’d always get ALL the media files in the archive.

So are the HTML annotation features the same as PDF? I found the annotation tab for webarchives, but it seems it only lets me add ONE note for the whole article. Is this so? Thanks!

I found the annotation tab for webarchives, but it seems it only lets me add ONE note for the whole article. Is this so?

Tools > Inspectors > Annotations & Reminders is not the same as highlighting (or annotating PDFs).
It is for creating Annotation files, i.e. separate files for notes regarding a particular file. This is covered in Help > Documentation > Inspectors > Annotations & Reminders and is often discussed in the forums too.

You may use an RTF annotation (see the Annotations pane). When you copy a working link in your webarchive, you can parse it into this RTF annotation with the help of a simple script. The result is a structured summary annotation text with wiki-style links to the webarchive content. You may use this summary separately, create replicants and so on.

IMO this is handier than drawing annotations and trying to summarize them afterwards.