Thanks for the reminder about Nisus: it has matured considerably since I last looked, and I imagine you could use bookmarking and a clever macro to achieve approximately what I want, at least within documents.
To your [very good] question: ‘what does DTPO bring to the table to leverage this kind of work’.
Well. Here’s a scenario in which I think the powers of DT would be tremendously useful, largely because I’m not just interested in the results of internal tagging within one document, but also across documents.
I work with lots of journal articles, Word documents, newspaper clippings, etc. They are all OCR’d. I love being able to tag these, in addition to having them in a formal folder structure in DT2. It means that I have my folder for newspaper cuttings from 1943, and can then dig down using tags to find just those cuttings talking about religion (for example) in 1943. The classify function is great, and so are the word statistics. We’ve all used DT for this kind of document management - it excels here.
But then I run into a problem. At a glance, I cannot see which section of these documents I want. Highlighting the word ‘religion’ in the OCR’d text isn’t enough, because there are scenarios in which I want to be able to annotate/tag/flag a section of the text with a thought, e.g. ‘rhetoric’. There is no point in tagging the whole document ‘rhetoric’, because then I have to go and read it again / open it in a new app to see what I was referring to. In short, I want to be able to see user-‘coded’ tags on portions of the text. And I want to be able to assign multiple, overlapping tags to different parts of the text. [hence ‘highlight’ features are normally out the window]
Moving back to the macro level, the DT power really comes in when you want to dice data. I want DT to be able to generate a document incorporating linked / cited text extracts from other documents. This document would be created according to criteria: e.g. [is in Folder X, with Document Tags Y, Z], but the document would also output the portions of text specifically marked up with (say) the Internal Tag ‘rhetoric’. And it would link/cite. So then I could run this widget and get a new document with all the relevant entries, referenced so I know where they came from initially.
[imagine another scenario - you ‘tag’ telephone numbers you think are important in a series of documents. You then want to create a cheat sheet, but only want to extract telephone numbers from documents matching specific criteria [i.e. not every document you’ve tagged telephone numbers in]. And you want to know which document those numbers came from]
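To make the ‘rhetoric’ scenario concrete, here is a rough sketch in Python of the sort of widget I have in mind. To be clear, none of this is DT's actual API or data model: the Document and Span classes, the mark helper, the extract_sheet function and the sample cuttings are all made up for illustration. It just shows the two-level filter (folder and document tags first, then the internal tag on portions of text), with each extract carrying the name of the document it came from.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """A hypothetical 'internal tag' on a portion of a document's text."""
    start: int
    end: int
    tag: str


@dataclass
class Document:
    """A hypothetical document record: folder, document-level tags, text."""
    name: str
    folder: str
    tags: set
    text: str
    spans: list = field(default_factory=list)


def mark(doc, phrase, tag):
    """Attach an internal tag to the first occurrence of phrase in doc."""
    start = doc.text.index(phrase)
    doc.spans.append(Span(start, start + len(phrase), tag))


def extract_sheet(documents, folder, required_tags, internal_tag):
    """Return (document name, extract) pairs for every span carrying
    internal_tag, taken only from documents that sit in folder and
    carry all of required_tags."""
    rows = []
    for doc in documents:
        if doc.folder != folder:
            continue
        if not set(required_tags) <= doc.tags:
            continue
        for span in doc.spans:
            if span.tag == internal_tag:
                rows.append((doc.name, doc.text[span.start:span.end]))
    return rows


# Invented sample data, purely for illustration.
a = Document("Times 1943-03-02", "Cuttings 1943", {"religion"},
             "The bishop spoke at length. His appeal to duty was pure theatre.")
mark(a, "His appeal to duty", "rhetoric")
mark(a, "appeal to duty was pure theatre", "tone")  # overlaps the span above

b = Document("Herald 1943-06-11", "Cuttings 1943", {"sport"},
             "A rousing speech before the match.")
mark(b, "A rousing speech", "rhetoric")

c = Document("Times 1951-01-09", "Cuttings 1951", {"religion"},
             "The sermon leaned on familiar phrases.")
mark(c, "familiar phrases", "rhetoric")

sheet = extract_sheet([a, b, c], "Cuttings 1943", {"religion"}, "rhetoric")
for name, extract in sheet:
    print(f'"{extract}" [from {name}]')
```

Because the spans are stored separately from the text, overlapping tags are no problem, and the same archive can be re-cut with different criteria at any time; swapping ‘rhetoric’ for a ‘telephone’ tag would give the cheat sheet described above.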
It is the ability to carve up an archive in this way that could make internal tagging powerful. Of course, I could achieve the above outside DT by putting everything into one long document, and doing the normal keywording/coding etc. But then I’ve destroyed the data - it’s no longer flexible, and I can’t suddenly decide that my new auto-generated document should only show keyworded extracts which come from documents tagged ‘religion’ and ‘science’.
So that was a very long way of saying that lots of people work with large DT databases of information, and the ability to generate quick-summary sheets of extracts, linked back to the original, according to fine-grained criteria, would be really really cool, and so useful. (I think!)