Coding & Qualitative analysis

mcc · May 14, 2009, 7:19pm

DevonThink’s tagging, classifying etc. features are first rate, but tags can only be applied to entire documents, not to text selections within them. It would surely be a highly desirable feature to be able to tag specific sections of text within documents, and then view tags on a per-document and per-database basis. [I hope I’ve been clear here]. This basic ‘coding’ functionality is almost entirely lacking for Mac at the moment (exceptions being HyperRESEARCH and TAMS), with most qualitative analysis tools being a) restricted to PC and b) hugely overengineered. The ability to tag within documents, like an intelligent highlighter, could significantly add to DevonThink’s appeal (at least to me), and I presume to other people working with qualitative data.

Does anyone else agree? Would this be easy to implement?

mcc · May 14, 2009, 7:24pm

This thread is a rather clearer attempt to say what I wanted

viewtopic.php?f=7&t=7202&p=33591&hilit=analysis#p33591

cgrunenberg · May 15, 2009, 8:46am

There were a few requests for QDA years ago, but it has been quiet lately. We therefore have no such plans at the moment.

mcc · May 16, 2009, 10:40am

Thanks for the response. I’m not sure, however, that I entirely agree with the presumption behind such a decision.

Presumably, the features which appeared in DevonThink 0.1 weren’t things people had requested. Presumably there are tools, such as the ability to tag pieces of text within documents, whose obvious advantages are only realised when people have the opportunity to use them in action. Presumably this is what innovation is about - providing a way of doing things that was previously not available / thought about, and seeing users take on the ideas. Hence we have a DT scenario usage forum…

To be clear, I’m not asking for full-blown qualitative analysis software - I’m asking for the ability to tag certain sections of documents. I think this is a powerful and useful tool with many applications - do other forum users imagine they could use this kind of thing?

Thanks.

cturner · May 16, 2009, 12:17pm

Hi mcc-

I had a quick look at TAMS, a very interesting program. I probably will spend some more time working with it because there are some good ideas I could adapt to my work.

To crudely boil down its features, it’s really a tag-based editor with some specialized ways of displaying the results of Regular Expression searches. As such, the semantics of a TAMS tag are very different from what DTPO is implementing.
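For anyone who wants to see what "tags plus regular expression searches" amounts to in practice, here is a minimal Python sketch. The brace notation `{tag}...{/tag}` is an assumption chosen for illustration and is not taken from the TAMS documentation; `find_tagged` and `sample` are hypothetical names.

```python
import re

# Assumed inline notation: {tag}passage{/tag}. Illustrative only.
TAG_SPAN = re.compile(
    r"\{(?P<tag>[A-Za-z_]\w*)\}(?P<body>.*?)\{/(?P=tag)\}",
    re.DOTALL,
)

def find_tagged(text):
    """Return (tag, passage) pairs for each tagged span found left to right.

    Spans nested inside an already matched span are not reported separately.
    """
    return [
        (m.group("tag"), m.group("body").strip())
        for m in TAG_SPAN.finditer(text)
    ]

sample = (
    "Intro. {rhetoric}We shall never yield.{/rhetoric} "
    "Then {religion}a sermon{/religion}."
)

for tag, passage in find_tagged(sample):
    print(f"{tag}: {passage}")
```

Because the tags live in the text itself, any editor or search tool with regular expression support can find them, which is the point made above about not needing a dedicated application.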

I’m not sure why you couldn’t index TAMS files in DTPO, or use Nisus’ macro and style capabilities, or for that matter TextMate or a WYSIWYG HTML editor, to produce the type of tagged files you want in DTPO?

The real question for you is: “what does DTPO bring to the table to leverage this kind of work?” I can’t see that it brings much, really.

Devon could enhance their RTF editor to provide these features, and enhance/augment their search to provide more Regexp, but is this anything more than creating MS Word-like bloat in their fine application?

Best, Charles

mcc · May 16, 2009, 3:36pm

Thanks for the reminder about Nisus: it has matured considerably since I last looked, and I imagine you could use bookmarking and a clever macro to achieve approximately what I want, at least within documents.

To your [very good] question: ‘what does DTPO bring to the table to leverage this kind of work’.

Well. Here’s a scenario in which I think the powers of DT would be tremendously useful, largely because I’m not just interested in the results of internal tagging within one document, but also across documents.

I work with lots of journal articles, Word documents, newspaper clippings, etc. etc. They are all OCR’d. I love being able to tag these, in addition to having them in a formal folder structure in DT2. It means that I have my folder for newspaper cuttings from 1943, and can then dig down using tags to find just those cuttings talking about religion (for example) in 1943. The classify function is great, and so are the word statistics. We’ve all used DT for this kind of document management - it excels here.

But then I run into a problem. At a glance, I cannot see which section of these documents I want. Highlighting the word ‘religion’ in the OCR’d text isn’t enough, because there are scenarios in which I want to be able to annotate/tag/flag a section of the text with a thought, e.g. ‘rhetoric’. There is no point in tagging the whole document ‘rhetoric’, because then I have to go and read it again / open it in a new app to see what I was referring to. In short, I want to be able to see user-‘coded’ tags on portions of the text. And I want to be able to assign multiple, overlapping tags to different parts of the text. [hence ‘highlight’ features are normally out the window]
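One way to picture multiple, overlapping tags is to keep them outside the text as character ranges rather than as inline markup. The Python sketch below is only an illustration of that idea, not a description of anything DT does; `Span`, `tags_at` and `passages` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    start: int   # index of the first tagged character
    end: int     # index just past the last tagged character
    tag: str

def tags_at(spans, position):
    """All tags covering a character position; overlaps are allowed."""
    return sorted({s.tag for s in spans if s.start <= position < s.end})

def passages(text, spans, tag):
    """The pieces of text carrying a given tag, in the order stored."""
    return [text[s.start:s.end] for s in spans if s.tag == tag]

text = "The bishop spoke of duty and of faith."
spoke = text.index("spoke")
faith = text.index("faith")
spans = [
    Span(spoke, len(text), "rhetoric"),           # a long passage
    Span(faith, faith + len("faith"), "religion"),  # a word inside it
]

print(tags_at(spans, faith))               # both tags apply to this word
print(passages(text, spans, "religion"))
```

Since the ranges are stored separately, two tags can cover the same words without interfering with each other, which is what a single-colour highlighter cannot do.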

Moving back to the macro level, the DT power really comes in when you want to dice data. I want DT to be able to generate a document incorporating linked / cited text extracts from other documents. This document would be created according to criteria: e.g. [is in Folder X, with Document Tags Y, Z], but the document would also output the portions of text specifically marked up with (say) the Internal Tag ‘rhetoric’. And it would link/cite. So then I could run this widget, get a new document with all the relevant entries, referenced so I knew where they came from initially.
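The ‘widget’ described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not DT’s data model or API: `Doc`, `extract_report` and the sample records are all hypothetical, and internal tags are stored as (start, end, tag) character ranges.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    name: str
    folder: str
    tags: set                                   # document-level tags
    text: str
    spans: list = field(default_factory=list)   # (start, end, internal_tag)

def extract_report(docs, folder, required_tags, internal_tag):
    """Collect cited extracts from documents that match the criteria."""
    lines = []
    for doc in docs:
        # Criteria: is in Folder X and carries all of Document Tags Y, Z.
        if doc.folder != folder or not required_tags <= doc.tags:
            continue
        for start, end, tag in doc.spans:
            if tag == internal_tag:
                extract = doc.text[start:end]
                lines.append(f'"{extract}" [{doc.name}, chars {start}-{end}]')
    return lines

docs = [
    Doc("clipping-01", "1943", {"religion", "science"},
        "The sermon appealed to duty.", [(4, 10, "rhetoric")]),
    Doc("clipping-02", "1943", {"religion"},
        "A plain report.", [(2, 7, "rhetoric")]),
]

report = extract_report(docs, "1943", {"religion", "science"}, "rhetoric")
print("\n".join(report))
```

Changing the criteria and re-running gives a different sheet from the same untouched archive, and each extract keeps a pointer back to its source document.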

[imagine another scenario - you ‘tag’ telephone numbers you think are important in a series of documents. You then want to create a cheat sheet, but only want to extract telephone numbers from documents matching specific criteria [i.e. not every document you’ve tagged telephone numbers in]. And you want to know which document those numbers came from]

It would be the ability to carve up an archive in this way which could make internal tagging powerful. Of course, I could achieve the above outside DT by putting everything into one long document, and doing the normal keywording/coding etc. But then I’ve destroyed the data - it’s no longer flexible, and I can’t suddenly decide I only want to see in my new auto-generated document keyworded extracts which come from documents tagged ‘religion’ and ‘science’.

So that was a very long way of saying that lots of people work with large DT databases of information, and the ability to generate quick-summary sheets of extracts, linked back to the original, according to fine-grained criteria, would be really really cool, and so useful. (I think!)

cturner · May 17, 2009, 12:46pm

Interesting. I have a 5000 item newspaper database that I’m working with. My approach has been to highlight (excerpt) the articles in Skim and import Skim text files plus the DTPO database link into Tinderbox. I have the important parts of the newspapers, and the original is only a click away via Tbox’s URL link. I can always go back, highlight some more, and paste into my Tbox note.

I was originally hesitant to move out of DTPO, but after I had a few hundred items in Tbox, I realized the logical separation was really an advantage.

Keep in mind that Skim, when you highlight a passage, will put the underlying text in a note, but you can then edit the note any way you’d like. Highlights can overlap, but it’s not easy (even possible?) to change colors.

Also, I’d think that macros plus styling in Nisus would enable you to do a list of colored tags a la TAMS.

I’ve also been doing this via Keyboard Maestro for Tbox, and your pointer to TAMS was very enlightening for me. I could bank quite a few tags in its macro window.

The biggest issue for DTPO with implementing a TAMS-like process is its (understandable) lack of full Regexp support. But I think you could use Nisus and its “search files in project” feature to either search for tags in an external database that DTPO has also indexed, or perhaps even the RTF folder inside a DTPO database. Maybe even an AppleScript to put the result in a DTPO sheet.
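As a rough stand-in for the search-then-sheet idea, the Python sketch below walks a folder of plain-text files, finds tagged passages with a regular expression, and produces tab-separated rows that could be pasted into a sheet. It makes several assumptions: the files are plain text rather than RTF, the tags use an illustrative `{tag}...{/tag}` notation, and `tag_rows` and `to_tsv` are hypothetical helpers, not part of Nisus or DTPO.

```python
import re
from pathlib import Path

# Assumed inline notation: {tag}passage{/tag}. Illustrative only.
TAG = re.compile(r"\{(\w+)\}(.*?)\{/\1\}", re.DOTALL)

def tag_rows(folder, suffix=".txt"):
    """Return (file name, tag, passage) rows for every tagged span found."""
    rows = []
    for path in sorted(Path(folder).rglob(f"*{suffix}")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for m in TAG.finditer(text):
            # Collapse line breaks so each passage fits on one sheet row.
            passage = " ".join(m.group(2).split())
            rows.append((path.name, m.group(1), passage))
    return rows

def to_tsv(rows):
    """Join rows into tab-separated text, one row per line."""
    return "\n".join("\t".join(row) for row in rows)
```

Calling `to_tsv(tag_rows("/path/to/indexed/folder"))` would give one line per tagged passage; the folder path here is a placeholder.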

Best, Charles