QDA and DevonThink

mlevison · November 17, 2021, 3:57pm

I use DevonThink to maintain a small personal research archive of topics related to software development teams. Since I started a few months ago, I’ve actually read a few of the papers I’ve had on my SSD for several years. So one win for DT.

On twitter yesterday I came across the suggestion that researchers now use Qualitative Data Analysis Applications to analyze a large body of papers and find fragments that are interesting and then read further. Ok this seems cool, especially since I will always find research faster than I can read it.

The challenge is that I want to do this beside/in parallel with DT. I want to analyze a number of papers I have tagged for: Effective Teams or Leadership.

Has anyone played with something like this? What toolset did you use?

Thanks
Mark

rkaplan · November 17, 2021, 6:42pm

MAXQDA, NVivo, and Atlas.ti are the major competitors in the QDA marketplace.

I tried them all a while back and found each to be impressive in its own way, but none really fit my particular workflow. They can clearly be very helpful if you author content - particularly if you write review articles or scoping reviews or historiography reviews or other types of writing where you survey a large field of knowledge and comment on the pros/cons of others’ work. I may have such a need in the future but not at present, so it did not fit my goals at this time. But YMMV, and if that is the sort of work you do then they can be really helpful.

mbbntu · November 17, 2021, 7:16pm

I have a background in psychology, so I’m used to the idea that NVivo and the like are used for Discourse Analysis. Basically, that means combing through a text in very fine detail, and depending on what you are trying to do, you might identify themes, patterns, etc. in the text, using the software to tag passages, phrases or words that are relevant to your study. It is very time-consuming, laborious, and usually seems to mean reading and re-reading the same text over and over again because it is easy to miss important material on one reading. I get the impression that this is radically different from the kind of use you have in mind. As to whether you could “bend” the software to what you want to do, I don’t know. I haven’t used such software very much, because I found it much easier to code (i.e. put markers and tags in the text) by hand on printed sheets. I guess the best thing to do would be to try one of the programs. I seem to recall they are fairly expensive.

rkaplan · November 17, 2021, 7:41pm

They each have a free trial period, so you can see if they work for your use.

mlevison · November 17, 2021, 9:25pm

Background: I was on Twitter and I saw this:

PSA: If you are doing your analytical work like social research, reviewing papers, or adjudicating proposals by means of underlining or highlighting and adding comments in MS Word, or (egads!) on paper, you could REALLY gain from using QDA software

From: https://twitter.com/mloxton/status/1460622846108045312

Of course I asked more and discovered a rabbit hole, and so I wondered whether this rabbit hole would take me to a place we could call DevonThink++++++, where the computer helps me find the most interesting material to dig into. Currently I have a research archive of 400-500 items. I will never be able to read all the papers in there. So I was intrigued by something that might help me find the gems.

I admit that Discourse Analysis sounds painful and like the opposite of what I want.

rkaplan · November 17, 2021, 10:20pm

I forget the details of which software has which feature - but the differences among NVivo, MAXQDA, and Atlas.ti are notable. At least one of them lets you set keywords which then automatically tag occurrences in text; the others require that you manually tag each occurrence.

More than with most genres of software, it seemed to me that which one you select makes a huge difference.

Another distinction is that one of them makes it easy to capture items from the web and the others don’t.

In the end I deferred this project not so much because I didn’t feel this software would be helpful to me but more because I realized that I really would have to study the interfaces more in depth before deciding which one - it wasn’t like most other software choices where I could identify the clear winner and then gradually learn it over time.

dario · November 17, 2021, 10:33pm

MAXQDA can do automatic coding (tagging) based on keywords. I don’t know about the others (NVivo had a bit of a steep learning curve for me initially so I gave up, and the Mac version is not at feature parity with the Windows version, so that was a downside), but I spent quite some time last year in MAXQDA working on my master’s, doing a content analysis on a fairly large dataset of some 1300 pieces of media coverage.

Autocoding example in MAXQDA:

https://www.maxqda.com/blogpost/research-diary-how-to-use-autocoding-to-enhance-your-analysis

https://www.maxqda.com/help-mx20/lexical-search/autocoding-search-results

mbbntu · November 17, 2021, 10:41pm

Maybe NVivo has acquired automatic tagging since I last looked at it. But in any case, it seems to me that the problem with trying to use AI to help find stuff is that it will usually (unless things have improved a lot) only find literal occurrences of words or phrases. When I was doing some textual analysis a decade ago, I can well remember reading texts in which it was clear that the main theme was deportment, but that word did not appear anywhere in the text, nor did anything like it. It took a human brain (or something like it in my case) to see what the theme was.

DEVONthink’s “See also” is very useful, but it is not capable of analysing a text by Freud and working out that it is relevant to some aspect of Adler’s work, as well as something written by Lacan. Unfortunately, we have to do that.

mlevison · November 18, 2021, 12:55am

OK, thanks to all. I think you have saved me from wasting a lot of time down this rabbit hole.

dario · November 22, 2021, 8:33pm

I am not sure whether this is entirely relevant to your question, but I used MAXQDA and DEVONthink while working on my master’s dissertation last year, so I have some experience with both; however, I have used them for different purposes. These are two quite different beasts!

I’ve been meaning to post this sooner but have been quite busy these days. You are probably better off with DEVONthink, but perhaps this may someday be helpful to someone else as well, so here’s my take on it.

A bit of a background as I feel understanding the use case is important: I did a content analysis (CA) on about 1300 pieces of media coverage of a particular project, spanning almost 20 years. The idea was to find the most salient themes with regard to social, economic and political aspects and other issues surrounding the project and to identify the most prominent actors who were actively promoting or discussing them (it was a case study in communications and public relations). The challenge was that I was faced with many documents to read and go through.

As mentioned already in this thread, QDA software is great in that you tag (code) themes, patterns, etc. I developed an extensive coding system (both inductive and deductive coding were used, i.e. I could predict some codes and themes in advance, but the majority of them surfaced while actually coding/tagging). I ended up with some 330 codes/tags (to identify themes, patterns, and actors) and assigned them almost 9000 times in total across the dataset.

Coding/tagging is manual work; there is autocoding in MAXQDA which may help a bit though if you know you are looking for exact keyword matches (see my post above) but as I was looking for themes across a large dataset, not necessarily using the same keywords every single time (this was media coverage after all so people use different words to express the same ideas or thoughts), I did not use it much. If you are certain of exact matches in documents, it could be helpful as a feature.
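To make the exact-match limitation concrete, here is a minimal Python sketch of keyword-based autocoding. It is not MAXQDA's implementation or API; the `CODEBOOK` contents, the `autocode` function, and the sample sentence are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical code system: code name -> keywords that trigger it.
CODEBOOK = {
    "economy": ["jobs", "investment", "budget"],
    "environment": ["pollution", "emissions"],
}

def autocode(text, codebook):
    """Return {code: [(start, end, matched_text), ...]} for whole-word keyword hits."""
    hits = defaultdict(list)
    for code, keywords in codebook.items():
        # \b restricts matches to whole words, so "jobs" does not match "jobseeker".
        alternatives = "|".join(re.escape(k) for k in keywords)
        pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            hits[code].append((m.start(), m.end(), m.group(0)))
    return dict(hits)

sample = "The project promised jobs and investment, but critics warned about emissions."
coded = autocode(sample, CODEBOOK)
```

A sentence such as "Employment fell sharply." is clearly about the economy but gets no code here, because none of the listed keywords appear in it. That is the gap manual coding fills.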

[Screenshot: MAXQDA with tags (codes) displayed in a document]

MAXQDA is then great in identifying overlapping tags (themes), which was a critical aspect for me as I needed to find out who said what, when, and to whom publicly, so I used MAXQDA to look for overlapping tags representing themes and tags representing one or more actors.

MAXQDA is also great at doing various queries to discover correlations between tags, and there are several helpful ways to visualise relationships between tags (codes), for example, to see how the codes relate to each other and how “close” they are with regard to how often they appear together.

If you need to identify themes that often show up together, to look for cases where a number of tags are mentioned and some others are not, or to actually look for cases where tags are mentioned within some defined distance (proximity), this is an extremely powerful tool for content analysis (or critical discourse analysis, as mentioned earlier in this thread already).
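The overlap and proximity queries described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how MAXQDA computes it; the segment format (document, code, start offset, end offset), the code names, and the `max_gap` parameter are assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded segments: (document, code, start_offset, end_offset)
SEGMENTS = [
    ("doc1", "actor:mayor", 0, 120),
    ("doc1", "theme:cost", 80, 200),
    ("doc1", "theme:jobs", 400, 450),
    ("doc2", "actor:mayor", 10, 60),
    ("doc2", "theme:jobs", 50, 90),
]

def gap(a, b):
    """Characters between two segments; 0 if they overlap or touch."""
    return max(0, max(a[2], b[2]) - min(a[3], b[3]))

def cooccurrences(segments, max_gap=0):
    """Count pairs of different codes within max_gap characters in the same document."""
    counts = Counter()
    for a, b in combinations(segments, 2):
        if a[0] == b[0] and a[1] != b[1] and gap(a, b) <= max_gap:
            counts[tuple(sorted((a[1], b[1])))] += 1
    return counts

overlapping = cooccurrences(SEGMENTS)          # segments that overlap or touch
nearby = cooccurrences(SEGMENTS, max_gap=250)  # proximity query
```

With `max_gap=0` only overlapping (or directly adjacent) segments count, which corresponds to the "who said what" kind of question; raising `max_gap` turns the same function into a proximity query.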

It is also easy to narrow down searches and analyses as you can only work on parts of a dataset (in MAXQDA terminology, you can ‘activate’ only a subset of documents matching a particular query (documents matching specific criteria, including codes/tags etc.) and then you work further only on that subset). MAXQDA also does various reports on tags (codes) relationships which you can export to a spreadsheet and then, for example, import and visualise in Tableau (which I also did). It also has some powerful visual tools of its own (comparing code frequencies, code relations, word clouds, code maps, document comparison charts, etc., see link below). It lets you add metadata to documents which it calls variables (I have e.g. used variables for each document to identify the source, type of media and the date of publication) which can be useful later for further filtering of a subset of documents and running queries.
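As a rough picture of what activating a subset by variables and exporting a code report amounts to, here is a small Python sketch. The document records, variable names, and both functions are hypothetical; MAXQDA's own export format will differ.

```python
import csv
import io
from collections import Counter

# Hypothetical documents with variables (metadata) and assigned codes.
DOCS = [
    {"name": "a.pdf", "source": "Daily News", "media": "print", "year": 2005,
     "codes": ["cost", "jobs"]},
    {"name": "b.pdf", "source": "City TV", "media": "tv", "year": 2012,
     "codes": ["jobs"]},
    {"name": "c.pdf", "source": "Daily News", "media": "print", "year": 2015,
     "codes": ["cost", "cost", "delay"]},
]

def activate(docs, **criteria):
    """Keep documents whose variables equal every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]

def code_frequencies_csv(docs):
    """Return CSV text with one row per code and its count across docs."""
    counts = Counter(code for d in docs for code in d["codes"])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["code", "count"])
    for code, n in sorted(counts.items()):
        writer.writerow([code, n])
    return out.getvalue()

print_docs = activate(DOCS, media="print")
report = code_frequencies_csv(print_docs)
```

The resulting CSV text can be written to a file and opened in a spreadsheet or a visualisation tool.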

I don’t think you can do this sort of ‘low-level’ coding, where you may code/tag anything, from only a single word to whole paragraphs or more, in DEVONthink. I wasn’t interested in concordance that much; DT can obviously do concordance, but for what I did, I couldn’t infer much from concordance alone. All the QDA software does various lexical analyses, concordance, etc. MAXQDA is also great if you need to analyse and tag audio or video, e.g. if you have interviews you need to work with, as it remembers timecodes etc.

MAXQDA can do similarity analysis between documents and filter out similar documents but this is based on your work, i.e. on the assigned tags/codes and other data, and not on some AI magic like in DEVONthink so here you are probably better off with DEVONthink if you need the software to suggest next documents to read based on text similarity with DT’s AI magic happening in the background.
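The difference is easier to see with a toy example. The Python sketch below ranks documents by how many assigned codes they share, using Jaccard overlap as an assumed measure; it is not necessarily the measure MAXQDA uses, and the document names and codes are invented. Note that an uncoded document can never look similar to anything, whereas a text-based approach needs no prior tagging.

```python
def jaccard(a, b):
    """Share of codes two documents have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical documents and the codes assigned to each.
CODED = {
    "doc1": {"cost", "jobs", "mayor"},
    "doc2": {"cost", "jobs"},
    "doc3": {"environment"},
}

def most_similar(name, coded):
    """Rank the other documents by code overlap with `name`, best first."""
    others = [
        (jaccard(coded[name], codes), other)
        for other, codes in coded.items()
        if other != name
    ]
    return sorted(others, reverse=True)

ranking = most_similar("doc1", CODED)
```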

There are also good resources and tips on using MAXQDA for literature reviews and reference management on their website which seems like something you may want to do. Having used it to analyse documents, I can certainly see the appeal of using it for literature reviews. I am not sure that I would choose it over DEVONthink + Zotero or Bookends for this particular task as DT is (much) better at working with PDFs, annotations etc., but MAXQDA does have a few advantages for that use case where I see how it could be helpful if you are faced with a large set of texts.

Much like DEVONthink, MAXQDA can either import data (such as PDFs) into its database or it can keep files externally. I learned that keeping them externally has the benefit of much faster operation than importing close to 2 GB of PDFs into MAXQDA (I found MAXQDA to be somewhat slow when faced with large database size, and my then-Intel MBP with 8 GB of RAM was not handling that particularly well; once it was relying on external files it worked well even for very complex queries and searches as it actually just needs the external files to be able to display them and uses its own database for everything else).

Keeping the files externally also had the benefit that I could index that same folder in DEVONthink. DT’s search was helpful when I needed to find a document quickly when I wanted to check something or couldn’t remember where something was. I also found DEVONthink to be helpful in identifying similar documents to help me get a general feel of this quite large corpus of texts before even starting work on coding. MAXQDA lets you do memos (notes) attached to the documents but they are kept separately in MAXQDA’s database file as is the coding (MAXQDA does not touch the PDFs themselves) so you are free to keep using DEVONthink for annotations on PDFs if you prefer and you can essentially work on the same dataset in both applications. Just to be on the safe side, I created duplicates in DEVONthink for the key documents I wanted to annotate there as I did not want to risk ‘confusing’ MAXQDA with the ever-changing PDF documents.

Apart from managing my readings and bibliography, I used DEVONthink for cleaning up the dataset and for the initial triage. I initially had well over 7000 documents, as I had collected media pieces from several databases, so there were many duplicates (essentially a complete mess that I thought would take me weeks to clean up). After some manual sorting, DEVONthink helped here, as it identified some of the duplicates where the text was nearly identical from the outset, and I could use search to find other duplicates based on article titles or other content. In the end, I was done with cleaning that up in less than a week.
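For anyone who wants to do a similar clean-up with a script, here is a minimal Python sketch of near-duplicate detection using the standard library's `difflib`. It is not how DEVONthink detects duplicates; the file names, sample texts, and the 0.9 threshold are assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and drop punctuation/extra whitespace so formatting differences are ignored."""
    return " ".join(re.findall(r"\w+", text.lower()))

def near_duplicates(docs, threshold=0.9):
    """Return pairs of names whose normalized texts are at least `threshold` similar."""
    names = sorted(docs)
    norm = {n: normalize(docs[n]) for n in names}
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if SequenceMatcher(None, norm[a], norm[b]).ratio() >= threshold:
                pairs.append((a, b))
    return pairs

# Hypothetical articles collected from two databases.
ARTICLES = {
    "db1_001.txt": "City council approves the new bridge project after long debate.",
    "db2_417.txt": "City Council approves the new bridge project, after long debate",
    "db1_002.txt": "Local team wins the regional football championship.",
}
dupes = near_duplicates(ARTICLES)
```

Comparing every pair is slow for thousands of long documents, so in practice it helps to group exact matches first (for example by hashing the normalized text) and run the fuzzy comparison only on what remains.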

The issue is that QDA software is generally quite expensive unless you are able to get a student discount. A MAXQDA Pro license valid for two years, with the full set of features including all analytics and visualisations, is actually quite affordable with a student discount.

NVivo is similarly priced and has similar features but had a somewhat steeper learning curve for me (including somewhat different terminology, which complicated things for me as a first-time user of QDA software). For some reason, I liked MAXQDA better from the outset; it just felt like I was able to jump into doing my research more quickly, but these may have been entirely personal preferences. The other issue with NVivo was the discrepancy in features between the Windows and Mac versions (the Mac version lags a bit behind, and I did not want to risk any compatibility issues in case my supervisor wanted to have a look at my research database at some point). There has been a new release of NVivo in the meantime, so the versions may be closer to feature parity now; I don’t know whether this is the case, as my student license did not cover the upgraded version.

The other issue I potentially see with QDA software (but based on just this one large project) is that it is not generally as extensible as DEVONthink (in terms of supporting scripting etc.) so automating anything relying on external applications and integrating stuff seems difficult — though I may be wrong here as I did not try to automate anything but it does not seem to support scripting.

MAXQDA has an excellent online manual with lots of screenshots which may help you to get a quick insight into what it can do so that may help you to judge whether it would be useful in your case.

Apologies if this long post is not entirely relevant or applicable to your use case but it may perhaps prove useful for someone using or considering using MAXQDA and DEVONthink on a similar project in the future.

MAXQDA Visual Tools:

MAXQDA Manual:

mlevison · December 2, 2021, 6:22pm

Wow - that helps me see how these would be useful, and why they aren’t useful for me.