DEVONthink for academic research - ADVICE

Thanks @msteffens for pointing that out!
Also interested to see that integration come to life.
Though I would favor staying in one app as much as possible, that one being DT atm.

Great question, @ngan! It’s often taken as given that we all develop an idiosyncratic process/workflow that is built around the requirements of our own fields/projects/methodologies. While this is true, by not talking about them (and asking these kinds of questions!), we’re missing out on important opportunities that could make our own processes work more effectively. In fact, it was while searching for answers to a very similar question that I first stumbled across DT.

@SebMacV, I had never come across Zettelkasten before you mentioned it earlier in this thread – and, of course, a simple Google search took me down a very deep rabbit hole. It seems like a pretty solid technique, and one that DT seems suited to. What was your experience of it?

1 Like

does the imac have an ssd?

Thanks to @rkaplan, @SebMacV, @benoit.pointet, and @richard.d for sharing the experience.

It seems that the search for a suitable method to recall knowledge is an ongoing process/challenge. I have been experimenting to use a custom script for taking note snippets on a journal article/book. By using the script, those snippets are consolidated under its original source on the one hand (it is almost like each article/book is having its own shoebox of index cards) and all snippets from different sources can be filtered and recombined into a consolidated summary by way of tags-selection or keyword-searching. I found this method quite effective for me.

However, while I believe that I have largely solved the puzzle from a technical aspect, I also found that the real challenge is methodological. Even when I have a highly organised superset of shoeboxes and the index cards in each shoebox can be recalled effortlessly by any combination of tags and search, the challenge is how the tags and keywords should be designed. In other words, the system of knowledge-retrieval is only as good as the way a researcher is categorising knowledge in their mind. Here, I think we are potentially in a catch 22 situation: we couldn’t have established a well-thought categorisation of knowledge in our mind at early academic career, and it would a rather impossible task to rearrange/redo our knowledge/notes once we have established such categorisation. Perhaps AI will take over the task of knowledge retrieval for us in the near future, while a human can keep focusing on seeking the right question and the methodology for answering the question.

Just sharing my 5 cents.


Indeed a catch 22.

But don’t underestimate the value of the process of reviewing those boxes and maintaining your system.

I use heavily replication and smart groups, have 1000 different topics covered in my main reference DB and everytime I take the time to curate/review a corner of it, I find it an immensely useful experience. Like clearing my mind.

First an aside in response to bosie’s question, one advantage of the Mac is that you can easily run MacOS from an external drive. I have a 2011 iMac, with HDD, but to speed it up I have plugged an external SSD into the Thunderbolt port, and it runs a lot faster.

But the main point is to echo the point that DTP can be used in different ways, and that cgruneneberg’s suggestions for searches should yield some useful pointers. I have been using DTP for ages to manage the photos I take of documents in archives. For me the main function was processing photos of single pages into multi-page searchable PDFs and then searching for documents on certain topics (unfortunately, with 50-year old carbon copies, OCR often fails, so manual search is necessary as well). It is excellent for this, and my basic group structure corresponds to the archives (as in National Archive, Harvard University Library, Duke University, …), Collections, Boxes and Folders from which the documents were taken.

I also dump a lot of other stuff into DTP, because it is searchable and I won’t (I hope) lose it. I started with a single database, though when it got too big, I split it, because it was just a pain moving and backing up a file that contained over 100GB of data. This is one advantage of indexing files. I have something like 120GB of documents, but the database indexing the lot, enabling me to search, add keywords, create and group replicants, is under 7GB.

The slow but thorough method for me was to dump stuff I am going to use from DTP into a bibliographic database, first Sente, until support for that ended, and then Bookends. I say slow, because if the documents are “home-made” from photos, bibliographic data has to be typed in for each one, and that is a slow process even with software such as Textexpander to provide shortcuts. So my workflow was DTP then Bookends and then into Scrivener (very highly recommended for the final writing). Now I am trying to identify more relevant stuff in DTP being more selective in what I drop into Bookends.

Another problem was the note-taking. I have done some in Bookends, and I created dozens of Groups in Bookends, though was not finding that perfect. So I began to explore other options, such as linking all my Bookends references back to DTP, so I could use DTP to sort stuff into groups, and create notes that would be the basis for writing. Avigail Oren and a colleague have developed a workflow that looks very interesting if you make much use of archival material (her target is historians, and you should find it via the searches cgrunenberg mentioned), and had I been starting from scratch I would probably have adopted it, but I decided that jumping to it mid-stream, when I already have around 28k refs in Bookends, and 70k docs in DTP, does not make sense.

Instead, I decided to add Tinderbox to the mix, and to try using that to organize materials and notes. The key is dragging references from Bookends into Tinderbox, where I can create a visual map of how documents link together, along with other notes I create. It seems to be working with the one topic I have tried it on, though I have yet to find out whether it will continue to work as I venture into other topics and try to create a map with far more complex interlocking networks of connections linking the various topics within my project.

So there are reasons for holding data in three apps, because they all do different things well, and for me the key is being able to go easily between a document stored in DTP, Bookends and Tinderbox. I do a search in DTP, find stuff that I either locate in Bookends or add to Bookends, from where it gets put into the network in Tinderbox. A bit chaotic, but as I have said, it proved harder than I thought to switch to a more rational system, making better use of DTP’s capabilities, mid-project. Hence my interest in picking up ideas from this thread.

1 Like

@msteffens That does look very interesting. I’ve been looking for the right tool to extract PDF annotations for a very long time and this would be a powerful possible solution. However it doesn’t look like much progress is being made? Nice professional looking website made with last blog post in 2017 and forums with no traffic and last posting in August of '19. I signed up for the notifications email list but am not very optimistic??

@greasemonkey Thanks for the feedback. The app is under continuous development, and I’m very much dedicated to bring this app into release state. However, progress is unfortunately rather slow due to this being a part-time effort, and due to real-life issues (like multiple losses in my family). Still, I’m working on this as much as I can.

1 Like

Besides keeping my notes atomic, self-contained and properly tagged, it also helps me to directly link notes with each other. In addition, I find it helpful to add overview/structure notes where I can organize individual notes for a certain topic/project into a hierarchy, and add sub-headings, comments, etc.

The goal is to create a network of semantically rich and linked notes. This network can then be visualized which further helps me not to get lost. Here’s an example which visualizes some reading notes.

1 Like

What software do you use to create that visual map - Keypoints?

Is there any way for you to share a set of notes in such a diagram format, including the links - or is the end result only for your own personal use?

I am pretty much in tune with @SebMacV who gave a nice outline. I have two databases several Groups in each, I go two levels deep, almost on principle. That is Groups within a database and a group within a group. I don’t even use tags now. I find search does the work for me and I just put everything in DEVONthink 3 for academics and research. I don’t index and blah di blah. I have way too much stored too. I use Houdah Spot via Alfred and I can find anything easily that way I have found, so much so that there are one two things recently I had forgotten to put into DEVONthink 3 and I hadn’t even realized, I was looking and opening them from HS and Alfred.
I do have a prefix system for anything I write myself, or heavily edit. That is in the file name, is quite complicated but used by me for years now and again I can just ‘call it up’ easily. I could, in truth do it with DEVONthink 3 and Finder alone these days probs. Big set back for me was when Apple shielded Mail from HS etc. I wish that was back. Though if anything is of purely research interest I just put a mail into DEVONthink 3 and it gets found that way. There is no clear principle on that one though and I will lose something one day. Good luck with the research.

1 Like

Yes. It gets dynamically created from your currently displayed notes as you browse, filter & select them. This little screencast shows this in action:

The end result is a regular PDF (generated using Graphviz). The elements in the PDF have keypoints:// URLs (so that clicking any of them will select the corresponding note in Keypoints). But I could also envision an exported PDF version where the URLs have been set to the corresponding x-devonthink:// URLs, or similar.

Also, since this is done using Graphviz, I guess that a script could also generate a similar graph output directly from any inter-linked DEVONthink records.

Or did I misunderstand your question?


Have you thought of a way to publish such a graph on the web, either publicly or privately?

The one stumbling block I realize I have encountered when I have looked at similar programs previously is that there is no good way to share my map or linked notes with a colleague or client. I would think that most academics or proessionals using something like this would want to share their work product with a student or another professor or as a post on a discussion forum or with a client etc. It looks gorgeous but if only I can see it (or if other people have to set up special software to see it) then that limits its use considerably.

Yes, in the future, I’d like to enable sharing via static site generators. The generated graph could be part of such a shared collection of notes, and I think that it could be made interactive. Graphviz, for example, supports a wide variety of output formats, some of which (like image maps & PDFs) also allow to add URLs to nodes & link labels.

A more straightforward way of sharing such a graph could be to embed all notes within the PDF, e.g. as PDF annotations on top of their corresponding graph nodes.

In any case, I agree that the ability to share one’s network of linked notes is important (and likely for some a hard requirement).

Could we take a step back? How do you use DTPO for reading PDFs when a single article may be read differently depending upon what one is thinking about (switching hermeneutics)?

I currently don’t use DT for storing PDFs, though it appears I’m missing out. At the moment, I have multiple copies of articles in different folders in Finder. This prevents one reading of an argument from interfering with another. I think if I use replicants, the annotations would just pile up, no?

My contribution for Richard.d.: I use DT for 43 folders. I find other methods of tracking tasks don’t tell one when something needs to get done. Scrivener is for writing, of course. This academic year, I’ve begun using Scrivener for preparing courses, as well. Too early to tell if this is a good idea or if I just lost a bunch of time transferring notes.

I might add that, yes, I use DT for “all” of my research notes. I still have some great analogue notes in files that need to be scanned and uploaded to DT. I’ve already done this with some essential items.

I’m also playing with the idea of taking notes by hand in Goodnotes, and then uploading them to DT. I just like reading and thinking with a pen in my hand. The problem is the handwriting recognition is not as good as recognizing typeface. If I change my mind about the notes, I can still have them up on the screen while I type them into DT.

I think if I use replicants, the annotations would just pile up, no?

A replicated file only has one associated Annotation file.

Yes, of course - I couldn’t agree more. I don’t much hold with the idea that once you read a source and take ‘good notes’ (whatever that means), you don’t ever have to read this again. I return to the same thing often for different reasons.

Another example relates to most of my primary material. If one might dream you could extract all your need in terms of secondary materials, my research is on literary texts, much of early-modern poetry. The notion that you can ever really ‘exhaust’ a poem is of course ridiculous, and so I need multiple ways of reading and annotating that need also to be continually updated, as and when my research questions change; as I learn more; as I notice different things, and so on.

I find myself quite guilty of looking to design a perfect note-taking system to capture a reading experience that is in itself very organic and so very hard to organise without closing down future interests. I think I can do much better at keeping records and making my thoughts from any given time more retrievable and useful (this is my current mission), but I’m under no illusions that I can pre-empt the questions that future-me would like to ask!

@ngan’s response above comes perhaps closest to what you suggest and have solved in a different way (by keeping multiple copies of the same thing in themed folders). I’ll be trying to implement a version of what @ngan describes: keeping reading notes anchored to the source, but extracting from these the themed snippets to allow for another way of cross-referencing. Either by adding these snippets to a themes folder (shoebox, if you like), or tagging. Tagging appeals less to me though since you can only tag a document and not a section of that document: so, here @msteffens ‘atomic’ is the key at least for extracted snippets. Keypoints looks very promising!

Anyhow – good luck folks. I better go do some reading rather than design this note-taking process to death :wink:


Goethe said: “Ordinary people don’t know how much time and effort it takes to learn how to read. I’ve spent eighty years at it, and I still can’t say that I’ve reached my goal.”

Thank you SebMacV!


Many thanks to @richard.d for starting this thread and to others for their contributions to a very interesting discussion. @SebMacV, I really enjoyed reading this article by Keith Thomas, much appreciated!

I’m in the process of writing a PhD in the humanities and wanted to share a bit of my working method, and how I am thinking of developing it in the future.

I am committed to Zotero, so I keep all of my PDFs there, using ZotFile to manage the file naming. Every time I read an item, I add a Word doc to the Zotero item. If I’m doing a good job, this document includes both a running summary of my notes (including relevant quotes with a page number, unfiltered reactions, and so on) as well as a paragraph that summarizes the text. I started using Word for this task because I liked the convenience of adding my Zotero citations right away, so that when it came time to writing, I would simply copy and paste and be done. Now I’m moving to Scrivener for my writing, so this seems less relevant. Like @SebMacV, my plan is just to do the Zotero citations once I compile into Word. (I played around with Scrivener/Zotero integration but nothing was satisfactory.) I’m sure that taking notes in Word may be a horrible idea, perhaps I could be convinced to switch to rtf at least, but I’m not about to leap into markdown. Either way, I do spend time poking around folders in Zotero, so I like having my PDF and the notes right there. I have a color coded highlighting system across PDFs and notes for “regular highlight,” “proper name,” “claim,” “meta/high level argument,” “transcendental/highly quotable text.” I use Preview to annotate PDFs simply because it’s easy to select these colors.

On the DT side, all my my files in Zotero are indexed, and the file extensions from Zotfile pick up my Zotero folder names, so that is data I can use either to sort items into relevant folders on import, or to make smart groups later. I often start my notes off with a high-level hashtag that I just type in to the body of the text (#project) so that every new reading note I make can get sorted into an top-level inbox of notes for that project. Right now it’s mostly just my dissertation so I catch everything there.

When I first started using DT a few years ago, I was scared of using replicants but now I think redundancy is extremely helpful, precisely to deal with the important question that @ngan raised, namely how to retrieve the stuff that you put in. I use tags very sparingly, but will replicate items (reading documents, or notes that I’ve jotted to myself) liberally in groups. I guess I have always been comfortable with a file tree structure – the old Windows Explorer folder view was ingrained in me from long ago so that’s just what I like.

This discussion (and the Thomas article especially) has made me realize that I should stop keeping things self contained in my reading notes and break out important quotes into separate note files (would definitely do this in rtf not docx, I’m not a monster) that would live in relevant topic folders. That way I could do more of the quote shuffling from which to draw an argument.

Basically my philosophy these days is to try to trick myself as much as possible to move my eyes over what I’ve actually written. I keep a monthly journal document, in which I’ll jot completely random ideas, copy in relevant emails I’ve written to people, transcribe voice notes I leave myself with DTTG, anything else that crosses my mind. Sometimes I start writing an article annotation in here and copy it into the new file. This thread is making me realize that I should copy more stuff out of it. Anyway, at the end of each month I print it out, hole punch it, add it to a binder with all the other entires, and then annotate it by hand using a highlighter and colored pens. Some of the best insights for my project have emerged from this process!

This is all to say that I’m always experimenting with things, but I feel like my system (such as it is, lol) is moving in the direction of better retrieval. I really like the idea of breaking more and more things out of my longer documents, just to have more things floating around for possible discovery through keyword search, AI suggestion, or just simply going back to look at all the items I created in a certain date range (i.e. to see everything that I was looking at / thinking about, in a global sense, when I was working on one thing or another).

I can’t believe I actually typed all this out, but I will be happy if some part of it helps someone…