Suggestions/wishes about Concordance

Like I said in another thread (read here), Concordance is, by far, the feature that I use the most.

3 suggestions/wishes:

A case-sensitive/case-insensitive option (I’m repeating myself, I know)
A filtering option, with words to exclude, for example grammatical words (a[n], the, in, this, these, etc.)
DT Pro indicates that the word xxx is found 251 times but I’d like to have, in the drawer listing, the number of occurrences in every file I have in my database.
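
To make the three wishes concrete, here is a rough Python sketch of what they would amount to outside DT Pro. It is only an illustration, not how DT Pro works internally: the function names, the tiny stop-word list and the plain dictionary of document texts are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop list (assumption); a real one would be much longer.
STOP_WORDS = {"a", "an", "the", "in", "this", "these"}

def word_counts(text, case_sensitive=False, stop_words=STOP_WORDS):
    """Count the words of one text, optionally folding case and skipping stop words."""
    # Letters only, with an optional apostrophe part so "don't" stays one word.
    words = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text)
    counts = Counter()
    for w in words:
        if w.lower() in stop_words:
            continue
        counts[w if case_sensitive else w.lower()] += 1
    return counts

def occurrences_per_document(docs, word, case_sensitive=False):
    """Map each document name to the number of times `word` occurs in it."""
    key = word if case_sensitive else word.lower()
    result = {}
    for name, text in docs.items():
        n = word_counts(text, case_sensitive, stop_words=set())[key]
        if n:
            result[name] = n
    return result
```

Calling `occurrences_per_document(docs, "Ibsen")` on a dictionary of `{name: text}` pairs gives the per-file numbers I would like to see in the drawer listing.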

I once wondered how many times the word “the” occurred in each of the approximately 20,000 documents in my main database. :slight_smile:

Just kidding. But as I rarely use the Concordance itself, I’m curious about how users work with it. Database Properties lists 427,179 unique words in my database, so I find the scrolling list a bit daunting.

I do very often track terms in my documents. In the Search window I find uses for the Context button, which lists ‘topics’ in the search results. I often select a word and press Option to look at a list of other documents that contain the word. Or I use the Words button for an open document, which is a kind of mini-concordance for that document and lets one play around with some of the words and other documents that contain a chosen word.

I’m usually looking for relationships between items (whether they be facts or concepts) that I hadn’t recognized.

That’s exactly why, IMHO, words like this one should be filtered.

Same here. That’s one of the reasons why I purchased DT 8)
The more I use the Concordance, the more I like this function. Especially with its search window and the drawer listing. To paraphrase the way young Americans speak on the internet: Way cool! :laughing:
Seriously now, I don’t understand what the graphic does??
And what does “weight” mean here?

The ability to see where each word is in a text would increase the concordance’s utility. An easy example of this is the way that OS X’s Preview displays a list of the terms and where each manifestation is located in the selected text or file; one can then click that selection and be taken to it immediately. In other words, what is proposed is a merge between the current concordance (as currently available for an individual document) and the “Find in Database” search utility, but in which one could search for a term per text, not only by group or by the entire database. This is the normal use of printed concordances (books): statistics and a listing of where the word occurs.
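
For anyone unfamiliar with the idea, a listing of this kind is often called "keyword in context". A minimal Python sketch, assuming plain text and a made-up function name (`keyword_in_context`); it says nothing about how DT Pro or Preview actually do it:

```python
import re

def keyword_in_context(text, term, width=30):
    """Return (character offset, surrounding snippet) for each whole-word match."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        start = max(0, m.start() - width)      # up to `width` characters before the match
        end = min(len(text), m.end() + width)  # up to `width` characters after the match
        hits.append((m.start(), text[start:end]))
    return hits

for offset, snippet in keyword_in_context("Ibsen was a writer. Many admire Ibsen.", "ibsen", width=10):
    print(offset, repr(snippet))
```

The offset is what would let a click jump straight to the occurrence; the number of hits is the statistic the Concordance already shows.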

The above would become even more useful if one was able to select multiple files, or a group or groups, and have the concordance provide information about the selected group/s or files. The current limitation of either single text or entire database limits the concordance. For example, I have certain groups in which an entire book is in a group, and each chapter is a separate file in that group, or the chapters are further grouped into subgroups. As it stands now, I am unable to use the concordance on this “book”, which means that the concordance cannot be used as a true concordance on all of my texts. (I know there are some work-arounds, such as exporting the groups to their own database, but this is not a request for work-arounds, but rather for fluid and intuitive solutions within a user’s main database.)

Agree. That would be a very cool feature. Maybe it could be a third button next to “Search” or “Similar”; a “Context” button or something?

I often need to see if a word is close to another word, or to see which words are close to a particular word. It would be useful if I could click a word in the Concordance list, and then use a slider to see which words are close to it.

Example: When the slider is at 5, it shows a new list of the words that occur at most 5 words before or after the one I have selected.
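
Roughly what I have in mind, as a Python sketch. The `distance` parameter plays the role of the slider; the function name and the simple letters-only word splitting are assumptions for illustration, not anything DT Pro provides.

```python
import re
from collections import Counter

def neighbours(text, target, distance=5):
    """Count the words found within `distance` words before or after `target`."""
    words = [w.lower() for w in re.findall(r"[^\W\d_]+", text)]
    target = target.lower()
    nearby = Counter()
    for i, w in enumerate(words):
        if w != target:
            continue
        lo = max(0, i - distance)
        hi = min(len(words), i + distance + 1)
        for j in range(lo, hi):
            if j != i:  # skip the selected occurrence itself
                nearby[words[j]] += 1
    return nearby
```

If the selected word occurs several times, a neighbour that falls inside more than one window is counted once per window, which seems a reasonable way to rank the list.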

Also - I find it strange that Concordance lists a word the number of times it’s replicated. I have gotten the explanation for this in another thread, but still think it’s strange.

If I put this sentence in an RTF file…

“Ibsen was a writer.”

…it shows 1 instance of Ibsen in the Concordance list.

If I then replicate it 10 times to different folders, “Ibsen” now gets a Frequency of 10.

If I had duplicated “Ibsen was a writer”, that would have made sense to me. But since I’m replicating it, I don’t understand why it counts every replicate as a new entry.

If I edit it to “Shakespeare was a writer”, “Ibsen” is now gone from the list, and “Shakespeare” has 10 entries.
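
The two ways of counting can be put side by side in a small Python sketch. This is purely a model of the question, not a description of DT Pro's internals: the `(location, doc_id, text)` tuples, the shared `doc_id` for replicants and the `count_replicants` switch are all assumptions made up here.

```python
import re

def frequency(entries, word, count_replicants=True):
    """Total occurrences of `word` over (location, doc_id, text) entries.

    Replicants are modelled as several locations sharing one doc_id;
    with count_replicants=False each doc_id is counted only once.
    """
    word = word.lower()
    seen = set()
    total = 0
    for _location, doc_id, text in entries:
        if not count_replicants and doc_id in seen:
            continue
        seen.add(doc_id)
        total += re.findall(r"[^\W\d_]+", text.lower()).count(word)
    return total

# One document replicated into ten folders (hypothetical names).
entries = [(f"folder{i}", "doc-1", "Ibsen was a writer.") for i in range(10)]
print(frequency(entries, "Ibsen"))                          # 10: every location counts
print(frequency(entries, "Ibsen", count_replicants=False))  # 1: the item counts once
```

The first number matches what I see in the Concordance; the second is what I would have expected for replicants, while a true duplicate would get its own `doc_id` and count either way.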

BTW, Concordance is now a word on Amazon book pages too. Here’s an example: … oncordance

It shows nice statistics on words in books that are searchable in fulltext on Amazon.

It has a Readability section, showing the different indexes for books, and rating them compared to others. I’m not sure if this would be any use in DT, maybe for writers?

Then a Complexity section, showing how many complex words, syllables per word, and words per sentence.

Then a Number of section, showing characters, words and sentences.

And finally Fun stats, showing silly things like Words per Dollar (!) and Words per Ounce (!!!).

Why does that seem strange?

DT Pro is keeping track of all the words used in the database. By replicating a document containing the term “Ibsen” or “Shakespeare” multiple times, you have made use, perhaps, of the ability of DT Pro to place your document in multiple locations in your database. DT Pro must keep track of that. Would you find it strange if you had used the duplicate command instead of the replicate command? In either case, you have increased the frequency of the occurrence of the term in your database.

Also remember that the term “Concordance” is used in many ways, to different effect. Amazon uses the term differently than does DEVONthink Pro. Neither uses the term as it is used when one is talking about linguistic research.

I did an Internet search for Concordance software. Based on that search, one conclusion is that there’s not much of a market out there, so that expending resources to make DEVONthink’s linguistic analysis features more powerful for linguistic analysis purposes wouldn’t be wise and would detract from development for its primary focus. DEVONtechnologies is a small firm.

I rarely recommend an alternative or supplement to DEVONthink Pro, for obvious reasons. :slight_smile:

But here’s a reference that might interest you: It provides an interesting introduction to the problems addressed in approaching textual analysis (read development time and effort), the current status (and limitations) of approaches and at the bottom links to software for Mac OS X.

One of the most serious limitations I would see for using DT Pro’s Concordance for anything approaching linguistic analysis is that the Concordance can’t handle phrases. If you need to do that, take a look at the supplied URL and do some additional Web searches.

Yes, I still find it strange. If I duplicated it, I wouldn’t find it strange.

But let me think about it. I’ll check out the links, and see if I learn something new (I’m sure I will).


It will probably cost thousands of man-hours, and basically help two people – Cheepnis and me – but I’d like to second Cheepnis’s suggestions…

It’s all I’ll ever ask for. I promise.

Incidentally, I find the concordance a great Devon feature and easy to use.

The ease of importing Word documents makes Devon a dream (other concordance programs I’ve used require that you first convert Word files to .txt files). It’s a cinch to check for frequencies of words in a number of documents at once. And it’s easy to add and remove files from the list as documents are updated.

My work doesn’t involve finding the words in context; however, I do need to check for the number of uses of a list of words. Devon makes it easy to do this with the “groups” column in the concordance. (I just make a separate group with a note containing the list of the words I’m looking for. I then sort the concordance by group, and, voilà, there’s a list of words I’m looking for right at the top, along with their frequencies – true, with one more use, but, hey, what do I want? Perfection?)

But, like Cheepnis, I’d love it if – when I double clicked that word, and I watched that clever little drawer slide out – there was not only a list of documents that contained the chosen word, but also the number of uses in each of those documents.

Cheepnis is also right about the capitalization thing, it’s a drag, and I could mention that there is also a problem with contractions. But on the whole the concordance is a great feature and with a few tweaks it will not only help the two of us, but I would be able to recommend it to people I know who are doing the same kind of work.