Per this thread, it appears that the “See Also” algorithm doesn’t search the “Name” field of a document. I’d love a checkbox in the preferences that would allow the algorithm to take this field into account.
Thanks for the suggestion, we’ll consider this for upcoming releases of course. Is anyone else interested in this (or usage of other metadata like URL, author etc.)?
Yes, please - name.
Great! I hope so. For the last couple of years, I automatically assumed the “See Also” algorithm included the “Name” data, and, while I suppose I could write a script to copy that data from the “Name” field and drop it at the bottom of each text field, I’m reluctant to muddy the waters.
keywords, tags, subject, comments (Spotlight and document), and author – any metadata that’s likely to be added by the source of the document (esp. PDFs, OpenMeta tags). It would be helpful if the metadata fields searched by See Also could be activated or switched on/off on a search-by-search basis.
korm is (almost*) exactly right. I was very surprised to learn that DTPO doesn’t look at those additional fields already! *One small added thought: Perhaps we might have control over both a default setting of which metadata is considered and also per-search control.
Hmm. If it were accepted that meta-data were a part of the “See also” algorithm (presumably subject to checkboxes in Preferences and search-by-search, as suggested above), a key issue would be weighting. It would be pointless if, say, the name had the same weighting as a piece of body text of similar length. I don’t know what an appropriate weighting would be; perhaps either/or is the solution.
I presume the weighting would be the same as if the title were part of the block of text, so a six-word title on a 60-word note would carry about 9% of the “weight” (60 + 6 = 66 words, and 6/66 ≈ 0.09, or one-eleventh).
If anyone who actually works on the DTP see-also algorithm is reading this, they’re probably laughing at the ludicrous simplification, but I believe this is approximately how I’d want the system to work.
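To make the arithmetic concrete, here is a minimal Python sketch of that length-proportional idea. It is only an illustration of the simplification described above, not how See Also actually works; the function name `title_weight_share` is made up for this example.

```python
def title_weight_share(title_words: int, body_words: int) -> float:
    """Fraction of the combined word count contributed by the title.

    Assumption: the title is simply treated as extra body text, so its
    weight equals its share of the total number of words.
    """
    total = title_words + body_words
    if total == 0:
        raise ValueError("document has no words at all")
    return title_words / total


# Six-word title on a 60-word note: 6 / 66, i.e. one-eleventh.
share = title_weight_share(6, 60)
print(f"{share:.1%}")  # 9.1%
```

Note how quickly the share shrinks as the body grows: the same six-word title on a 6,000-word document would contribute roughly a tenth of a percent.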
Although this would be possible, another approach might be more useful: Compare the similarity of the titles of item A & B & the similarity of the texts of item A & B and then calculate the average similarity (not necessarily an arithmetic average).
This wouldn’t depend on the length of the title/text whereas in your example the title would be usually almost meaningless as the text is much longer.
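A rough sketch of that two-score approach, in Python. Everything here is an assumption for illustration: cosine similarity over plain word counts is just a stand-in for whatever See Also really computes, and `combined_similarity` and its `title_weight` parameter are hypothetical names. The sketch uses a weighted arithmetic mean, which is only one of the possible ways to average.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a deliberately crude text model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two word-count vectors (0.0 if either is empty)."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(title_a: str, text_a: str,
                        title_b: str, text_b: str,
                        title_weight: float = 0.5) -> float:
    """Score titles and texts separately, then take a weighted mean.

    The title's influence is set by title_weight (hypothetical knob),
    not by how long the title is compared with the text.
    """
    title_sim = cosine_similarity(title_a, title_b)
    text_sim = cosine_similarity(text_a, text_b)
    return title_weight * title_sim + (1.0 - title_weight) * text_sim
```

With `title_weight=0.5`, two items with identical titles and completely unrelated texts score about 0.5 no matter how long the texts are, which is the length-independence described above.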
Personally, I’m happy with the limitation of See Also to examination of only the contents of documents, excluding metadata such as Name, Subject, Keywords, Tags, etc. I make a lot of use of See Also (and the related AI to see selected text) when looking for conceptually similar documents in a database. And I prefer limiting See Also to a single database, rather than operating across multiple databases, as I deliberately design my databases to take advantage of such specificity.
I don’t view See Also as a means of generating highly specific search suggestions, such as identifying invoices related to names of clients, or cataloging a collection of nuts and bolts in hardware store bins, e.g., by threading, size, metal alloy, corrosion resistance, etc. For those purposes, I would use DEVONthink’s excellent search features, perhaps including tags or keywords as descriptors. When I use See Also, I expect to see a list of suggestions that are related (contextually > conceptually) to a document that I’m viewing, and place the highest value on suggestions of relationships that I wouldn’t have thought of, a Eureka! moment in thinking about a topic.
Once in a while, users request means of “tilting” the operation of See Also to change its performance, such as the suggestion to consider a metadata element such as Name (or other metadata) in creating a list of possibly related documents. Christian commented above that it might be possible to modify See Also to also consider similarities among Name content as well as document content, but then, I suspect, users would further request variations of the weightings assigned to Name and Content. And users employ many different approaches to assigning metadata such as Names to documents. Some use a text string related to document content and/or purpose, others may use a numerical or outline naming system, such as 1Ab23 or other variations.
Yes, there are times when I would like to restrict the See Also analysis to only a set of documents that meet criteria that I can specify, such as a phrase in document Names, a date, a tag, a keyword, etc., and perhaps to do that across multiple databases. I might filter documents by filetype, or maximum word count if desired.
First, I would create a search query that restricts the list of results to my criteria. If I might wish to do that query repeatedly, as new content is added to my databases, I would save the query as a smart group (just click on the “+” button to the right of the query field in the full Search window, to create a new smart group).
Next, to use See Also to analyze only the content of documents that meet the criteria, I would copy the search results documents to a new database created for that purpose, select a document and invoke See Also to look for others in that database that are contextually similar. I might retain that database for a time, if it’s useful, or empty it of contents so that it’s available for a similar operation in the future, or delete it entirely.
Yes, this approach is one of my infamous kludges, adapting the existing tools in DEVONthink to meet a desired objective. But rather than “enhancing” See Also by adding analysis of one additional feature such as Name, it’s adaptable to filtering the documents to be considered by See Also for a number of possible criteria, and yet can be done quickly and without creating additional complexities in DEVONthink’s Preferences options (and programming headaches for Christian).
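For anyone who thinks better in code, the filter-first workflow above boils down to two steps: narrow the pool by metadata, then compare content only within that pool. Here is a minimal Python sketch under loud assumptions: `Doc`, `see_also_within` and the Jaccard word-overlap score are all invented for illustration and have nothing to do with DEVONthink's actual data model, scripting interface or algorithm.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Doc:
    """Hypothetical stand-in for a database item."""
    name: str
    text: str
    tags: set[str] = field(default_factory=set)


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-set overlap; a crude placeholder for a real similarity measure."""
    wa, wb = words(a), words(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def see_also_within(selected: Doc, docs: list[Doc],
                    criteria: Callable[[Doc], bool],
                    limit: int = 5) -> list[Doc]:
    """Step 1: keep only documents matching the metadata criteria.
    Step 2: rank the survivors by content similarity to the selection."""
    pool = [d for d in docs if d is not selected and criteria(d)]
    pool.sort(key=lambda d: jaccard(selected.text, d.text), reverse=True)
    return pool[:limit]
```

The metadata never enters the similarity score here. It only decides which documents are eligible, which mirrors copying search results into a scratch database before invoking See Also.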