Academics in the humanities (non-programmers at least) tend to think of metadata like the fields in bibliographic software or a library catalog database: a book (the “data” being the content of the book) has data about the data (“meta-data”), i.e. an Author, Year Published, Publisher… which is separate from the content of the book itself. In historical research, if I have a transcript of a letter, the metadata could include the bibliographic info (the book it was published in, the book editor, date published…). It should be separate from the content of the letter, and also (in a pure sense) separate from whatever tags or keywords I might assign to the content of that letter - those tags/keywords for a single file are much more ephemeral, whereas the metadata in this scheme shouldn’t really change once assigned. That usage may differ from how programmers use the term, but it’s somewhat compatible with at least some of the OS metadata properties (Author, Title).
As the one who requested the feature, I come from relational database-land (MS Access), where you can define valid formats for each field and create any number of fields. For example, in the case of historical research, I’d ideally want a variety of date “fields” for each document: the system’s Date created, added and modified fields for each file of course, but also additional date fields like Date document written, Date document received, Date of event discussed in the letter… They really need to be separate fields to keep them straight (a date is not a date…), vs. just throwing a bunch of number strings in the Spotlight Comments or in the title of the document. Ideally the software would be smart enough to treat them as actual dates and not just text or plain numbers. Check out my blog http://www.jostwald.wordpress.com for more detail on what historians would like to see.
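To make the “separate, typed date fields” idea concrete, here is a rough Python sketch. It is purely illustrative: the class name, the field names and the sample values are my own invention, not anything DT or OS X actually provides. The point is only that each date lives in its own field and is stored as a real date, so bad input is rejected and dates can be compared.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LetterMetadata:
    """Hypothetical per-document record; every field name is illustrative only."""
    author: str
    recipient: str
    place: str
    date_written: Optional[date] = None
    date_received: Optional[date] = None
    date_of_event: Optional[date] = None


def parse_date_field(text: str) -> date:
    """Accept only valid ISO 8601 dates such as YYYY-MM-DD; raise ValueError otherwise."""
    return date.fromisoformat(text)


# Made-up sample letter.
letter = LetterMetadata(
    author="John",
    recipient="Fred",
    place="France",
    date_written=parse_date_field("1701-03-14"),
    date_received=parse_date_field("1701-04-02"),
)

# Because the fields hold real dates, date arithmetic works directly.
days_in_transit = (letter.date_received - letter.date_written).days
```

Keeping “written” and “received” as distinct fields is what lets you ask questions like how long a letter took to arrive, which a pile of number strings in a comment can’t answer.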
As for how to implement it, I’d really want these fields to (at least) display as columns in three-pane view, like the system metadata fields currently can. That way you can sort documents by the various dates. For example, even if DT’s AI finds a whole bunch of documents relating to the same topic, I want to examine them over time, by location, by author, by recipient… (i.e. sort them by column). I don’t think you can sort results by group, and the current Tag column setup with all the tags displayed in one column (and no way to control the order of the tags) is of almost no use if you’re using multiple tags.
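What I mean by sorting on columns can be sketched in a few lines of Python. This is just a sketch under my own assumptions: the result rows and the column names (“written”, “place”, “author”) are made up, and nothing here reflects how DT stores or sorts anything internally.

```python
from datetime import date
from operator import itemgetter

# Hypothetical search results; the column names are made up for illustration.
results = [
    {"title": "Siege report", "author": "John", "place": "France", "written": date(1701, 5, 2)},
    {"title": "Supply request", "author": "Fred", "place": "England", "written": date(1700, 11, 20)},
    {"title": "Reply on supplies", "author": "John", "place": "England", "written": date(1700, 11, 20)},
]


def sort_by_columns(rows, *columns):
    """Return rows sorted by the given columns, in priority order."""
    return sorted(rows, key=itemgetter(*columns))


# Same result set, examined over time and then by location.
by_time = sort_by_columns(results, "written", "author")
by_place = sort_by_columns(results, "place", "written")
```

With one field per column, the same set of AI results can be reordered by time, place, author or recipient without touching the documents themselves.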
Stepping back a bit: As I describe in my blog post Organizing with Devonthink, I think Devonthink gives you six different ways to categorize things - 1) DT groups; 2) DT tags; 3) system metadata file properties like Author, Subject, Keyword; 4) Spotlight Comment; 5) naming conventions; and 6) putting some kind of keywords in the file’s content. The problem is that I need another type of organizational scheme, because each of these six is either intended to serve a specific function or has serious limitations.
We want to use #1 groups for topics/subjects for the AI, which also should rule out #6 (store source information in the content) - I want AI to find things by content, and not be confused by the fact that two different letters came from the same source, information I already know. It’s not clear whether the AI would work well with several different ‘layers’ of groups and it would be more confusing for the user: one layer for geography (Geog group1=France, Geog group2=England…), one layer for chronology (Year group1=1700, Year group2=1701…), one for topic (I currently have a few thousand groups just for topics), one for authors (Au group1=John, Au group2=Fred…), one for recipient (Recpt group1=John, Recpt group2=Fred…)…, with each document replicated in each set of groups. Practically, I don’t need to use the AI for the bibliographic info anyway.
We’ve been told #2 tags should not be used for bibliographic information (because they may be the only place where some records are stored). I do it anyway, but I’d be happy to switch them to metadata fields if those metadata fields were more easily editable (and usable).
#5 naming conventions get too long to read (in search results, columns and smart groups) when you start tacking prefixes on. Ideally the title of the file will summarize the content so you can see that summary in your search results rather than having to read through every file. And you can’t sort by more than the first prefix (again the sorting problem - we want to categorize our results after using the AI, and sometimes without using the AI at all).
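Here is a small Python sketch of that prefix problem. The naming convention (“year_place_author_summary”) and the file names are hypothetical, invented only to show why a plain name sort is stuck with the first prefix.

```python
# Hypothetical naming convention: "<year>_<place>_<author>_<summary>".
names = [
    "1701_France_John_Siege report",
    "1700_France_John_Winter quarters",
    "1701_England_Fred_Supply request",
]

# A plain name sort is always dominated by the first prefix (the year).
by_name = sorted(names)


def split_name(name):
    """Split a prefixed title back into its four parts."""
    year, place, author, summary = name.split("_", 3)
    return {"year": int(year), "place": place, "author": author, "summary": summary}


def place_then_year(name):
    """Sort key that puts the second prefix (place) ahead of the year."""
    parts = split_name(name)
    return (parts["place"], parts["year"])


# Sorting by place first means pulling the prefixes apart again.
by_place = sorted(names, key=place_then_year)
```

In other words, the prefixes are really separate fields squeezed into one string, and you have to unpack them before you can sort on anything but the first.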
That leaves #4 Spotlight Comments doing a lot of work, but many academics have multiple types of keywords/metadata: bibliographic info, keyword info on the geography and chronology of the author and the recipient and the subject under discussion, and of course the topics in groups so the AI can work… The bibliographic info and other metadata can’t all fit in the Spotlight Comments (plus, there’s the same sorting problem as with the Tag column in three-pane view).
The #3 file system metadata properties (Author, Subject…) would be perfect because they are numerous, short, distinct fields - and some of them even have the proper names, like Author… But they are barely editable within DT or with AppleScript, not to mention you can’t use most of those fields in most document types anyway (only email files, which can’t be created in DT). I’d want basic ‘metadata’ fields like: author, recipient, date, place, and a few custom fields would be great as well.
So when people say they want “customizable metadata”, they’re likely thinking of a different way to categorize the data, akin to the OS’s metadata fields, but more controllable in DT. I’d be happy if they were native to DT and not reliant on OS X - that’s already the case with tags and groups anyway.
Sorry to take so long, but I’ve written longer!