What are good group strategies for helping the AI file "outliers" (heterogeneous files)?

anubis · September 21, 2020, 7:48am

Hello

In each of my 10 databases I have the same issue, and I do not know how to handle it.
In any given database, about 85% of the documents belong to (say) n neat groups, and because these documents share similar patterns, the DEVONthink AI does its magic and allocates them properly to those n groups.
The problem is the other 15%: outliers, non-recurring stuff, one-off documents, heterogeneous, with no pattern that would allow me to allocate them to a group even manually, so of course the DEVONthink AI is lost on these.
Also, these 15% are generally not the most important documents, but I want to keep them for reference just in case.

My question is: what is a smart strategy to allocate these 15% outliers without disturbing the whole purpose and magic of the DEVONthink AI on the 85% of documents that are neatly filed and grouped?

Should I create a “not neat” group for all of the 15% unfilable crap and expect that at some point the DEVONthink AI understands that non-recurring, unfilable stuff goes there?
Should I let the DEVONthink AI struggle as it does today, trying to file the crap in the current “neat groups”? I fear that this dilutes the homogeneity of my “neat groups” (by letting the AI add bits of crap to them), which might make the AI less efficient with those groups at some point.
What other strategies should I consider?

I have wanted to ask this for 5 years. It is probably covered somewhere else, but I could not find it.
Thank you very much.

cgrunenberg · September 21, 2020, 8:10am

The best solution is probably to file the documents in the desired groups on your own and to exclude them from See Also & Classify (see Info inspector) so that the AI won’t be affected.

anubis · September 21, 2020, 9:34am

Thanks much
And how does this solution compare to the following alternative: let the AI simply file “outlier files” in the root folder of the relevant database (helping it when needed, though often that is what the AI wants to do naturally anyway)?
In all my databases I have a single core root folder called “ALL Filed”, sometimes with 10 levels of sub-folders below it.
I have noticed that when the AI does not know which subfolder an “outlier file” should go in, it often offers to file it in the root folder “ALL Filed”.
Are there any disadvantages to letting the AI do this vs. your suggested solution?
Thanks much

cgrunenberg · September 21, 2020, 9:40am

For the AI this would be another possibility but for you the advantage of my suggestion would be that the files are located in the desired group (e.g. the root group might become really large otherwise).

anubis · September 21, 2020, 10:14am

What is the drawback of having a large number of files in the root group?

Also, if there is a drawback, here is something that might keep the top root group from becoming super large: I noticed that the AI sometimes offers to file into the top root group, but sometimes it also offers to file into child groups below it. These child groups have several subgroups of their own, but the AI (for good reason) does not offer any of those subgroups, because the file is too heterogeneous compared with them.

What would be the disadvantage (vs. your solution) of letting the AI file into the top root group, or sometimes into child groups when the AI suggests them and that makes some sense, manually helping it when needed?
My objective is to: 1. avoid the DEVONthink AI becoming less efficient at some point at suggesting allocations into my neat groups, where 85% of my files are neatly organized; 2. not lose too much time on the 15% of outlier files (I would rely on DT's advanced search capabilities to find these outliers again anyway).

Thanks much

cgrunenberg · September 21, 2020, 11:06am

Technically none. It just depends on personal habits and needs, e.g. I prefer to have items in the most suitable group and don’t like groups containing lots of unrelated items.

amalis · September 21, 2020, 3:37pm

I have a “Misc” group that I use for stuff that doesn’t fit elsewhere. I usually move stuff into it manually. But occasionally See Also & Classify will figure out it belongs there because it’s similar enough to something else already there.

anubis · September 21, 2020, 4:06pm

Thanks much for sharing, @amalis, that is helpful.
And what you do is different from what @cgrunenberg initially suggested:

You do not exclude your “misc” group from classification. On the contrary, you keep it included, because you expect that at some point the DT AI will work its magic on this “misc” group as well.

That is interesting, and yet another way…

How to use DT for this has been an open question for me for 5 years, and I now see that there are various ways, thank you. Although, to be honest, I still do not quite know which way to choose to finally manage the 15% of outlier files properly (that share cannot be reduced any further), nor what the proper process is for managing a “misc” group.
Thanks again, very helpful in any case.

amalis · September 21, 2020, 4:32pm

Sometimes “perfect” can be the enemy of “good enough”. Just pick a method and see if it works for you - mine did for me!