I assume you are choosing not to use DEVONthink Groups and Smart Groups with the imported (not indexed) files? It seems that by not doing this you are missing a trick.
That is what I do with my books on the “Books” database.
What happens is that I find too many duplicates of the same files, which naturally take up space. To use the bookshelf metaphor: if I use a book for several research projects, it always stays on the same shelf; it is not “duplicated” every time. Surely it’s my fault, but…
Say the Baseball Smart Group rule looks for titles and/or content containing words like “MLB” or “Baseball”, with Kind set to any document. And of course I have Groups defined. Just standard and powerful DEVONthink stuff.
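To make the idea concrete, here is a minimal Python sketch of what a rule like that amounts to: a saved predicate that is re-evaluated against the database, not a second copy of any file. This is only an illustration, not how DEVONthink is implemented; `Doc`, `matches_baseball`, `smart_group`, and the sample library are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    content: str


def matches_baseball(doc: Doc) -> bool:
    """Hypothetical rule: title or content mentions MLB or baseball."""
    keywords = ("mlb", "baseball")
    text = f"{doc.title} {doc.content}".lower()
    return any(keyword in text for keyword in keywords)


def smart_group(docs, rule):
    # Nothing is copied here: the "group" is just the rule applied on demand.
    return [doc for doc in docs if rule(doc)]


# Made-up sample documents for illustration only.
library = [
    Doc("MLB Rulebook", "Official rules."),
    Doc("Kayak Touring", "Paddling basics."),
    Doc("Summer Reading", "A history of baseball in Chicago."),
]

baseball_titles = [doc.title for doc in smart_group(library, matches_baseball)]
print(baseball_titles)
```

The same document can satisfy any number of such rules without ever being duplicated.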
Database “hugeness” doesn’t require indexing, nor does indexing make it less huge.
sometimes I prefer to have books related to a subject in the same folder…
Sometimes? Please clarify this. At a glance, this appears to be something tags could address. For example, I may have a PDF document about kayaking. I tag it boating, outdoor activities, and exercise. One day, I could be thinking about boating, search for tags:boating, and find the document. A week later, I’m thinking I need to exercise more, search for tags:exercise, and find the same document.
And as @rmschne suggests, you can even save these as smart groups, like Exercise or Boating docs.
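As a rough illustration of why tags avoid duplication, the sketch below builds a simple tag index in Python. It is a conceptual model only, not DEVONthink's internals; `build_tag_index` and the second file name are hypothetical.

```python
from collections import defaultdict


def build_tag_index(docs):
    """Map each tag to the set of document names carrying it."""
    index = defaultdict(set)
    for name, tags in docs.items():
        for tag in tags:
            index[tag].add(name)
    return index


# One stored document per entry; "rowing-machine.pdf" is a made-up second file.
docs = {
    "kayaking.pdf": {"boating", "outdoor activities", "exercise"},
    "rowing-machine.pdf": {"exercise"},
}

tag_index = build_tag_index(docs)
boating = sorted(tag_index["boating"])
exercise = sorted(tag_index["exercise"])
print(boating)
print(exercise)
```

The kayaking PDF turns up under both searches, yet it exists only once in `docs`.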
When you export, the exported file is a copy of the file in DT. Any editing changes made to one (the original in DT or the copy in the Finder) are no longer synchronized with the other (the copy in the Finder or the original in DT).
When you index the file, any changes you make in either working environment, Finder or DT, will be synchronized.
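The difference can be mimicked outside DT with a few lines of Python. This is only an analogy, assuming that an export behaves like a plain file copy and an indexed item behaves like a stored path to the one file on disk; the file names are made up.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "notes.txt"
    original.write_text("draft 1")

    # "Export": an independent copy that goes its own way from here on.
    exported = Path(tmp) / "exported-notes.txt"
    shutil.copy(original, exported)

    # "Index": keep only a reference (the path) to the same file.
    indexed_path = original

    # Edit the original after the export was made.
    original.write_text("draft 2")

    exported_text = exported.read_text()
    indexed_text = indexed_path.read_text()

print(exported_text)
print(indexed_text)
```

The copy still holds the old text, while the reference sees the edit because there is only one file behind it.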
Importing and indexing the same file is a way of trying to have your cake and eat it too, except that DT will allow you to do this. Then, at some later point in time, you may have to decide whether you ate the original and kept the copy, or ate the copy and kept the original. Or both. Or neither.
Yes. Decide first why you want to do this and whether you really must do this. See also the discussions at this thread. And search for phrases such as convert import to indexed.
It’s not impossible to clean up duplicates, and switching to an indexed database doesn’t alleviate them either.
To deal with the duplicates, you can:
Enable Settings > Files > General > Stricter recognition of duplicates so you’re looking at file-based copies, not contextual ones (ones where the content is very similar).
Select the built-in Duplicates local smart group in the database.
Select a pair (or more) of duplicates for a document.
Choose Scripts menu > Data > Move Duplicates to Trash. The last copied or created document is preserved. The others are moved to the database’s Trash.
If all is well with it, you can select all the documents in that smart group and run the script to clean them up in the same way.
When you’re done, you should empty the database’s Trash (and do that routinely), as the Trash is still a location in the database.
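The logic of the steps above can be sketched in Python as follows. This is not the bundled script, and DEVONthink's own duplicate detection is not described here; the sketch assumes "file-based" duplicates means byte-identical content (compared with SHA-256) and follows the stated behavior of keeping the most recently created copy. `Record`, `split_duplicates`, and the sample data are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    data: bytes
    created: int  # hypothetical creation timestamp; larger means newer


def split_duplicates(records):
    """Return (kept, trashed): the newest record per identical content is kept."""
    groups = defaultdict(list)
    for record in records:
        # Assumption: byte-identical content counts as a duplicate.
        groups[hashlib.sha256(record.data).hexdigest()].append(record)

    kept, trashed = [], []
    for group in groups.values():
        group.sort(key=lambda record: record.created)
        kept.append(group[-1])       # last created copy is preserved
        trashed.extend(group[:-1])   # older copies go to the "Trash"
    return kept, trashed


# Made-up sample records for illustration only.
records = [
    Record("book.pdf", b"same bytes", created=1),
    Record("book copy.pdf", b"same bytes", created=2),
    Record("other.pdf", b"different bytes", created=3),
]

kept, trashed = split_duplicates(records)
kept_names = sorted(record.name for record in kept)
trashed_names = [record.name for record in trashed]
print(kept_names)
print(trashed_names)
```

Note that the trashed records still exist until the Trash is emptied, which is why that last step matters for reclaiming space.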
PS: If you deleted the Duplicates smart group, it’s trivial to recreate.
In the database, select Data > New > Smart Group and add these criteria…