How to create and maintain a General Index?

I was wondering how I can create and maintain a General Index: one large index that contains the titles, keywords and metadata of all databases I use with DT, even when they are closed or offline.

In my case I’m using the global Inbox to collect data, including items synced from DTTG. After that I tag and organize the data, mostly on my desktop Mac Studio. This involves moving items to project-specific databases and to encrypted Storage Archives for backup purposes (duplicated / copied to cloud storage).

Even though not all databases are available locally or open, I would still like to use DEVONthink search to find or reference the objects stored within those databases. Is this even possible? And if so, how should I set things up?

If not, could this be a feature to include in a future update? I think it would be very useful to see which database should contain the referenced file, the last time DT was able to index its contents, and other information.

Regards,
Maik

Enabling the Spotlight index in the File > Database Properties should be sufficient to search in closed databases.

I am not sure this works. All my databases have the Spotlight index selected in Database Properties. When one of my databases is closed, a search (DT search field, all databases) does not find an item I know is in it. I have tested this several times. Any suggestions? Like the OP, I would like to find items even if a database is closed.

The Spotlight index is only for Spotlight, not for DEVONthink.

And that’s exactly what I’m looking for: searching for items (basic info) within DEVONthink, and finding and/or linking them even when the database containing them isn’t open.

It shouldn’t be necessary to have all your databases open all the time. I thought it wasn’t even recommended to have too many databases open.

But if I can’t find my information when a database is closed, then I still have to remember when and where I stored it. It also reduces the chance of bumping into information and having that spark of insight.

If it isn’t a feature (yet), does someone have a clever workaround…?

Why does it bother you how many databases are open? I have most of mine open all the time, although I need only three of them regularly. Saves a lot of clicks.

In my opinion, what you want makes little sense. Either you need your data, then the database must be open. Or the database is closed, then you can’t use your data. That’s not something DT has invented, btw; it is all over the place, from Oracle through DB2, Informix and MySQL/MariaDB to SQLite. That’s how databases work. Open: usable. Closed: not usable.

Not true. An index can be totally separate from the actual data store.

When you want to gain access to the related information it opens a connection to the storage location, as stored inside the index.

How rich or large the index is depends on the type of solution or implementation.

For instance, Elastic offers Company Search. It basically creates a (large) search index, but leaves the actual documents or other information in their actual locations (DMS, KMS, CRM, Wiki, etc.).
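To illustrate the idea of an index that lives apart from the data store, here is a minimal sketch in Python. It is not how DT or Elastic work internally; the table layout, the function names (`open_catalog`, `add_item`, `search`) and the example paths are all hypothetical. The catalog only keeps a title, keywords, the database name, a storage location and the time of the last indexing run, so a search can still answer "which database holds this?" when that database is closed or offline.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) a small standalone catalog; hypothetical schema."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "title TEXT, keywords TEXT, database TEXT, "
        "location TEXT, indexed_at TEXT)"
    )
    return con


def add_item(con, title, keywords, database, location, indexed_at):
    """Record metadata only; the document itself stays where it is."""
    con.execute(
        "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
        (title, ", ".join(keywords), database, location, indexed_at),
    )


def search(con, term):
    """Return (title, database, location, indexed_at) for matching items."""
    pattern = f"%{term}%"
    rows = con.execute(
        "SELECT title, database, location, indexed_at FROM items "
        "WHERE title LIKE ? OR keywords LIKE ? ORDER BY title",
        (pattern, pattern),
    )
    return rows.fetchall()


# Example entries; names and paths are made up for illustration.
catalog = open_catalog()
add_item(catalog, "Roof repair invoice", ["invoice", "house"],
         "Archive 2021", "/Volumes/Archive/Archive 2021.dtBase2", "2022-01-15")
add_item(catalog, "Project plan", ["planning"],
         "Projects", "/Users/me/Databases/Projects.dtBase2", "2022-03-01")

hits = search(catalog, "invoice")
for title, database, location, indexed_at in hits:
    print(f"{title} is in '{database}' ({location}), last indexed {indexed_at}")
```

A hit tells you which database to open or mount; the catalog itself never needs the database to be available.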

The reason I’m looking for such an implementation for Personal Knowledge Management is that some of my databases contain files that are quite large.

It’s not. Not in RDBMS, nor in DT. The latter stores its index with the data (i.e. in the same package). And it’s not usable outside of DT, which in turn will not use it unless the DB is open.

That is a client server system on an entirely different architecture, with a different price tag, and a different audience. Apples, Oranges.

Why is that a problem? DT is not loading the files into memory unless you open them. Opening/closing a database is a logical process, it does not push all the content into the RAM of your machine.

I just tried it out (and maybe you should do that with your data, too): The difference in RAM usage between an open and a closed database in DT (a small one with about 1400 objects and 452 MB) is 9 MB. I couldn’t care less.

Then that’s your answer: not possible in DT.

So I have to look for another solution. Thanks.

To DT devs: consider a (separate) index, just like your default Inbox, that doesn’t rely on the availability of the actual repositories containing the files, so that DTTG or a second instance on a laptop is able to find them. Or does the Server edition offer this kind of setup?

@cgrunenberg already suggested using Spotlight for the behavior you desire.

No. My answer is: Not necessary. You’re proposing a solution for a problem that is not there.

The problem is there, but it seems DT isn’t the solution.

Then I didn’t understand the answer: does Spotlight retain indexed data of ‘closed’ DT databases? And if so, can or should I copy references to files in DT from within Spotlight? As far as I know I cannot use the Spotlight index within DT, or can I?

does Spotlight retain indexed data of ‘closed’ DT databases?

Yes.

And if so, can or should I copy references to files in DT from within Spotlight?

No. The Spotlight results are not the files themselves. They are bits of metadata about the files.
You’re asking about two different things now.

As far as I know I cannot use the Spotlight index within DT, or can I?

No. Spotlight is a separate process. And why would you need to?

Yes, I do understand the difference between actual files and an index containing bits of data.

That’s exactly why I asked whether and how such an ‘overall’ search index can be maintained within DT itself, so I can make use of the capabilities inside the application (and DTTG).

It would also benefit a map or tag view, without the need to have all databases open and therefore on every local machine (laptop, iPad, Phone) or accessible NAS.

The request is noted but you’re asking about pretty specialized behavior.
Also, the implementation is unclear. DEVONthink does not maintain a global index. Each database has its own. Databases sync; DEVONthink doesn’t. So you aren’t asking about a simple additional feature here.

Yes, I can understand that it seems like a big request if this use case isn’t part of and/or supported by the current version of DT(TG).

Maybe the current feature and codebase for ‘index local folder’ would be a good starting point. I think it mostly does what I’m looking for, except that it doesn’t always respond properly to moved or deleted files. In that case manually rebuilding the index solves the problem.

But for now I have to think of another solution for my particular problem or use case. Maybe a separate indexing solution, because most of the time my file names are descriptive and contain the date.
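A rough sketch of what such a separate, file-name-based index could look like in Python. It assumes the date appears in the file name as YYYY-MM-DD, which is only a guess at the naming scheme, and the function names (`index_folder`, `find`) are made up for illustration. It walks a folder once and keeps just the name, the date and the path, so the resulting list can be searched later without the folder being mounted.

```python
import re
from pathlib import Path

# Assumption: dates are written as YYYY-MM-DD somewhere in the file name.
DATE_IN_NAME = re.compile(r"(\d{4}-\d{2}-\d{2})")


def index_folder(root):
    """Walk `root` and collect name, date (if any) and path for each file."""
    entries = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        match = DATE_IN_NAME.search(path.stem)
        entries.append({
            "name": path.stem,
            "date": match.group(1) if match else None,
            "path": str(path),
        })
    return entries


def find(entries, term):
    """Case-insensitive substring search over the collected file names."""
    term = term.lower()
    return [entry for entry in entries if term in entry["name"].lower()]
```

For example, `find(index_folder("/path/to/exported/files"), "invoice")` would list every indexed file with "invoice" in its name; the path shown here is a placeholder. Like a manual index rebuild, the folder has to be walked again after files are moved or deleted.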

This is unclear. Reference them in what way?

So use Spotlight to find things in closed databases, open them when actually needed, then close them again?

Aren’t there lots of times that you do not want to search everywhere? Maybe you have work vs home databases or a database just with immediate family info or whatever other scenarios.
