DEVONthink interface locks up when indexing/scanning

First of all: I’m quite happy with DEVONthink. Most of the time, and when I’m not happy I end up on this forum. So as not to come across as grumpy: most of the time I’m happy :-).
Second, I thought I had posted this before, but it seems it ended up as a message instead of a topic?

And now the challenge: I created a database with indexed files. It is an archive, and the idea is to let DT3 do its indexing and searching, and when I need something I copy it to a different database. (All my other databases are not indexed; the files are inside the database.) There are 427,149 files in that particular database, which is why it lives on a NAS (a Synology with 8 TB 7200 rpm disks, a gigabit cabled network, and a MacBook Pro with hyperthreading, 8 cores and 16 GB of memory, so the hardware is not slow). DT3 has indexed the files.

The problem is that if I open DT3 it AUTOMATICALLY starts indexing/scanning the database on the NAS again, which takes forever. It also shuts out the DT3 interface. Why should it do that if nothing has changed? If my computer wakes up it starts indexing, or tries to, because the NAS is obviously also in sleep mode. DT3 does not recover from that: the wheel keeps turning and turning and you can’t do a thing in DT3. So force quitting DT3 has become a habit.

Could you please make indexing/scanning for databases on a network connection configurable, per database? (I know which one is on a NAS and which one is not; the software does not.) I need the automatic scanning switched off, to be performed only on user request. That way I can do my work, and when I’m finished and go for a lunch break, fine, I start the reindexing/scanning. Before DT I worked with another product that had that option, and it is sorely missed.
(And do you really use only one core, as Activity Monitor suggests? Why not a core per database? That way I could keep working while some index is running. The indexing/scanning is not the problem; the fact that the DT interface is dead is the problem.)

So this is more of a feature request I guess…

Putting databases on a NAS is generally not recommended since the performance is often less than stellar.

Also, your database has surpassed the comfortable limits of a database, which may be compounding the problem.
A connected external drive would likely produce better performance, though you should consider your database construction as well.

And of course, Development may wish to comment on this as well.

Thanks. Performance does not seem to be the issue here. Internal disks are too small on Macs (or the big ones much too expensive; the same goes for TB3 disks). I have a second database that also lives on the NAS (this one does not have the problem; it’s 45 GB for the indexed files, which is indeed bigger than other databases of mine that keep their contents inside the database, though I have one of 77 GB with its content on the Mac’s internal disk). No problem with that database on the NAS.
I do have a USB 3.1 Gen 2 RAID disk on my Mac (meaning 10 Gbit), but alas it is also too small. I use it as a sync store (stellar performance, as you put it). I could always put in the disks from the NAS, but that is going to be a lot of work copying and moving.
What I can do is chop up the database and still leave it on the NAS. The idea is to scavenge useful stuff and use DT3 for the cherry-picking, rather than structure everything. I have found that stuff I’ve written in the past can still be useful a couple of years later as a starting point for something new.
But if the interface kept working for the other databases and was only slow for the one that is indexing, I would be happy. (If I had started using DT earlier there probably wouldn’t be a problem :slight_smile: there would be structure and a means to find things.)
Thanks for the honest reply. I do think the software performs better than you seem to think yourself. :+1:

Second reaction, with regard to database design: I have two tiers of databases, databases for daily project use and databases for use as an archive (e.g. administration). The first tier is on the internal Mac disk; the second tier need not be.
And then there is tier 0, with Scrivener, of course.

Reminds me of the Aperture days, with photographers having terabyte databases and complaining about speed…

If doable, break things up into smaller, manageable parts: long-term vs. medium-term vs. short-term databases, and so on.

Well, my “normal” databases are, uhhh, well, they are actually designed. :slight_smile: This one is just a big bunch of files grown over the years, something you let a computer sort out, not me in my own time. I’m not complaining about performance; I’m making a point about how DT reacts. Even if I have something eating up all the resources on my Mac (which is hard to make happen; something like compiling ffmpeg from source will do the trick), the interface still keeps responding. For my other databases DT works just fine. That indexing takes time does not bother me: let it run in the background. But why should it affect the interface response for the other databases? That I do not understand.

And if DT3 had a mode where you can edit the metadata of a file while the file is not actually there (like Lightroom can do), I could make selections and extract smaller databases with the real files later on. Maybe it has that mode and I just have not found it, or found a way to simulate it. I bought DT3 after I first made a kind of prototype application to do what I wanted in FileMaker. Then I found out that I liked the tinkering too much, and that FileMaker is not like programming the way I’m used to it, with an ASCII editor (I started with ked). (It’s a bit cumbersome, or should I say different :slight_smile:, with a steep learning curve :-).) Anyway, too big a time investment while I should be sorting out the data. I finally found the proper words to search on the internet and found DEVONthink, which can do a lot I hadn’t thought of but which comes in handy. So I do like the product and have no reason to complain. I’m aware I’m scraping the boundaries of the design parameters.

According to your first post the database contains only indexed items. In that case the actual database (*.dtBase2) usually shouldn’t be very large.

You’re right. I looked inside the package and found two directories named Backup, with timestamps corresponding to times I did a sync to a sync store, e.g. Backup 2020-02-24 09-55-31. So the dtBase2 file is three times the size it should be. How did that happen?
I did a verify & repair on all the databases after the last force quit of DT; apparently that did not solve it. I’m now doing a rebuild of the database with the funny directories to see what happens.

The Backup folders are automatically created (and also by File > Optimize Database…). How large is the dtBase2 package actually according to the Finder?
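
Under the hood the package is just a folder, so if Finder’s numbers are hard to interpret, the split between the database proper and the internal Backup folders can also be checked with a few lines of Python. This is only a sketch; the /Volumes/NAS/Archive.dtBase2 path is a placeholder for your own package, and the sizes are decimal gigabytes as Finder reports them.

```python
from pathlib import Path

# Placeholder path; point this at your own .dtBase2 package.
DB = Path("/Volumes/NAS/Archive.dtBase2")

def folder_size(path: Path) -> int:
    """Sum the sizes of all regular files below `path`, in bytes."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())

total = folder_size(DB)
print(f"whole package: {total / 1e9:.2f} GB")

# The internal backups are plain folders named "Backup <timestamp>".
for backup in sorted(DB.glob("Backup*")):
    size = folder_size(backup)
    print(f"{backup.name}: {size / 1e9:.2f} GB ({size / total:.0%} of the package)")
```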

62.57 GB; backup one is 23.92 GB, backup two is 19.31 GB. So that leaves 19.34 GB for the actual database, which is about 7.4% of the 262.09 GB that is indexed.

And how much disk space is still available on the internal drive?

Indexed folders located on network volumes are automatically scanned after opening the database or after mounting the volume (as these volumes don’t support filesystem events). Without any changes this should be relatively fast.
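
For anyone wondering why that rescan is needed at all: on a local volume the operating system pushes change notifications to interested applications, but changes made on a network share, especially by another machine, generally never produce such notifications, so the only way to notice them is to walk the folder again. A rough illustration using the third-party watchdog package (the /Volumes/NAS/Archive path is only an example); run it against a local folder and then against a NAS share to see the difference.

```python
import time
from watchdog.observers import Observer          # third-party: pip install watchdog
from watchdog.events import FileSystemEventHandler

# Example path; try a local folder first, then a folder on a network mount.
WATCHED = "/Volumes/NAS/Archive"

class LogChanges(FileSystemEventHandler):
    def on_any_event(self, event):
        # Print every create/modify/move/delete the OS reports to us.
        print(event.event_type, event.src_path)

observer = Observer()                             # uses FSEvents on macOS when available
observer.schedule(LogChanges(), WATCHED, recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
```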

Therefore, could you please launch Apple’s Activity Monitor application (see Applications > Utilities), choose DEVONthink 3 in the list of processes, select the menu item View > Sample Process while (!) DEVONthink 3 is still scanning/unresponsive, and send the result to cgrunenberg - at - devon-technologies.com? Thanks in advance!
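
If the menus are hard to reach while DT3 is unresponsive, the same kind of report can, as far as I know, also be produced with Apple’s command-line sample tool. A small sketch that assumes the process is named “DEVONthink 3” and that a 10-second sample is enough:

```python
import subprocess
from pathlib import Path

# Sample the running DEVONthink 3 process for 10 seconds and write the report
# to the Desktop; roughly what Activity Monitor's View > Sample Process does.
out = Path.home() / "Desktop" / "DEVONthink3-sample.txt"
subprocess.run(
    ["/usr/bin/sample", "DEVONthink 3", "10", "-file", str(out)],
    check=True,
)
print(f"Sample written to {out}")
```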

Will do, but right now the rebuild is running. I have done as you asked, but DT is consuming just 25% of one core (unresponsive, though). After the rebuild is finished I will close and reopen DT3, check Activity Monitor again and do as you asked. The file I have now is for the rebuild; is that also of interest to you?

No, only a sample while scanning is necessary.

Well, DT3 is still churning away at the rebuild; CPU time is 7 hours by now and the NAS disks are getting a good spin :-). The Mac functions as an extra heater for the room; it’s wintertime, so that comes in handy. The DT3 interface is totally unusable, so the results of a scan will have to wait.
The adage is: search and you will find. DisableAutomaticUpdatingOfIndexedItems. If I understand English correctly, I guess this is what I was looking for. It’s configurable behavior. Yo! And now, would there be a command to launch a process that does the indexing of a database in a separately launched shell with its own resources, so that I can keep on working with the other databases? Is there a CLI? I found nothing. But a Unix program without a CLI?
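
For anyone finding this thread later: hidden preferences like this one are normally toggled with macOS’s defaults command while the application is not running. A minimal sketch; the bundle identifier com.devon-technologies.think3 is my assumption, so verify it (for example with osascript -e 'id of app "DEVONthink 3"') before relying on it.

```python
import subprocess

# Assumed bundle identifier for DEVONthink 3; verify before use.
BUNDLE_ID = "com.devon-technologies.think3"

# Switch off the automatic rescanning of indexed items (the hidden preference
# mentioned above). Quit DEVONthink 3 first so the setting takes effect on the
# next launch.
subprocess.run(
    ["defaults", "write", BUNDLE_ID,
     "DisableAutomaticUpdatingOfIndexedItems", "-bool", "TRUE"],
    check=True,
)

# To restore the default behavior later:
# subprocess.run(["defaults", "delete", BUNDLE_ID,
#                 "DisableAutomaticUpdatingOfIndexedItems"], check=True)
```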

As for database design: in this instance I’m interested in the concordance function of DT3. Let the computer do the repetitive work; all I have to do is browse through the end result.

A sample of the process is the best way to see where the app is spinning, whether there’s a deadlock or some otherwise expensive iteration happening, a thread starving for I/O, and so on.