DT vs Dropbox selective sync

First of all, thanks for building and maintaining such an amazing tool!

As a scientist, I find DT extremely helpful in organising the various documents I deal with. However, for literature I still use Bookends, which is better at metadata retrieval and referencing. PDFs that are added to Bookends are then placed in a Dropbox folder. After some 20 years, this folder is now nearly 9 GB in size. I therefore use Dropbox “selective sync” to only keep newer files locally.

I’ve previously added the entire “PDFs” folder into DT as an external reference (“Index Files and Folders…”). Now that my PDFs are indexed, they are searchable, which is awesome. My problem is that I now want to free up some space by making a lot of older papers “online only”. However, it seems that DT immediately tries to re-index the files as soon as they’re replaced by a 0-byte placeholder, which triggers a re-download of the files!

Is there any way to stop re-indexing of the files while still having new files added to the “PDFs” folder indexed and made searchable?

Welcome @BenjaminSB
Indexed files should be available locally, not stored only in the cloud. Have you done anything beyond searching for the indexed files in DEVONthink?

The beauty of Dropbox selective sync is that the files are sort of stored locally, but only retrieved on demand. This used to work quite nicely: after being indexed, the files eventually get replaced by a 0-byte placeholder, but if I try to open them they are transparently re-downloaded. However, probably due to the switch away from AFP, it now seems that DT is notified that the files have changed as soon as they’re “offlined”, and it then tries to re-index them, which immediately re-downloads them. So, at the moment, it seems impossible to use selective sync with a folder that is indexed in DT :confounded_face:

Would it be asking too much for the indexer to selectively ignore “file now has size 0” changes in indexed folders that are in Dropbox? That would solve the issue, I think: accessing the file in any way other than indexing would immediately re-download it, so it would continue to behave as if it were locally available.
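To make the request concrete, here is a rough Python sketch of the kind of filter I have in mind. It is purely illustrative and says nothing about how DEVONthink's indexer actually works; `FileChange` and `should_reindex` are names I made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileChange:
    """A hypothetical change notification for one indexed file."""
    path: str
    old_size: int  # size in bytes when the file was last indexed
    new_size: int  # size in bytes reported by the change notification


def should_reindex(change: FileChange) -> bool:
    """Skip re-indexing when a previously non-empty file shrinks to 0 bytes.

    Assumption: such a change means the sync client replaced the file with
    an online-only placeholder, so the existing index entry is still valid.
    """
    became_placeholder = change.old_size > 0 and change.new_size == 0
    return not became_placeholder
```

With this rule, a new or genuinely modified PDF would still be indexed, while a file that merely went “online only” would keep its existing index entry.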

The beauty of Dropbox selective sync is that the files are sort of stored locally, but only retrieved on demand.

Except when Dropbox’s servers aren’t responding or your network is poor or down.

Have you also considered not indexing that entire Dropbox folder?

Lastly, what version of DEVONthink are you running?

The point of indexing the folder is to be able to search through the documents without also keeping the 8G of data around when they’re not needed (i.e. most of the time). Sure, there’s a possibility that I might be offline, but in that case I just get a timeout or temporary error; that’s a small price to pay.

I’m on DEVONthink 3.9.16 Pro.

There’s nothing to be done unless you move up to version 4 (which we’d recommend for many reasons).

This is a search for an indexed file on OneDrive in version 4:

Oh, somehow the move to 4.x passed me by. Is there some fundamental difference in how DT4 handles this scenario?

I should also note that I’m still on macOS 14.8.2. I see that DT4 is compatible, but I wonder if that affects this sync/indexing issue in some way.

Yes. Version 4 has specific controls to not download such files. It requires you to download them on demand.

And no, there is no issue running version 4 on Sonoma.

Oh, somehow the move to 4.x passed me by.

From June of this year:

Right, I just upgraded to DT4 but I still get the same result: as soon as I select some files in Finder and make them “online only”, they are immediately re-downloaded as DT4 tries to re-index them (thus accessing them, which triggers the re-download).

As I mentioned…

Version 4 has specific controls to not download such files.

Disable Files > Preview > General: Automatically download cloud files in the settings.

I disabled the “Automatically download cloud files” tick-box and closed DT4, then made the whole “PDFs” folder “online only”. When I re-start DT4, it immediately starts to re-download all files as DT4 re-indexes them. Not sure what I’m doing wrong…

If the documents weren’t indexed, it would make sense as DEVONthink logically can’t index online only files. Let it finish, then make them online only. Quit and relaunch DEVONthink and see if the issue recurs.

Sorry, I think I was being unclear: the files were fully indexed. I then quit DT4, made them online only, and re-opened DT4. They get immediately re-downloaded. I just tried it again, same result.

Just for reference, I added the PDFs folder via “Index Files and Folders…” and selected it. I’m using Dropbox for Business, so the full path to the folder is something like $HOME/My Company Dropbox/My Name/Apps/Bookends/Attachements

I’m not running Dropbox here. Are you using the Dropbox desktop app or the Finder?

Not sure what you mean by “the Dropbox desktop app or the Finder”? I use the Dropbox app, which runs a daemon in the background to sync files in the “Dropbox” folder on the local machine to the cloud. Is this not what you were referring to?

UPDATE:

I think I might have figured out the issue. My Dropbox folder hasn’t been upgraded to the File Provider API. With Business accounts, it seems that an admin needs to allow this to happen, but for some reason our IT person forgot to do so. So, the problem here is that Dropbox is still using some old API to watch file changes and insert itself into the directory tree, which is probably why DT4 can’t detect that it’s a cloud file.
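For anyone wanting to check which mode their own Dropbox folder is in: as far as I know, macOS keeps File Provider-backed cloud folders under ~/Library/CloudStorage, whereas the legacy Dropbox integration keeps the folder elsewhere (mine sits directly in the home folder). Here is a minimal Python sketch under that assumption; the function name is hypothetical and the check is only a heuristic.

```python
from pathlib import Path


def is_under_cloud_storage(folder: Path, home: Path) -> bool:
    """Return True if `folder` resolves to a location inside
    <home>/Library/CloudStorage.

    Assumption: File Provider-backed folders live under that directory,
    so a folder outside it is probably still on the legacy integration.
    """
    cloud_root = (home / "Library" / "CloudStorage").resolve()
    # resolve() follows symlinks, so a link in the home folder that points
    # into CloudStorage is also recognised.
    resolved = folder.expanduser().resolve()
    return resolved == cloud_root or cloud_root in resolved.parents


# Example call (the folder name is a placeholder, adjust it to your setup):
# is_under_cloud_storage(Path.home() / "My Company Dropbox", Path.home())
```

If this returns False for the indexed folder, that would be consistent with the folder not having been migrated to File Provider yet.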

Thanks for the update. If IT gets this sorted out, let us know if that resolved the issue in DEVONthink as well.

I set up a second computer (M4 Mini) with the exact same Dropbox account. The new computer wouldn’t switch to File Provider, which meant some indexed folders I was using to funnel new files into specific databases would confuse DEVONthink.
It wouldn’t work properly with one computer on FP and the other on the old way, which makes sense as the file path is totally different. I had to switch the new computer to the Dropbox Beta to force it to eventually switch over. Or at least I did the switch to the Beta and then it changed.