Support for Cloud storage

I think I know where this is going before I even start, but I feel the need to ask all the same.

Some of my databases have become so large that I can’t keep them all on my internal MacBook storage. Carrying around an external drive all the time is not practical; it defeats the value of a laptop. Apple has recently made significant enhancements to how cloud storage is managed in macOS.

It would be great if DEVONthink could keep indexes, metadata and X% (user configurable) of the most recently accessed documents (or those returned in search results) on local disk, while allowing the rest to be handled by cloud providers. If a user requested a document that wasn’t available locally, the file would be retrieved from cloud storage and presented to the user. If the user is offline, they would obviously get an appropriate message.
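To make the idea concrete, here is a minimal sketch of the policy described above, written in Python purely for illustration. It is not how DEVONthink works today. The class name `LocalDocumentCache`, the byte budget, and the `fetch` callback (standing in for a cloud download that returns `None` when offline) are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable, List, Optional


class OfflineError(Exception):
    """Raised when a document is not local and cannot be fetched."""


class LocalDocumentCache:
    """Keeps the most recently accessed documents within a byte budget."""

    def __init__(self, budget_bytes: int,
                 fetch: Callable[[str], Optional[bytes]]) -> None:
        self.budget_bytes = budget_bytes
        # Hypothetical cloud download; returns None when unreachable.
        self.fetch = fetch
        self._docs: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0

    def get(self, doc_id: str) -> bytes:
        if doc_id in self._docs:
            # Local hit: mark the document as most recently used.
            self._docs.move_to_end(doc_id)
            return self._docs[doc_id]
        data = self.fetch(doc_id)
        if data is None:
            raise OfflineError(
                f"{doc_id} is not stored locally and the cloud is unreachable")
        self._store(doc_id, data)
        return data

    def _store(self, doc_id: str, data: bytes) -> None:
        self._docs[doc_id] = data
        self._used += len(data)
        # Evict least recently used documents until we fit the budget,
        # but always keep the document that was just requested.
        while self._used > self.budget_bytes and len(self._docs) > 1:
            _, evicted = self._docs.popitem(last=False)
            self._used -= len(evicted)

    def local_ids(self) -> List[str]:
        """Document ids currently held locally, oldest access first."""
        return list(self._docs)
```

The indexes and metadata would stay local regardless; only document bodies are subject to eviction in this sketch.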

I appreciate cloud storage is often a mess, with cloud providers being inconsistent in their implementation, and comms errors can make DEVONthink appear unreliable. But with Apple’s recent enhancements and the general trend towards cloud storage, having this as an option would be a big plus for many DEVONthink users.

Most cloud storage, including Apple’s (which many people find unreliable, by the way) is a sync service. As you should use “offline” files with these services to enable DEVONthink to work, all the files reside on both the local disk and on the cloud server. Doesn’t achieve very much. I would not trust using “online” or as Apple calls it “optimize disk space” to work.

Perhaps just get a laptop with a bigger disk?

Or, maybe I’m missing a trick here.

I’m unaware of enhancements, really. But as reported here for DEVONthink and for many other applications, Apple’s sync services are often unreliable for some.

It seems what you’re really looking for is a network file system like CIFS or NFS. Which is not directly related to “Cloud”.

Welcome @MrSkooby!

Carrying around an external drive all the time is not practical, it defeats the value of a laptop.

I would politely disagree with this. I have a 1TB Western Digital SSD that could fit in the pocket of your jeans! External drives are small, high-capacity, and generally affordable nowadays.

Here’s a Crucial 1TB SSD as an example…

And the WD 1TB drive I have…

Apple has recently made significant enhancements to how cloud storage is managed in macOS.

We’ve heard that tune many times before and the fact is their enhancements are meant for simple, small, single file transfers. The mechanisms in cloud services are not made for syncing package files, especially ones as large as DEVONthink databases often become. It is not and never has been data-safe to put your databases in any cloud-synced location. To do so is to invite damage to them. In fact, DEVONthink won’t allow you to open or sync databases that are detected in cloud-synced locations.

The cloud is no panacea. Apple and other services have long touted “your data everywhere”. But the practical truth is it’s marketing hype, not fact. Don’t fall for the Myth of Persistent Connection. My parents live in a fairly metro area, yet I can drive five minutes away and lose cell signal. And working remotely, I can assure you networks aren’t always available or robust, especially not to access a large volume of data online. Sometimes I can barely stream funny cat videos on YouTube :wink:

Yes, before getting a new laptop, a small external drive is doable–at least I think so. Here is the one I carry (for backups, not DEVONthink though).

Nice! I’ve been looking at those, but Best Buy is already getting rich off me and I have too many external drives! :stuck_out_tongue:

I guess it’s horses for courses. I might use 4 or 5 different desks in a day and 2 or 3 meeting rooms. USB external hard drives are a non-starter for me on a practical basis, let alone the fact that my employer prevents us from plugging in any USB or Thunderbolt devices (although I accept that is probably a limitation most people don’t face).

At the moment I’m using a 200GB battery-powered SanDisk with WiFi in my laptop bag. It works reasonably well, so long as I remember to keep it charged. The only problem I’m having now is my employer is cracking down on unauthorised devices on the network.

Buying a new MacBook just to have a larger drive? I think my expenditure on MacBooks would exceed my salary given how fast my databases grow. I’d be better off retraining as a plumber.

From my perspective, the size of the database on the cloud service shouldn’t matter, so long as DEVONthink knows that data written to a specific location is cloud storage, DTP indexes each file before it is allowed to go to the cloud, and DTP can manage the movement of files between cloud and local storage. You should only ever need to retrieve a specific file from the cloud when your search results suggest it contains the info you’ve searched for. E.g., with a 1TB DB stored 80/20 cloud/local, retrieving a 500KB MS Word file from the cloud shouldn’t be a huge challenge or bottleneck, no more than retrieving the same file directly from iCloud/Box/OneDrive/Dropbox etc. To be honest this isn’t hugely different to what DTP is doing at the moment with CloudKit for syncing.

For me I’m typically connected to either home/office/customer network most of the day so performance isn’t a major issue/concern.

I’m not suggesting that cloud storage is a match for all use cases and users would obviously get a chance to decide if they wanted to use it at all as well as how much/which type of documents etc are allowed to migrate to cloud.

But certainly for me DB sizes are becoming a major issue.

Well, harness up your horses and try. Keep good backups. I contend you are not picking a good solution compared to alternatives, but completely up to you.

I’m not suggesting this is what I’m trying to do. This is a feature request for DEVONthink to consider. But thanks for your contribution.

“I think I know where this is going before I even start, but I feel the need to ask all the same.”

and I want to retrieve a 500KB MS Word file from cloud shouldn’t be a huge challenge or bottleneck. No more than retrieving the same file directly from iCloud/Box/OneDrive/Dropbox etc.

I get what you’re saying but you are describing how you imagine things work, not how they actually do with DEVONthink databases and the underlying mechanisms.

To be honest this isn’t hugely different to what DTP is doing at the moment with CloudKit for syncing.

It sure is. Syncing doesn’t store any of your files in any sync location. That’s not how sync works.

At the moment I’m using a 200GB battery-powered SanDisk with WiFi in my laptop bag. It works reasonably well, so long as I remember to keep it charged.

The drives I linked are bus-powered.

The only problem I’m having now is my employer is cracking down on unauthorised devices on the network.

As a former IT person, I understand enforcing security policies. However, if your use of DEVONthink is approved on the machine, I would ask for a security exception for an external drive. If you’re doing company work on company equipment, the case should be fairly easily made. Also, the company would potentially foot the bill.

It seems folks are missing the fundamental point of the question. Yes, as currently used, many cloud “drives” are sorta kinda sync drives. But there are cloud storage solutions such as AWS and Azure, for example. If one wishes to store offline [cloud or small SSD] and understands the potential lack of high-speed access, if that is even a factor, why not use cloud storage? It is potentially unlimited.
I happen to have at least 2TB internal SSDs on all my computers (yes, I have more than one) and even 512GB on my iPads, so storage is not an issue. As high-speed connections and 5G become more reliable and widespread, at least in N. Am. and the EU, cloud storage should be an option. Carrying around hard drives is somewhat barbaric.
Don’t get stuck like in the days where Sneaker–net was the way to go. : )

Agree. OP was talking about Apple cloud (sync) services which are for many unreliable.

To be honest I’d like to see DTP embrace Apple’s cloud service APIs, which, if the cloud service providers implement their side correctly, means DTP doesn’t need to know which cloud service is in use. Regarding CloudKit, it fully supports storing files for retrieval. It would be just a question of how DTP wanted to manage that, similar to how it can repair databases from CloudKit.

I’m well aware that DTP doesn’t support cloud at the moment. I’m well aware there would be many difficult hurdles to overcome so that it could. But if you go back a few years we didn’t have production electric cars or even cell phones. People said “In the future wouldn’t it be great if ….”. And some people said that’s never going to work. But some people took up the challenge, and as a result this incredible tech is part of our everyday life.

As feedback and a feature request, I’m stating that more cloud support in DTP would be appreciated. Yes, you can always find workarounds to issues. Similarly, I could ride a horse to work. And yes, unlike many teenagers, I could survive without a cell phone.

But life is largely better with them.

I might find a hole in this argument. When your employer demands you be able to use so many desktop machines across so many meeting rooms, and when your employer sanctions, if not requires, your use of DEVONthink, it seems to me that the onus should be on your employer to fix the issues that are limiting your effectiveness with the software. Which to my mind means …

… having your employer provide you with authorized USB or Thunderbolt external drives. In addition …

… probably should instead say “Requiring my employer to provide me with a machine that has a larger internal drive”.

Having said all this, perhaps you should consider ways to slim down your databases. One approach that I would consider is to make smaller components that fit together. Another is to extract a “snap-shot for this meeting” database. The relevant or most relevant database would reside directly on my internal HD or SSD. The other components or larger database would reside on my (company sanctioned) USB/Thunderbolt external drive.


JJW

I will also politely disagree with this, especially about the “more reliable” part. Even on 5G, reliability is an issue and will continue to be for a long time to come. Not everyone lives in large metropolitan areas, and even those have issues as well.

PS: This isn’t a Luddite speaking, either. However, I have far more practical experience than many people when it comes to network reliability, availability, etc. I could cite many examples of poor networking in major metro areas all over the US.

Don’t get stuck like in the days where Sneaker–net was the way to go. : )

When sneakernet proves to be more reliable, it’s definitely the option to choose. Just because something is an older approach, it’s not invalid. And newer ideas are many times inferior - though they of course can improve.

Undoubtedly. But these are, as you rightly say, “storage” offerings. They do not include file system access to the data. To be able to work with it as you do now with your local data in DT, you’d have to download the file. Which goes against the whole idea of saving local disk space.

I’m repeating myself here: Cloud storage is only useful for DT and similar programs if the data are accessible through a file system layer like CIFS or NFS. To put it bluntly: If you can mount your AWS data in Finder, you save on local space by using cloud storage. If you can’t mount it in Finder, apps have to download the stuff before they can work with it.
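A rough illustration of why the file system layer matters, as a Python sketch rather than anything DEVONthink actually does: once a share is mounted, an app can open a path and read only the bytes it needs, and the file system client takes care of the transfer. The helper name `read_slice` and the example path are hypothetical.

```python
def read_slice(path: str, offset: int, length: int) -> bytes:
    """Read `length` bytes starting at `offset` from a file at `path`.

    On a mounted volume (SMB, NFS, ...) this uses the same calls as for
    a local file; no separate full download step is needed by the app.
    """
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)


# Hypothetical usage against a mounted share:
# header = read_slice("/Volumes/CloudShare/report.pdf", 0, 1024)
```

With plain object storage and no mount, there is no path to open, so the app would first have to download the object to local disk.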

AWS does offer CIFS in their cloud. You can check out their prices here

Personally, I find 322 USD per month to have 1TB of data available in the cloud a bit steep. YMMV, of course.

Part of the issue is you’re talking about fundamental changes in DEVONthink, not incidental / trivial ones. So it’s not a matter of “hurdles”; it’s more like mountains.

PS: I hope my comments don’t come off as dismissive or brusque. I’m just talking about the realities of development, the technologies you’re describing, and the very broad environments our userbase deals with.
You also have to consider the increase in support that would come from people trying to leverage what you’re proposing but in less than ideal situations.

This engagement from the team is one of the things I really appreciate about DEVONtech.

Hi everyone, this is my first post here. I stumbled across this thread and hope I have understood it and can maybe offer a possible solution.
In my scenario, I have many larger databases, and I have saved the actual documents (many, many PDFs) on a NAS. Only the DTP databases are stored locally on the Mac. I’ve tried the following. I copy a few files into a directory of my cloud provider (Strato), preferably via WebDAV or SMB. Then I create a new database in DTP and only index the files that are on the cloud storage via the Finder drive. The search for something is then always done in the local database. I can open a document as long as there is a connection to the cloud. If it fails, DTP indicates that the path is not available.
So I could imagine that it would be possible for you to manually move, within DTP, all documents that are probably stored in your DTP database itself to the “cloud” database. The document (file) is then moved to the cloud storage folder, but the indices remain on the Mac within the DTP database.
I tested this on mine and it works. As long as I have the online connection via SMB in the Finder, I can access everything. When I’m offline, DTP shows Volume not mounted.
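For anyone who wants to check which indexed files are currently reachable before opening them, here is a small Python sketch. It only tests whether each path exists right now and is not tied to DEVONthink in any way. The function name `split_by_availability` and the example paths are hypothetical.

```python
import os
from typing import Iterable, List, Tuple


def split_by_availability(
        indexed_paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split paths into (available, missing) based on current reachability.

    A file on an unmounted network share shows up as missing, which
    mirrors the "Volume not mounted" situation when offline.
    """
    available: List[str] = []
    missing: List[str] = []
    for path in indexed_paths:
        if os.path.exists(path):
            available.append(path)
        else:
            missing.append(path)
    return available, missing


# Hypothetical usage with files indexed from a mounted SMB share:
# ok, gone = split_by_availability(["/Volumes/Strato/invoices/2023.pdf"])
```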
So far, however, this has one disadvantage. DT to Go cannot access these documents as it needs a sync store.
I hope that my suggestion can help a little.

greeting
The medium

You could use an automount program to ensure that. “Automounter” is available in the App Store. It has quite a basic UI but interesting options that allow you to mount differently depending on (for example) network setup.
