Excessive use of iCloud Storage

See also DEVONtechnologies | Understanding Database Sizes

I’m not sure I understand your point here. The size of the other database matches the size in Finder, so there is nothing to investigate. What I’m trying to figure out is what can possibly produce a 10 GB overhead in a database containing (at most) 14 GB of actual data.

What comparisons are you referring to here? I provided the screenshot you asked for, so I was anticipating that you would be able to glean some relevant information from it.

Thanks for that. On that page, this might be relevant:

On APFS volumes, a file copy uses no additional space until you change one of the two copies.

I ask because I do have 5518 duplicates in that database. I have not yet gotten around to figuring out how to fix that because the duplicates are mostly duplicates of the exact same file, so it is unclear to me how I can remove one of them.

Could it be that these duplicates are causing some of the extra size of the database?

I don’t see how they can account for the full 10 GB, though, because when I select all the duplicates, right-click and select “Get Info”, those items show as using 1.4 GB.
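To cross-check a duplicate figure like this outside DEVONthink, a minimal sketch along the following lines could group byte-identical files in a folder and estimate how much space the redundant copies take. This is not DEVONthink's own duplicate detection; the function names are hypothetical, and the result is a logical size, which can overstate real disk use on APFS where copies may share storage.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def duplicate_groups(root: Path) -> list[list[Path]]:
    """Group regular files under root that have byte-identical content."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            by_digest[file_digest(path)].append(path)
    # Only groups with more than one member are duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]


def redundant_bytes(groups: list[list[Path]]) -> int:
    """Logical bytes taken by all copies beyond the first in each group."""
    return sum(paths[0].stat().st_size * (len(paths) - 1) for paths in groups)
```

Calling `redundant_bytes(duplicate_groups(Path("some/folder")))` on a folder of your choice gives a rough upper bound on what removing the extra copies could free.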

Right click on it and move it to the trash from the context menu. Empty the trash. But you’re probably asking something else? I’m not getting it though: If you have two identical versions, you remove one and you are left with the other. As they are identical, it doesn’t matter which one you remove.


I’m confused as to what’s confusing you, and I suspect others might be as well.

In your first post, you were querying why DTTG was using 35GB of space.

In your recent screengrabs, you’ve shown a database that is 25GB.

You’ve said you’re syncing two databases, so that means your second database and the corresponding sync info for both databases takes the remaining 10GB.

I’m puzzled as to what isn’t adding up?


The size of all indexed & imported files in the Work database is 23.3 GB, not 14 GB. The Finder size doesn’t include indexed items and obviously there are a lot of indexed items in the database, e.g. the duplicates.
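As a rough way to see this difference on disk, the sketch below sums the logical size of a database package (a folder on disk) and of any externally indexed folders. The function names and the idea of passing the indexed folders in by hand are assumptions for illustration; DEVONthink's own figure under “Database properties” may be computed differently.

```python
import os
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum the logical sizes of all regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue
            total += path.stat().st_size
    return total


def total_content_bytes(package: Path, indexed_folders: list[Path]) -> int:
    """Package size plus the sizes of externally indexed folders (hypothetical helper)."""
    return dir_size_bytes(package) + sum(dir_size_bytes(f) for f in indexed_folders)
```

Comparing `dir_size_bytes(package)` with `total_content_bytes(package, indexed_folders)` shows how much of the content lives outside the package and is therefore missing from the Finder figure.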

At a very basic level, the confusion here is that it turned out to be not DTTG that used that space, but DT3. My question of how I can delete only the DTTG data from iCloud was never answered, but I figured out that it is simply not possible: you have to remove all the data and then re-upload the DT3 data.

The puzzle that remained was that the databases in question use 22 GB of disk space but 33 GB on iCloud.

I learned that the size shown in Finder was inaccurate because it does not include files that are not in the database package but are still synced to iCloud.

It turned out that the size of the Home database in Finder was more or less correct, so virtually all the unexplained extra space is used by the Work database, which is 13.68 GB on disk but 23.3 GB under “Database properties”.

So, the puzzle now is where those extra 10 GB come from, because I found it hard to understand how a 13 GB database could generate 10 GB of overhead. I was asked to provide a screenshot of the database properties and I assumed that the purpose of this was to find an explanation for what is using the extra 10 GB.
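Spelled out as arithmetic, using only the figures quoted in this thread (the variable names are just labels for illustration):

```python
# Figures quoted in this thread, in GB.
finder_size_gb = 13.68      # Work database size shown in Finder
properties_size_gb = 23.3   # size shown under "Database properties"
duplicates_gb = 1.4         # size reported by "Get Info" for the selected duplicates

# Gap between the two reported sizes.
unexplained_gb = properties_size_gb - finder_size_gb

# Part of the gap that the duplicates alone could not account for.
beyond_duplicates_gb = unexplained_gb - duplicates_gb

print(f"gap: {unexplained_gb:.2f} GB")
print(f"gap not covered by duplicates: {beyond_duplicates_gb:.2f} GB")
```

So the “10 GB” is about 9.6 GB, and even if every duplicate counted in full, roughly 8.2 GB would remain to be explained.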

I found the reply to the screenshot confusing because it did not contain any such information. Instead, it seemed to suggest that the 10 GB difference I was trying to understand either didn’t exist or wasn’t huge. I still don’t understand what @cgrunenberg was trying to say there. He also said that “neither the size nor the item count of databases and sync stores are actually comparable”. This is confusing to me because I have not said anything about “item count” and I am not sure what an “item count of databases” is. Is it the number of items in a database (probably) or the number of databases (probably not)?

Or am I misreading the sentence and I should read “item count of databases and sync stores”? But that doesn’t really help either, especially since I am interested in storage use, not items. He does also mention size not being comparable, but comparable to what? Hence my question: “What comparisons are you referring to here?”

Can you see where the confusion comes from?

Yes, of course, it adds up. I have already acknowledged that the mystery is no longer the amount of storage used in iCloud but rather why the database uses so much storage at all.

There are two dimensions to this puzzle:

  1. Why is there so much “overhead” (10 GB)?
  2. Why does the content of the database use so much space, given that the majority of the data consists of external files?

In order to keep things simple, I bracketed the second puzzle and focused on the first.

So, the size of the database as shown under “Database properties” includes the size of external files, i.e. files that are indexed but not imported into the database? Or what do you mean by “indexed files”? (My understanding is that the imported files are also indexed, technically speaking, but since you say “indexed & imported files”, I assume “indexed” refers to external files.)

Unless I am misunderstanding what you mean by “indexed files”, that means that files that I deliberately did not import into the database still get synced into iCloud, is that correct? I find that very counterintuitive. Isn’t the very point of indexing external files to not include them in the database and to not include them in sync operations?

Did you read the documentation section on “In & out” and the gazillion posts here dealing with the difference between importing and indexing (in the sense DT uses “indexing”, not the general database term)? That might shed some light.

Also, indexing files in a database on one machine and then not making them available on a synced machine invites a lot of support requests. I agree that it might look “counterintuitive” to upload indexed files to another location (and I’m not sure at all that really happens), but I know that people get uncomfortable if they can’t access all their data on synced databases.

Personally, I never bother with database size. But then I don’t use iCloud to sync either.

That’s right.

The only real difference is where the items are stored (in the database package or externally) but they’re handled exactly the same way.

However, you could disable synchronizing of indexed contents (see the options of the sync location), but in that case you won’t be able to view/edit the documents with DEVONthink To Go (and on other Macs only if the indexed files are already transferred, e.g. via cloud folders).

In the end the easiest option not to synchronize data is to add it to a database that doesn’t get synchronized.


Of course. That’s how I know about the difference between external and imported files.

What I had not read, however, is the section on “Indexing and sync”, which confirms your suspicion that indexed files are indeed uploaded to the sync storage. This can be turned off in the sync settings for each sync location by unticking “Synchronize content of indexed items”.

It did not occur to me that people who use indexed files would be surprised when those files are not accessible on other devices, but I now understand that this is the reason why those files are included in syncs by default. (IMHO this default increases the confusion between indexed and imported files, but I suppose the developers know best what generates fewer support requests.)

So, I have now disabled syncing of the content of indexed items, and I assume that will significantly reduce the size of the database in iCloud. (I can’t see any effect at the moment, but will wait for things to take effect.)

Once that is sorted out, I’ll take a look at the duplicate files and try to remove those.

Well, in my mind, identical means identical, i.e. there is only one. In that view, if I remove one, “the other” will also be removed (because they are identical). As you know from the documentation, deleting items in a DT database is a highly complex matter, so you really need to make sure you know what you’re doing and which scenario applies in your specific situation. This is why I have left the duplicates alone so far. As I mentioned, they only consume 1.4 GB.

Good to know.

But in that scenario I will not be able to search for these files in the database. One of the main reasons I am using DT is to have a place where I can search across all my files. If I get a hit in an indexed file that I can’t immediately access on my current device, at least I know it’s there, and usually there is a different way of accessing that file.

That would be a replicate. Of which you could also remove all instances but one. As you’re reading the documentation anyway, I’m not going to repeat what it says on duplicates and replicates here.


I have now done that and waited half a day or so, but DT still uses the exact same amount of iCloud storage.

What am I missing?

This does not immediately affect the sync store and the data already uploaded by this or other devices. You would have to clean the sync store first, ensure that all Macs use the same settings and then upload your database(s) again.
