Will the new M2 MacBook Air deliver the performance I need for my DT3 database?

cubicray · June 19, 2022, 11:40am

After reading all the comments and doing some more research about the M1 and M2, I think going with the M1 with more RAM is the smarter choice.

I decided to sell my Pro Display XDR so I can afford the most RAM. If anyone is interested in a brand new Pro Display XDR, feel free to contact me for more information. It’s the nano-texture version and comes with the VESA mount adapter. Everything is still brand new and in its original packaging. I haven’t had a chance to use it yet. The one-year warranty has expired, though you can always get a third-party warranty. It’s located in Spain near Granada. With the VESA mount Apple sells it for €6718, but of course I’m prepared to go much lower than that.

Thanks again for your input, everyone!

rmschne · June 19, 2022, 11:49am

I think you may find that the DEVONthink database depends on the file system (each document resides as a file in the file system). I think your issue may be having so many Groups, which is unusual and not using Groups as intended in DEVONthink (as others have mentioned). Maybe not an issue, dunno. Just seems very unusual. There are so many ways to do automation (Hazel is outstanding as a first step) with files in the file system. But without knowing and thinking more about your situation I can’t say. But as it is, it just doesn’t feel right. Just my two-bits.

cubicray · June 19, 2022, 11:59am

I’m very interested in improving my current system and your thoughts give me something to think about. At some point in the future I might share a video of my setup and workflow here in the forum to learn more about how things could be improved. Like you said, without knowing more about my current setup it’s difficult to think about it.

JanSalomonsson · June 19, 2022, 7:54pm

This must come down to either a very bloaty database structure or legacy OS issues. I’m running much larger databases (in terms of bits and bytes) concurrently on an 8GB M1 Air. Zero issues.

3865 · June 20, 2022, 6:14am

Just in case anyone is interested, Howard Oakley has written an extremely informative series of articles regarding M1 (and now, M2) Macs, including discussions of M1 memory management with comparison to Intel Macs’ management, etc. The link below connects to an index of articles he’s maintaining. Might help in terms of planning for future systems and purchases.

kewms · June 21, 2022, 5:09pm

One important difference between web pages and PDFs is the very extensive use of scripts in modern web pages. Web pages are in general much less efficient than PDFs for a given amount of user-visible content.

chrillek · June 21, 2022, 5:15pm

For what measure of efficiency?

kewms · June 21, 2022, 5:22pm

The one I specifically had in mind was computational load. The rendering device has to parse any scripts even if it’s ultimately going to discard them. PDFs simply don’t include the elements that tend to make web pages slow.

rmschne · June 21, 2022, 6:15pm

PDFs in my world tend to be big (in MB) and the whole file has to be downloaded before it becomes readable. A server sending HTML files to the client, regardless of the server’s computational load, seems to send smaller files, and I suspect browsers can display partially downloaded pages to make it seem faster for the reader. Just my view.

kewms · June 21, 2022, 6:23pm

Right, but here we’re talking about a situation where the user is trying to manage several hundred thousand bookmarks. That’s a lot of visits to various servers.

I’ve made good use of DT’s Split function to make large PDFs more tractable.

cubicray · June 21, 2022, 6:41pm

I’m not following. What do you mean by lots of visits to various servers? I use list view in DT and I’m not opening any webpages inside DT. I use Safari to open the bookmarks.

rcwright · June 24, 2022, 4:21am

Something else to consider - no matter what Mac you get, you’ll have two computers after your purchase. If you network them, you should be able to divide your database across the two.

BLUEFROG · June 24, 2022, 6:00am

If you network them, you should be able to divide your database across the two.

Can you clarify what you’re suggesting here?

rcwright · June 24, 2022, 6:29am

Above, rfog: “I have about 500 GB mostly of PDF across 18 databases (and billions of words), all synchronized between 3 Macs, 2 iPad and one iPhone.” Clearly, all these are somehow networked. It seemed to me that his existing database may be distributed over the two machines. It might be possible to run searches, say, over the two machines and then merge the results. Back in my working days, I used to set up 3 or 4 asynchronous queries and process the results after all were finished, sometimes setting up more asynchronous sessions down the production line.
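
The "several asynchronous queries, then process the results once all have finished" pattern can be sketched roughly as below. This is a generic Python illustration, not a DEVONthink feature: `search_machine`, `search_all`, the machine names, and the in-memory file lists are all hypothetical stand-ins, and a short sleep takes the place of a real network call.

```python
import asyncio

# Hypothetical stand-in data: one small "index" of document names per machine.
FAKE_INDEXES = {
    "mac-a": ["invoice 2021.pdf", "notes on M1 memory.rtf"],
    "mac-b": ["M1 benchmark.pdf", "recipe.txt"],
}


async def search_machine(machine: str, term: str) -> list[tuple[str, str]]:
    """Pretend to query one machine; the sleep simulates network latency."""
    await asyncio.sleep(0.01)
    return [
        (machine, name)
        for name in FAKE_INDEXES[machine]
        if term.lower() in name.lower()
    ]


async def search_all(term: str) -> list[tuple[str, str]]:
    """Start one query per machine, wait for all of them, then merge the hits."""
    per_machine = await asyncio.gather(
        *(search_machine(machine, term) for machine in FAKE_INDEXES)
    )
    merged = [hit for hits in per_machine for hit in hits]
    # Sort by document name so the merged list does not depend on machine order.
    return sorted(merged, key=lambda hit: hit[1])


if __name__ == "__main__":
    for machine, name in asyncio.run(search_all("m1")):
        print(f"{machine}: {name}")
```

Something would still have to answer the queries on each machine, which is the role a server plays in a real setup.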

chrillek · June 24, 2022, 7:05am

Without a server running on at least one of them? Hardly.

rfog · June 24, 2022, 12:19pm

I have everything on the same Mac. In fact, I have the same on 3 Macs and 3 iOS devices (on-demand only), synchronized through a WebDAV server on my NAS.

Each Mac has its own complete local set of DT databases on its 1 TB disk. The difference is search speed.

If you have two Macs without enough disk space, you can put databases on an external disk (I had that on my old MacBook Pro 2012, with the external disk glued with Velcro to the lid and attached with a normal short USB cable), or if you are patient enough and have a very good connection, on a NAS.

Or put half of your databases in one Mac and the other half in the other, but they will be completely independent.

Ryan_N · June 27, 2022, 5:56pm

The timing of this question is great, as I too am overdue for a hardware upgrade and am running practically the same hardware as @cubicray (in my case, a late 2013 MBP w/ a 4-core i7 (2.66 GHz) and 16GB RAM).

My goal is to merge two DT databases into one. These databases were split only because of hardware limitations, so I usually only run one at a time–which is not ideal at all because I constantly forget to search both of them.

Database-1 is:

162-million words; 1.7 million unique
3,748 groups
4,643 PDFs

Database-2 is:

89-million words; 881,000 unique
all items are old books and journals OCRed through DT (so, ABBYY API)

Glancing on the Apple Store website, I see the 16" MBP with the new M1 Max processor starts out w/ 32 GB RAM as the baseline. This can be doubled for $400 USD, and $400 bucks seems reasonable in comparison to the overarching cost of a new machine.

I like saving a buck though, so my $400 question is: is this much RAM truly worth it? Or is this extreme overkill given the crazy-impressive benchmark metrics on these new unified M1 chips?

My DT databases are comprised of only textual documents, but some are quite large files in word-count and file size (i.e. old books and journals that I re-OCRed using DT/ABBYY). I’ve been known to try crazy things before though, like combining an entire multivolume set of books or journal issues into one ginormous PDF (sometimes RTF) for convenience in using DT’s amazing NEAR operator across the entire publication, without having to jump to a new document. …but when I try putting this grand idea into practice on this old dog of a machine, I realize instead I must go reach for a fire extinguisher (beachballing, fan on high, etc.).

So what do you think, fellas? Is 64 GB RAM completely insane?

Thanks in advance!

BLUEFROG · June 27, 2022, 7:50pm

So what do you think, fellas? Is 64 GB RAM completely insane?

Is a 12 cylinder engine in a supercar “insane”? I think not.
If you can comfortably afford it - the car or the RAM - I’d go for it.

Blanc · June 27, 2022, 7:58pm

Just as an aside, you may want to be aware that the entry level M2 MBP appears to have a much slower SSD than its M1 counterpart (source: The Verge). This appears not to apply to the larger SSDs.

Ryan_N · June 28, 2022, 5:11pm

Touché!

…but, I mean, if you said doubling RAM beyond 32GB is effectively pointless in DT-land due to little or no improvement in speed, I would believe you.

I remember you saying one time that 200-million words is sort of a high limit for a DT database (or perhaps any database on a local hard drive).

This does raise the question: is the limitation hardware, or something else?

As these machines keep getting faster, and immense amounts of RAM more affordable, I can’t help but wonder if I could combine two huge databases into one without beach-balling?
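
For a rough sense of scale, the word counts quoted earlier for the two databases can simply be added up. This is a back-of-the-envelope sketch: the ~200-million-word figure is only the rule of thumb as recalled above, not a confirmed hard limit, and the constant names are made up for illustration. Unique-word counts are left out because the two databases probably share vocabulary, so those cannot just be summed.

```python
# Total word counts as quoted in the thread.
DB1_WORDS = 162_000_000
DB2_WORDS = 89_000_000

# Assumption: the "high limit" as recalled in the post, treated as a guideline.
RECALLED_GUIDELINE = 200_000_000

combined = DB1_WORDS + DB2_WORDS
excess = combined - RECALLED_GUIDELINE

print(f"Combined database: {combined:,} words")
if excess > 0:
    print(f"That is {excess:,} words above the recalled guideline.")
else:
    print("That is within the recalled guideline.")
```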