There are some local RAG possibilities, but (1) they will not quickly ingest a massive archive the way people dream they will, and (2) you’re never going to compete with commercial (read: online) LLMs in terms of the size and speed of the models. You have a horsier Mac than most, and it’s still not going to run an 80-billion-parameter model speedily.
Then look at Anthropic’s models - either using their own web front-end or with a wrapper such as Typing Mind which can add RAG capability to it.
Anthropic has made a really strong effort to assure customer privacy. You can even use it for legal or medical records. Hard to imagine why that level of privacy is not sufficient.
There is just no comparison between local and cloud models. It’s like an Apple II or TRS-80 vs. modern macOS.
Because sometimes you just don’t want your data in anyone else’s hands. Makes perfect sense to me.
But as they say, “Better the devil you know than the devil you don’t.” So full containment can sometimes keep needs from being met. It depends on the actual need, and I would caution anyone to consider what they’re giving and to whom, e.g., to Google for NotebookLM. No thank you much.
Anthropic certainly started out strong on privacy, etc., and appears to be staying the course, but business pressures, money, competition, and the like can all change direction. And remember, you don’t jam the rudder hard to port or starboard. You turn it gently, almost imperceptibly, and if no one is paying attention, you’re headed in a completely different direction.
PS: I am not a data paranoiac. I just believe people’s data is theirs to hold and manage, and they should be circumspect in what they do with it. Yes, you often have to trust someone outside yourself, but try to be sure of whom you’re trusting.
Trying to cram two replies into one:
@rkaplan Anthropic is the company I use when I want an interactive session. Their privacy is better than OpenAI’s; their security is better than DeepSeek’s. The other detail I didn’t mention is cost. I have an Anthropic API key, and I just assume that if I do RAG with an online system my costs will go up. I didn’t know TypingMind supports RAG; I will give it another try. (Mostly I use BoltAI.)
@BLUEFROG Local models with RAG are useful to a point. This screenshot shows me using a local model to make queries in my Obsidian Vault. With a subset of a DT vault I think the same would be possible.
Herein lies the rub: subset is a very undefined term and one I think will catch out many people. One is a subset of 100,000. So is 99,999. With the costs and performance involved, I’d wager an affordable subset will usually be far, far smaller than many people expect, especially those with databases full of legal, medical, or scientific PDFs.
I would be very happy with a method to feed in a maximum of 30-50 academic papers and create a RAG index for that set. Since I work on a set of problems for a while, paying the short-term CPU/RAM cost for ease of use later is a good idea.
FWIW, if you tell me this is a $$ add-on for DT3 or 4, I will (within reason) say take my $$.
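For the curious, here is roughly what such a small, fixed-paper-set index involves, as a minimal sketch outside DEVONthink. Everything in it is an assumption on my part: the mxbai-embed-large model name, the chunk size, the file layout, and the use of Ollama’s local embeddings endpoint; it is not how DT3/DT4 would implement it.

```python
# Minimal sketch: build a small local RAG index over a folder of PDFs.
# Assumes a local Ollama server with an embedding model already pulled, e.g.:
#   ollama pull mxbai-embed-large
# Requires: pip install pypdf requests numpy
import json
from pathlib import Path

import numpy as np
import requests
from pypdf import PdfReader

OLLAMA_EMBED = "http://localhost:11434/api/embeddings"  # default Ollama endpoint
EMBED_MODEL = "mxbai-embed-large"                        # any local embedding model
CHUNK_CHARS = 1500                                       # rough chunk size; tune to taste

def embed(text: str) -> list[float]:
    """Fetch one embedding vector from the local Ollama server."""
    r = requests.post(OLLAMA_EMBED, json={"model": EMBED_MODEL, "prompt": text})
    r.raise_for_status()
    return r.json()["embedding"]

def chunk_pdf(path: Path) -> list[str]:
    """Extract the PDF's text and split it into fixed-size character chunks."""
    text = "\n".join((page.extract_text() or "") for page in PdfReader(path).pages)
    return [text[i:i + CHUNK_CHARS] for i in range(0, len(text), CHUNK_CHARS)]

# Index every PDF in a folder; 30-50 papers is well within what this handles.
index = []
for pdf in sorted(Path("papers").glob("*.pdf")):
    for chunk in chunk_pdf(pdf):
        index.append({"source": pdf.name, "chunk": chunk, "vector": embed(chunk)})

# Persist the index so the embedding cost is paid once per paper set.
np.save("vectors.npy", np.array([e["vector"] for e in index]))
Path("chunks.json").write_text(json.dumps(
    [{"source": e["source"], "chunk": e["chunk"]} for e in index]))
```

At 30-50 papers the whole index fits comfortably in memory, so a flat file plus brute-force similarity search is enough; a real vector database only starts to matter at a much larger scale.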
I am still in the process of experimenting and learning; however, that seems doable with Ollama and AnythingLLM as a “frontend” (thanks to @Archimedes for the pointer). AnythingLLM lets you embed entire DT groups / tags by dragging and dropping their files into the LLM (DeepSeek in my case).
At least the “knowledge” of the uploaded / embedded documents now seems accessible in the LLM. The only issue is that, due to the vectorization / RAG processing of my documents, the LLM has no way of “understanding” what’s new, so it cannot address specific questions about my uploaded documents. They “just” increase its “general knowledge”.
One step after the other…
What are you trying to protect, and from whom?
I work with legal and medical documents; I believe that Anthropic AI, when accessed via API, is safe for that purpose.
Others appear to have use cases where this level of privacy is not sufficient. I do not know what those situations are.
Well, there are certainly cases where people just don’t trust a company to have the protections they say they do, either because they think the company is lying or they think its security model is flawed.
But there are also people who for whatever reason want to protect their data from government access. If the entity seeking your data has the ability to get a court order for it, that will trump whatever the company’s privacy policy says.
I highly recommend Msty for setting up a local RAG. No affiliation.
Limitations of local models still apply, but the way it works is that it identifies relevant chunks of text from the indexed documents through a quick semantic search, then feeds only that limited set of chunks to the local LLM.
With 14B models like Phi-4, a response takes no more than 20-30 seconds on a mid-range MacBook and is generally of good quality.
Local embedding models like mxbai are available via Msty and perform comparably to cloud embedding models.
Reasoning models are not necessary for RAG and arguably not the best choice either for this use case.
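For anyone curious what that retrieve-then-generate step looks like under the hood, here is a hedged sketch. It assumes index files like the vectors.npy / chunks.json produced by the earlier sketch, plus a local Ollama server with phi4 pulled; Msty’s actual internals will differ.

```python
# Minimal sketch of the retrieve-then-generate step: embed the question,
# keep only the top-k most similar chunks, and hand just those to the local
# model. Assumes the vectors.npy / chunks.json files from the indexing sketch
# and a local Ollama server ("ollama pull phi4", "ollama pull mxbai-embed-large").
import json
from pathlib import Path

import numpy as np
import requests

OLLAMA = "http://localhost:11434"
EMBED_MODEL = "mxbai-embed-large"
CHAT_MODEL = "phi4"   # any local ~14B model; this particular tag is an assumption
TOP_K = 5             # how many chunks the model actually sees

vectors = np.load("vectors.npy")
chunks = json.loads(Path("chunks.json").read_text())

def embed(text: str) -> np.ndarray:
    """Embed the question with the same local model used for the index."""
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": EMBED_MODEL, "prompt": text})
    r.raise_for_status()
    return np.array(r.json()["embedding"])

question = "Which evaluation metrics do these papers use?"
q = embed(question)

# Cosine similarity against every stored chunk; keep the best TOP_K.
sims = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
best = np.argsort(sims)[::-1][:TOP_K]

# Label each chunk with its source file so the answer can cite specific papers.
context = "\n\n".join(f"[{chunks[i]['source']}]\n{chunks[i]['chunk']}" for i in best)
prompt = ("Answer using only the excerpts below and name the source files.\n\n"
          f"{context}\n\nQuestion: {question}")

r = requests.post(f"{OLLAMA}/api/generate",
                  json={"model": CHAT_MODEL, "prompt": prompt, "stream": False})
print(r.json()["response"])
```

Labeling each chunk with its source file, as in the prompt above, is also one simple way to let the model answer questions about specific uploaded documents rather than treating them as undifferentiated “general knowledge.”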
So breaking the law is OK if they decide it’s in their interest?
How about “opt-in”, not “opt-out”. It’s like saying we’ll break into your property and steal your stuff unless you tell us you don’t want us to break in.
No question there is an outstanding issue of copyright law regarding training of AI models - just like the issue of whether Google exceeds copyright “fair use” in its spidering has never been resolved.
That said - there is a major difference between litigation over fair use or derivative work definitions for public copyrighted material vs. training on private documents uploaded only to a user’s account. Anthropic is very clear on the opt-in/opt-out terms of the latter.
Do you think it’s something very different from what happens in human brains on the neuron level? “Supernatural pure thoughts” come first from “unknown somewhere”? ))
I’m not a neuroscientist, but my impression is that those who are believe that comparing machine learning to “what humans do” betrays a very shallow understanding of human cognition.
(For a more rigorous discussion on this point, see van Rooij et al., “Reclaiming AI as a Theoretical Tool for Cognitive Science,” Computational Brain & Behavior (2024) 7:616–636.)
Among other things, mechanistic models of the “brain as computer” are simply wrong. Human memory simply does not behave in the same way that computer memory does. Neither does human information retrieval.
(And that’s without even considering that humans use less energy than an incandescent bulb to do tasks for which data centers need dedicated power plants.)
Simply breathtaking, isn’t it?
That’s true - but it only takes a brief perusal of any social media site to realize most people prefer an echo chamber that confirms preconceived beliefs rather than actual use of human brain power. AI does great for that “use case.”
I’m also no neuroscientist. But I’ve seen the argument that a key factor in human cognition is embodiment, and I find it quite persuasive. Mind–body dualism is prevalent in Western thought, but it’s just a mental construct and does not correspond to reality. Learning and reasoning are processes that depend on interaction with and feedback from our environment. We develop an internal model of the world that gets constantly updated through external feedback. This seems impossible without embodiment.
PS: the DOI was incomplete, this is the full link: https://doi.org/10.1007/s42113-024-00217-5
Sorry about that. Fixed.
And embodiment is another example of how the mechanistic model breaks down. The eyes are not “optical sensors” that feed “data” to the “CPU.” Visual processing takes place in a variety of ways depending on exactly what is being processed: “you’re about to be hit, duck!” is different from facial recognition is different from reading.
We’re only beginning to understand how that works for vision, and vision is probably better understood than other senses.
I firmly believe that the biggest threat AI poses is that lazy humans will assign it tasks that it’s not actually capable of handling.