I have explored various open source LLMs recently.
I have been stunned by the capability of Qwen3 235B in particular - in many cases its summaries of complex documents come close to Claude 4.5's. There are exceptions, but as a broad generalization I would say Qwen3 235B can do 95% of what I do with Claude 4.5 for about 1% of the cost. That is obviously a stunning comparison.
I wondered whether I was missing something - but indeed there are articles suggesting this may be the first open source LLM to compete with commercial models.
I was concerned about privacy initially - but there is a version of Qwen3 235B which is included in OpenRouter's zero data retention group. Moreover, I suspect it may be possible to run this on a very capable Mac, such as the newest Mac Studio anticipated for release this year - running it locally with Ollama would eliminate any privacy concerns.
I will continue to compare the two - but the cost vs feature comparison is quite notable here.
Qwen3 235B under Ollama will run on an M2 or better Mac with 128 GB or more of unified memory. You need to run the 4-bit quantized version. That achieves about 30 tokens per second, compared with 80+ tokens per second via the cloud.
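For anyone curious what the local workflow looks like in practice: once Ollama is serving the model, background jobs can hit its local REST API (port 11434 by default) without any cloud round trip. This is a minimal sketch; the model tag below is an assumption - confirm the exact tag of the 4-bit quantized build with `ollama list` after pulling it.

```python
import json
import urllib.request

# Ollama serves a local HTTP API on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/chat"

# The exact tag of the 4-bit quantized build is an assumption --
# check `ollama list` for what you actually pulled.
MODEL = "qwen3:235b"

def build_request(document_text: str) -> dict:
    """Build the JSON payload for a background summarization job."""
    return {
        "model": MODEL,
        "stream": False,  # fine for a batch report; no need for live tokens
        "messages": [
            {"role": "system", "content": "Summarize documents concisely."},
            {"role": "user", "content": document_text},
        ],
    }

def summarize(document_text: str) -> str:
    """POST the request to the local Ollama server and return the reply."""
    payload = json.dumps(build_request(document_text)).encode()
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

At ~30 tokens/second this is slow for interactive use, but a script like this left running overnight handles static report generation with no data ever leaving the machine.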
Agreed, that's not a typical Mac. And even then it's a bit slow for live chat work. But for running reports in the background it's fine.
My thought wasn't that most readers here would set it up on an Apple Silicon Mac with that much memory. Rather, you can run it using OpenRouter on any computer that runs DT4; and if you set your OpenRouter profile to use only Zero Data Retention providers, that is a good step toward privacy.
The price on OpenRouter is trivial - literally about 1% of the cost of Claude 4.5. If you are willing to forgo the zero data retention option, there is even a free version of Qwen3 235B on OpenRouter - but I wouldn't do that unless your use case has no privacy concerns at all.
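For reference, the OpenRouter route is an OpenAI-compatible chat completions endpoint, and OpenRouter's provider routing lets a request decline providers that collect prompt data. A minimal sketch follows; the model slug is an assumption (check OpenRouter's model list for the current Qwen3 235B slug), and the account-level Zero Data Retention setting mentioned above is separate, configured in the OpenRouter dashboard.

```python
import json
import os
import urllib.request

# OpenRouter exposes an OpenAI-compatible chat completions endpoint.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build a payload with a provider-routing preference that declines
    providers which collect prompt data. The model slug is an assumption."""
    return {
        "model": "qwen/qwen3-235b-a22b",
        "messages": [{"role": "user", "content": prompt}],
        # Route only to providers that do not collect prompt data.
        # (Account-level Zero Data Retention is a separate dashboard setting.)
        "provider": {"data_collection": "deny"},
    }

def ask(prompt: str) -> str:
    """Send the request to OpenRouter and return the model's reply."""
    payload = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, the same payload shape works from any OpenAI-style client by pointing it at the OpenRouter base URL.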
The fact that it can run locally on an Apple Silicon Mac may be of interest to those with specific privacy concerns, such as those handling medical records subject to HIPAA. Anthropic will sign a HIPAA BAA, but it takes a lot of persuading for them to do so for anything other than an Enterprise account. An LLM that can run locally is even better than a cloud LLM with a BAA from a HIPAA perspective.
"Will run" is generous. Also, 4-bit quantization is another big tradeoff. Necessary for local LLMs? Yes, but less than ideal for rigorous work (if someone has any).
For running a report in the background? It's quite viable as a solution. It doesn't matter if it takes a few hours when I am creating a static report and do not have to intervene for it to work.
From a privacy perspective a local LLM beats even a HIPAA compliant cloud app.
But that aside - for routine chat work in DT4, Qwen is basically as good as Claude for many purposes at a dramatically lower cost.
I'm likewise finding the Qwen family to be promising. I don't know about 95% as good, but the results are ones I would have been very happy to pay more for a year ago. The pricing pressure is good for consumers. I'd like to see that pressure push Anthropic and OpenAI to bundle API usage into their higher subscription tiers.