I revisited my auto-renaming workflow this morning, mainly to evaluate different models and see if I could find a better one.
On my machine (MacBook Air M2, 16 GB) the gemma3n:e4b model gave the best results of all the models I tested against a handful of documents, which included:
gpt-oss:latest
glm-4.7-flash:latest
deepseek-r1:1.5b
mistral:latest
gemma3:latest
llama3.1:8b
lfm2.5-thinking:latest
(Sorry for the latest tags; I already deleted the models and didn’t record which specific variants those were.)
The smaller models (≤ 4B) tend to return filenames that don’t quite fit the request (YYYY-MM-DD Company Title), while the larger ones take a really long time (> 30 s) and bring everything else on the machine to a halt, so I can’t do any other work in the background.
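As a rough illustration of what “fits the request” means here, a suggested filename can be checked against the YYYY-MM-DD Company Title pattern with a simple regex. This is just a sketch for judging model output by eye; the regex and sample names are my own, not part of the Smart Rule:

```python
import re

# Rough check that a suggested filename matches "YYYY-MM-DD Company Title".
# Illustrative only: a plausible date, then at least two more words.
PATTERN = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]) \S+ .+$")

def fits_request(name: str) -> bool:
    """Return True if the model's suggestion follows the requested format."""
    return PATTERN.fullmatch(name.strip()) is not None

print(fits_request("2024-03-15 ACME Invoice"))   # True
print(fits_request("invoice_final_v2.pdf"))      # False
```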
I tried gemma3n:e4b for the first time today, and it turned out to be a great balance between speed and accuracy. Working with it in the chat window also gave fairly reasonable results.
The last flourish I added to my workflow was for the Smart Rule to play a sound when the execution completes, which allows me to switch my attention away until I hear the sound.
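In case it helps anyone replicating the sound trick: macOS ships its alert sounds under /System/Library/Sounds, so a script action only needs a single afplay call. A small sketch, assuming you trigger it from a Smart Rule script action (the helper name is my own; a plain shell one-liner would do just as well):

```python
import subprocess

# Standard macOS alert sound; any .aiff under /System/Library/Sounds works.
SOUND = "/System/Library/Sounds/Sosumi.aiff"

def completion_sound_cmd(sound: str = SOUND) -> list[str]:
    """Build the afplay invocation for the script action (afplay ships with macOS)."""
    return ["afplay", sound]

# On macOS, uncomment to actually play the sound when the rule finishes:
# subprocess.run(completion_sound_cmd(), check=True)
```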
Indeed, there is no “one size fits all” and finding the model you prefer takes time and testing.
I use this method very often, especially as I work on multiple devices. I can hear, e.g., a Sosumi and know a particular rule just finished.
Thank you for sharing this information. Indeed, gemma3n:e4b offers a better balance between speed and quality on my MacBook Pro M1 with 64 GB RAM than the models I tested previously, such as Mistral-Small:3.2 or qwen3:8b. I also install larger models via native mlx-lm (e.g. qwen3-32b-mlx-6bit), which makes them run faster than local Ollama, but unfortunately they are not available in DEVONthink.
As far as I know, MLX support in Ollama (announced in 2025) is still experimental and hasn’t made it into a stable release yet. LM Studio is built on Electron, which consumes extra memory, so I prefer a native installation using mlx-lm.
Once you install mlx-lm and download a specific model, you can run it in chat mode right from the terminal, e.g.:

mlx_lm.chat --model mlx-community/c4ai-command-r-08-2024-4bit

Alternatively, you can use a Python script to load the model and execute specific tasks.
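For instance, a minimal Python sketch along those lines, using mlx-lm’s load and generate functions with the same model as above. The prompt wording and the excerpt limit are my own assumptions, not a tested setup:

```python
# Sketch: a one-off renaming task via mlx-lm's Python API.
# Requires `pip install mlx-lm` and Apple silicon, so the import is
# kept inside main() and the prompt helper stays usable without it.

def build_rename_prompt(doc_text: str) -> str:
    """Assemble the renaming instruction plus a document excerpt."""
    return (
        "Suggest a filename in the form 'YYYY-MM-DD Company Title' "
        "for the following document. Reply with the filename only.\n\n"
        + doc_text[:2000]  # keep the prompt short for small models
    )

def main() -> None:
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/c4ai-command-r-08-2024-4bit")
    prompt = build_rename_prompt("Invoice from ACME Corp, dated 2024-03-15 ...")
    print(generate(model, tokenizer, prompt=prompt, max_tokens=50))

if __name__ == "__main__":
    main()
```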