Sharing Local LLM Experiment: LlamaBarn vs Ollama using glm-4.7-flash-q4 om M2 Max 32 GB Studio

axelbuehler · February 22, 2026, 12:44pm

Sharing Local LLM Experiment: LlamaBarn vs Ollama using glm-4.7-flash-q4 om M2 Max 32 GB Studio

take away

- becoming usable (translate / summarize 10 page pdf locally within 1 minute and meaningful output)

- LlamaBarn seems slightly faster

- outputs have consistent message, but still differ wildly in details between chats with same set-up, same prompt (somewhat expected, but …)

- output between LlamaBarn and Ollama has a slightly different feel, Ollama more structured in recommendation (unexpected)

Just read about LlamaBarn from developers of llama.cpp that enables running llms locally.

Llama.cpp is also under the hood of Ollama and wanted to benchmark them.

Tested both with local glm-4.7-flash-q4 model.

Set-up

- Ollama with native local integration in DEVONthink - 16k context window, all assistants allowed

- LlamaBarn with generic OpenAI compatible integration, adjusting local IP addess - 32k context window, all assistants allowed

(http://localhost:2276/v1/chat/completions)

Prompt:

pls read this text in German and give me a short summary in English and the bottom line conclusion

source:

- pdf of Druckversion - Gas Town: Mad-Max-inspiriertes Framework für Coding-Agenten | heise online

- 10 pages

Results

- timewise comparable, Ollama seems to be a bit slower (not set up very scientifically) < 1 minute

- LlamaBarn used up more memory (probably due to larger context), but kept working; Ollama stayed green in respect to memory

Testing three times with LlamaBarn and once with Ollama I got three different results (as to be expected) - but the feel of the Ollama response is different.

LlamaBarn I (16k context)

12:30:53 (axelbuehler)

pls read this text in German and give me a short summary in English and the bottom line conclusion

12:31:47 (assistant)

Summary

Gas Town is a framework developed by Steve Yegge that orchestrates multiple coding agents simultaneously, inspired by the post-apocalyptic Mad Max movie series. The system uses a themed architecture with distinct roles:

Mayor (Bürgermeister): Global coordinator and main interface to humans
Deacon (Wächter): System watchdog that monitors and restarts stuck processes
Witness (Aufseher): Monitors individual agents within each Rig
Polecat (Arbeiter-Agent): Short-lived worker agents that execute tasks in Git worktrees
Refinery (Raffinerie): Merge queue manager that resolves conflicts and ensures code quality
Crew: Long-lived, named agents for persistent collaboration

The framework addresses common challenges with multiple agents: merge conflicts, duplicate work, and technical debt accumulation. It implements concepts like non-deterministic idempotency, Beads for task management, and Convoy for bundling tasks. The system requires significant resources, including Claude Max subscriptions ($100-200/month) and familiarity with tmux.

Bottom Line

Gas Town is an advanced orchestration framework for experienced developers who already understand the limitations of individual coding agents. It’s not suitable for beginners or those on a budget due to its steep learning curve and high operational costs. While currently limited to Claude Code and Codex CLI, it represents an ambitious approach to managing the complexity of multi-agent systems through structured chaos rather than perfect individual agents.

Source: Ingo Eichhorst, “Gas Town: Mad-Max-inspiriertes Framework für Coding-Agenten”, Heise Medien, 21.02.2026, Druckversion - Gas Town: Mad-Max-inspiriertes Framework für Coding-Agenten | heise online

LlamaBarn II (32k context)

12:48:14 (axelbuehler)

pls read this text in German and give me a short summary in English and the bottom line conclusion

12:49:26 (assistant)

Summary

Gas Town is a framework developed by Steve Yegge that orchestrates multiple AI coding agents simultaneously, inspired by the post-apocalyptic Mad Max universe. The system uses a hierarchical architecture with distinct roles:

Mayor (Bürgermeister): Main coordinator and human interface
Deacon (Wächter): System watchdog and cleanup agent
Witness (Aufseher): Monitors individual agents within each project
Polecats (Arbeiter-Agenten): Worker agents that execute tasks in Git worktrees
Refinery (Raffinerie): Manages merge conflicts and code quality
Crew: Named, persistent agents for collaboration
Rig: Project repository management
Convoy: Bundled task groups
Beads: Git-based atomic work units

The framework addresses challenges like context window limits, merge conflicts, duplicate work, and technical debt accumulation. It requires significant resources (Claude Max subscriptions of $100-200/month), Go 1.23+, Git 2.25+, tmux, and is currently limited to Claude Code and Codex CLI. The system is still in alpha and has a steep learning curve, targeting experienced developers who already work with coding agents.

Bottom Line

Gas Town is an ambitious orchestration framework for managing multiple AI coding agents, designed to transform chaotic parallel agent work into controlled, structured development workflows. However, it’s not suitable for beginners-it requires substantial financial investment, technical expertise, and patience to navigate its complex architecture. The framework shows promise for experienced developers seeking to scale AI-assisted coding but is best approached as an advanced tool for those already comfortable with agent-based workflows.

Source: Ingo Eichhorst, “Gas Town: Mad-Max-inspiriertes Framework für Coding-Agenten”, Heise Medien, 21.02.2026

LlamaBarn II (32k context) - exactly 1 min with cold start, looked like about 10 sec to load model into memory

13:32:53 (axelbuehler)

pls read this text in German and give me a short summary in English and the bottom line conclusion

13:33:55 (assistant)

Summary

Gas Town is a framework developed by Steve Yegge that orchestrates multiple coding agents simultaneously using a Mad Max-inspired architecture. The system manages the chaos that arises when multiple AI agents work on the same tasks by implementing concepts like:

Control Plane: Mayor (coordinator), Deacon (system watchdog), Witness (monitors agents)
Data Plane: Polecats (worker agents), Refinery (merge conflict manager), Crew (persistent agents)
Key mechanisms: Git worktrees, mailboxes, handoffs, Beads (task management), and Sweeps (garbage collection for technical debt)

The framework requires significant resources including Claude Max subscriptions (100-200 USD/month) and is currently in alpha stage. It supports various workflows and is designed for experienced developers who already understand the limitations of individual AI agents.

Bottom Line Conclusion

Gas Town is not suitable for beginners-it’s a complex, resource-intensive orchestration framework for experienced developers who want to coordinate multiple AI coding agents. While it can manage the chaos of parallel agent work, it requires substantial financial investment and technical expertise. The framework is currently limited to working with Claude Code and Codex CLI, and its effectiveness depends on the continuous improvement of underlying AI models.

Source: Ingo Eichhorst, “Gas Town: Mad-Max-inspiriertes Framework für Coding-Agenten”, Heise Medien, 21.02.2026.

Ollama:

12:51:27 (axelbuehler)

pls read this text in German and give me a short summary in English and the bottom line conclusion

12:53:01 (assistant)

Short Summary

Gas Town is a Mad Max-inspired orchestration framework for coordinating multiple AI coding agents simultaneously. Developed by Steve Yegge and released in January 2026, it addresses the chaos that emerges when multiple coding agents work in parallel on complex tasks. The framework organizes agents into roles (Mayor as coordinator, Deacon as system watchdog, Witness as supervisor, Polecats as workers, Refinery as merge queue manager) inspired by Mad Max: Fury Road. Key concepts include:

Beads - Git-based atomic work units for task management
Convoy - Bundled task groups for worker agents
Sweeps - Systematic correction waves to address technical debt
Cattle vs Pets approach - Agent instances are disposable and replaceable

The system uses Control Plane (Mayor, Deacon) and Data Plane (Polecats, Refinery, Witness) architecture similar to Kubernetes.

Bottom Line Conclusion

Gas Town is an advanced orchestration framework for experienced developers who want to coordinate multiple AI coding agents, but it comes with significant caveats:

Target audience: Only for developers already familiar with multi-agent workflows who are willing to invest heavily ($100-200/month in Claude subscriptions, tmux mastery, and steep learning curve)
Current status: Still in Alpha phase with practical issues like agents frequently getting stuck
Value proposition: Provides structure to manage chaos (merge conflicts, duplicate work, technical debt) through systematic approaches like Sweeps and Refinery
Not for beginners: Requires extensive knowledge of coding agents and system administration

Gas Town represents an ambitious attempt to bring Kubernetes-like orchestration concepts to AI coding agent workflows, offering potential for high-throughput development but demanding significant resources and expertise.