Blog

Insights and updates from the Supermodel team: building graph-based world models for coding agents, benchmarking MCPs, building useful primitives for code factories, and more.

March 30, 2026

What Dead Code Taught Us About Building Tools for AI Agents

We benchmarked AI agents on dead code detection across 60+ runs and 14 real-world repositories using Claude Opus 4.6. Graph-enhanced agents achieved 94.1% F1 with 100% precision. The result: 156x cheaper, 11x faster, and 2x better performance than the baseline agent alone. Here's what we learned about context engineering, honest benchmarking, and why code graphs are the missing primitive for software factories.

Jonathan Popham

Why Your Weekend Code Graph Project is Bullshit

You discover tree-sitter, parse a codebase into a graph, render it with a force-directed layout, post a screenshot. It gets hundreds of likes. Then you try it on a real codebase and everything falls apart.

Lance Robertson

February 25, 2026

How We Split a Monolith Into a Control Plane and Data Plane (and Got 10x Scale)

How we redesigned a synchronous monolith into an async control plane and data plane in one calendar week, achieving 10x scale with zero new infrastructure.

Grey Newell

February 13, 2026

Everyone Is Benchmarking MCP Servers Wrong

Existing MCP benchmarks rank models, not servers. Here's how to A/B test whether your MCP server actually improves agent performance.

Grey Newell

February 6, 2026

Why We Built mcpbr

MCP developers ship tools without evidence they work. We built mcpbr to find out. Results from a 500-task controlled SWE-bench experiment.

Grey Newell
