Fabro turns agent orchestration into a version-controlled graph with per-node model routing

The Rundown No. 115 · Audio Edition · 7 min

Oday

An open source tool just split the two decisions every agent stack welds together: what a step does, and which model pays for it.

Shannon

It's Monday, June 15, 2026. Here's the rundown.

Shannon

We've got Fabro and per-node routing up top, a wire run through dev tools and models, quick hits, and the call.

Oday

Fabro shipped as an open source orchestrator for AI coding agents. It's Rust, MIT-licensed, sitting at eight hundred thirty-four stars and ninety-six forks as of its June third commit.

Shannon

And the tagline is "dark software factory," which buries the part that matters. Ignore the branding.

Oday

You define your process as a workflow graph, agents execute the nodes, and you step in only at the gates you marked. The pipelines are written in Graphviz DOT.

Shannon

That's the clever move. Branching, loops, parallel fan-out, human approval gates. They're all just graph structure, not custom code.

Oday

Then there's a separate stylesheet, CSS-like, that decides which model and provider runs each node. The example routes default nodes to Haiku and anything tagged coding to Sonnet.

Shannon

That's the whole game right there. Most stacks hardcode the model name into the agent loop, so what a step does and which model runs it become the same line of code. Fabro pulls them apart.

Oday

Cheap model for triage, expensive model for the hard node, fallback chain when a provider rate-limits you.

Shannon

And that's the difference between a demo and unit economics. A workflow that runs Haiku on ninety percent of nodes and Sonnet on the ten percent that actually reason survives contact with traffic. A workflow that calls Sonnet everywhere just burns money quietly.

Oday

Execution is isolated, too. Each agent runs in its own Daytona cloud VM, with snapshot setup, network controls, automatic cleanup, and every stage commits both code and run metadata to Git branches.

Shannon

That last bit is underrated. You get an audit trail the agent can't quietly overwrite, and a rollback point per node instead of per run.

Oday

So if you're building tonight, what do you copy?

Shannon

Not the repo. The routing layer. Pull model choice out of your agent code into config keyed on node type, then add a fallback order per key. Pair it with per-node commits and you get cost attribution for free.

Shannon

You can see exactly which graph node burned the tokens. Right now most teams can't answer that question at all.

Oday

And the signal you're reading from this?

Shannon

The agent loop is becoming a commodity. The orchestration around it is where the differentiation goes, and it's showing up in small open source tools first because the big frameworks still treat the model as one global app setting.

Oday

Puppeteer quietly became an action layer for AI agents. The headline still says JavaScript API for Chrome and Firefox, but the load-bearing change is chrome-devtools-mcp, a Puppeteer-based MCP server, plus support for the experimental WebMCP API.

Shannon

So a scrape-and-test library becomes a standard tool surface for agents. Latest is twenty-five point one, roughly nine thousand eight hundred projects already depend on it. If you're building a browser agent, you no longer write the tool wrapper yourself.

Oday

Next, PlanetScale makes a sharp claim: the only scalable delete in Postgres is DROP TABLE.

Shannon

And they're right. Large row-by-row deletes never scale. Bloat, vacuum pressure, index churn. Design the tables so the data you'll purge lives in a partition you can drop whole, and read the post before you write a cron job deleting millions of rows nightly.

Oday

Two more quickly. Kage drives headless Chrome to capture a fully rendered page, then strips every script tag and handler so the saved copy runs no code and makes no network calls.

Shannon

An inert offline mirror, with JS-assembled content captured instead of an empty shell. And then zeroserve claims three-x throughput and seventy percent lower latency with Caddy compatibility.

Oday

That one's a single author's benchmark.

Shannon

So treat the numbers as a starting point and run your own. The actual draw is the drop-in path for teams already on Caddy syntax, not the figures.

Oday

A GitHub issue alleges Nex-N2, promoted as Rio de Janeiro's locally built model, is a merge of an existing open model rather than original training. A related claim says "Rio three-five" beats Qwen three-seven on benchmarks.

Shannon

Benchmark wins mean nothing when the provenance is in question. If you're evaluating any sovereign or municipal model, check the weights and the lineage before you trust the leaderboard.

Oday

And DuckDuckGo's Gabriel Weinberg argues AI adoption is fragmented, not universal. People consume AI the way they eat meat: some embrace it, some limit it, some avoid it.

Shannon

It's an analogy essay, no AI-specific numbers, so read it as a frame, not a finding. The useful part: stop assuming your whole market wants AI features on by default.

Oday

Paul Graham published "How to Earn a Billion Dollars." He reduces the outcome to two variables, growth rate and market duration, with the engine being to build something people love enough to refer others. Cloudflare's Matthew Prince said publicly it matched his own path.

Shannon

One caution. The viral ninety-three percent monthly growth and five hundred twenty-six million figures come from secondary analysis, not Graham's page. Don't quote them as his.

Oday

Quick break — two from the desk.

Shannon

One we know well: vote dot direct. If you're on an HOA or a board, it runs your elections digitally: secure, verifiable, no paper, no clipboard in the lobby. Point your council to vote dot direct.

Oday

And if this is your ten minutes of AI for the day, get the written edition too. The full wire, free, every morning: leave your email at nextbig dot dev.

Oday

Jane Street details where formal methods fit in real production engineering, including an OxCaml prover.

Shannon

Yserver is a modern X11 server written from scratch in Rust.

Oday

A 2014 talk arguing JavaScript becomes a universal compilation target resurfaced on Hacker News with a hundred fifty-eight points.

Shannon

Kobo rejects valid ePub files, and the post traces the blame to Adobe's tooling.

Oday

And Swiss voters rejected a proposal to cap the national population at ten million.

Oday

Our call: within ninety days, at least one of LangGraph, CrewAI, or LlamaIndex ships declarative per-step model routing with automatic provider fallback as a first-class config primitive, not a hand-rolled callback.

Shannon

We're wrong if, by September fifteenth, none of the three documents stable support for assigning a different model per node with built-in fallback ordering. That's when it settles.

The Big Story

Fabro turns agent orchestration into a version-controlled graph with per-node model routing

Fabro shipped as an open source orchestrator for AI coding agents, and the tagline of "dark software factory" buries the part that matters. You define your process as a workflow graph, agents execute the nodes, and you step in only at the gates you marked. It is Rust, MIT-licensed, and sits at 834 stars and 96 forks as of its June 3 commit. The container serves both the API and an embedded web UI on a single port, default 32276, and persists state to /storage, with a Render blueprint that provisions a 1 GB disk out of the box.

The mechanism is the interesting move. Pipelines are written in Graphviz DOT, so branching, loops, parallel fan-out, and human approval gates are all just graph structure. A separate CSS-like stylesheet decides which model and provider runs each node, with automatic fallback chains when one fails. The published example routes default nodes to claude-haiku-4-5 and any node tagged .coding to claude-sonnet-4-5. That single design choice splits two decisions most agent stacks weld together: what a step does, and which model pays for it. Cheap model for triage, expensive model for the hard node, fallback when a provider rate-limits you.
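The split between what a node does and which model runs it can be sketched in a few lines of Python. This is not Fabro's stylesheet syntax, which is not reproduced here. The `ROUTES` table, the `resolve_chain` and `call_with_fallback` helpers, the `RateLimited` exception, and the fallback ordering are all hypothetical. Only the two model names come from the published example.

```python
from typing import Callable

# Hypothetical routing table: node class -> ordered model chain (preferred first).
# Only the two model names come from the published example; the fallback
# ordering is an assumption made for illustration.
ROUTES = {
    "default": ["claude-haiku-4-5"],
    "coding": ["claude-sonnet-4-5", "claude-haiku-4-5"],
}


class RateLimited(Exception):
    """Stand-in for whatever rate-limit error a provider client raises."""


def resolve_chain(node_classes: list) -> list:
    """Return the model chain for the first class that has a route, else the default."""
    for cls in node_classes:
        if cls in ROUTES:
            return ROUTES[cls]
    return ROUTES["default"]


def call_with_fallback(node_classes: list, prompt: str,
                       call_model: Callable[[str, str], str]) -> tuple:
    """Try each model in the chain in order and return (model_used, output)."""
    last_error = None
    for model in resolve_chain(node_classes):
        try:
            return model, call_model(model, prompt)
        except RateLimited as err:
            last_error = err  # move on to the next model in the chain
    raise RuntimeError("every model in the chain was rate-limited") from last_error
```

The agent code only ever names a node class. Swapping a model or reordering a fallback chain is then a one-line config change rather than an edit at every call site.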

Execution is isolated. Each agent runs in its own Daytona cloud VM with snapshot setup, network controls, and automatic cleanup, and every stage commits both code changes and run metadata to Git branches. That gives you an audit trail an agent cannot quietly overwrite, and a rollback point per node rather than per run.

If you are building agent workflows today, the thing to copy is the routing layer, not the repo. Pull the model choice out of your agent code and into config keyed on node type, then add a fallback order per key. A workflow that runs Haiku on 90% of its nodes and Sonnet on the 10% that do real reasoning is the difference between a demo and unit economics that survive contact with traffic. Pair it with per-node commits and you get cost attribution for free: you can see which graph node burned the tokens.
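The arithmetic behind the 90/10 split, and the per-node attribution it enables, looks roughly like this. The prices are placeholders, not real list prices. The model labels, the record shape, and the function names are hypothetical. The sketch also assumes tokens split between models in the same proportion as nodes, which real workloads may not.

```python
from collections import defaultdict

# Placeholder prices in dollars per million tokens. These are NOT real
# list prices; substitute your provider's current rates.
PRICE_PER_MTOK = {"cheap-model": 1.0, "expensive-model": 5.0}


def run_cost(model: str, tokens: int) -> float:
    """Dollar cost of one node execution."""
    return PRICE_PER_MTOK[model] * tokens / 1_000_000


def attribute_costs(runs: list) -> dict:
    """Sum cost per graph node from (node_id, model, tokens) records."""
    totals = defaultdict(float)
    for node_id, model, tokens in runs:
        totals[node_id] += run_cost(model, tokens)
    return dict(totals)


def blended_price(expensive_share: float) -> float:
    """Average price per million tokens when a share of tokens goes to the expensive model."""
    cheap = PRICE_PER_MTOK["cheap-model"]
    expensive = PRICE_PER_MTOK["expensive-model"]
    return (1 - expensive_share) * cheap + expensive_share * expensive
```

With these placeholder prices, a 10% expensive share averages about $1.40 per million tokens, against $5.00 when every node calls the expensive model. Feeding `attribute_costs` the records logged at each node commit shows which node accounts for the spend.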

The signal for the next six months is that the agent loop is becoming a commodity and the orchestration around it is where the differentiation goes. Routing, fallback, isolation, and provenance are infrastructure concerns, and they are showing up first in small open source tools because the big frameworks still treat the model as a global app setting. Whoever ships declarative per-step routing as a first-class primitive captures the teams who are tired of hardcoding model names in a hundred call sites.

@github Read source 270 engagement

Developer Tools

Puppeteer quietly became an action layer for AI agents

The npm one-liner still says "JavaScript API for Chrome and Firefox," but the load-bearing change is chrome-devtools-mcp, a Puppeteer-based MCP server, plus support for the experimental WebMCP API. That exposes browser control to LLM agents as a standard tool surface, turning a scrape-and-test library into an agent action layer. Latest version is 25.1.0, it drives Chrome or Firefox over DevTools Protocol or WebDriver BiDi, and 9,838 projects already depend on it. If you are building a browser agent, you no longer need to write the tool wrapper yourself.

@github Read source 295 engagement

Kage strips JavaScript from a fully rendered page to make an inert offline mirror

Kage drives real headless Chrome to capture a page the way a human sees it, then removes every script tag, on* handler, and javascript: URL so the saved copy runs no code and makes no network calls. JS-assembled content is captured fully instead of as an empty shell. Output is a browsable folder, a ZIM archive, or, with --format binary, a single self-contained executable that serves the site with nothing installed on the recipient's machine. The single-binary mode is one option, not the point.
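The stripping step can be illustrated with Python's standard-library HTML parser. This is a sketch of the general technique, not Kage's implementation. The `InertHTML` class and `make_inert` function are hypothetical names. It covers only script removal, so it does nothing about the images, stylesheets, or fonts that a fully offline copy also has to handle, and it drops HTML comments.

```python
from html import escape
from html.parser import HTMLParser


class InertHTML(HTMLParser):
    """Re-emit HTML without <script> elements, on* attributes, or javascript: URLs."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self._in_script = False

    def _emit_tag(self, tag, attrs, closer):
        kept = []
        for name, value in attrs:
            if name.startswith("on"):
                continue  # inline event handler such as onclick (deliberately blunt)
            if value is not None and value.strip().lower().startswith("javascript:"):
                continue  # javascript: URL in href, src, and similar attributes
            kept.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join([tag, *kept]) + closer)

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        else:
            self._emit_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        if tag != "script":
            self._emit_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._in_script:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self._in_script:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self._in_script:
            self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def make_inert(html: str) -> str:
    """Return the markup with executable content removed."""
    parser = InertHTML()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

The input would be the serialized DOM after the headless browser has finished rendering, which is why JS-assembled content survives even though the scripts that built it do not.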

@newsycombinator Read source 437 engagement

PlanetScale: the only scalable delete in Postgres is DROP TABLE

PlanetScale argues that large row-by-row deletes in Postgres never scale because of bloat, vacuum pressure, and index churn, and that partition-and-drop is the only pattern that stays cheap as data grows. Worth reading before you write a cron job that DELETEs millions of rows nightly. The fix is to design tables so the data you will purge lives in a partition you can drop whole.
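A minimal sketch of partition-and-drop, written as Python helpers that build the Postgres statements. It assumes a parent table already declared with PARTITION BY RANGE on a date or timestamp column. The table name `events`, the monthly granularity, and the helper names are hypothetical. Identifiers are interpolated directly, so pass only trusted names.

```python
from datetime import date


def month_bounds(year: int, month: int) -> tuple:
    """First day of the month and first day of the following month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def create_partition_sql(parent: str, year: int, month: int) -> str:
    """DDL for one monthly range partition of a range-partitioned parent table."""
    start, end = month_bounds(year, month)
    return (
        f"CREATE TABLE {parent}_{year}_{month:02d} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


def drop_partition_sql(parent: str, year: int, month: int) -> str:
    """Purge a whole month by dropping its partition instead of deleting rows."""
    return f"DROP TABLE {parent}_{year}_{month:02d};"
```

The nightly job then creates the next partition ahead of time and drops the oldest one, so a purge is a metadata operation rather than millions of row deletions.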

@newsycombinator Read source 126 engagement

Hetty is a single-binary, open source Burp Suite Pro alternative

Hetty is a Go HTTP toolkit for security research: a MITM proxy with logging and search, an HTTP client to replay requests, interception, and scope management, all behind a web admin UI. It ships as a Docker image on port 8080. Nothing is newly announced here; it is a long-running project resurfacing. But if you are paying for Burp Pro seats, it is worth a look.

@github Read source 1,745 engagement

zeroserve claims 3x throughput and 70% lower latency with Caddy compatibility

A write-up on adding Caddy config compatibility to zeroserve reports 3x throughput and 70% lower latency against the comparison baseline. The numbers come from a single author's benchmark, so treat them as a starting point and run your own. The real appeal is a drop-in path for teams already on Caddy syntax.

@newsycombinator Read source 128 engagement

AI & Models

Rio de Janeiro's "homegrown" city LLM looks like a merge of an existing model

A GitHub issue alleges that Nex-N2, promoted as Rio de Janeiro's locally built model, is a merge of an existing open model rather than original training, even as a related claim says "Rio3.5" beats Qwen3.7 on benchmarks. The lesson for builders evaluating any sovereign or municipal model: check the weights and the lineage before you trust the leaderboard. Benchmark wins mean little when provenance is in question.

@newsycombinator Read source 230 engagement

Weinberg: AI adoption is fragmented, not universal

DuckDuckGo's Gabriel Weinberg argues people consume AI the way they eat meat, some embracing it, some limiting it, some avoiding it, mapping the health, cost, environment, and ethics reasons for cutting meat onto AI. It is an analogy essay, not a study, and it cites no AI-specific adoption numbers, so read it as a frame rather than a finding. The useful takeaway: stop assuming your whole market wants AI features on by default.

@newsycombinator Read source 764 engagement

Startups & Capital

Paul Graham publishes "How to Earn a Billion Dollars"

Graham's new essay reduces the billion-dollar outcome to two variables, growth rate and market duration, with the engine being to build something people love enough to refer others. Cloudflare CEO Matthew Prince publicly said the guidance matched his own path. The viral 93% monthly growth and $526M figures come from secondary analysis, not Graham's page, so do not quote them as his.

@newsycombinator Read source 1,677 engagement

Quick Hits

Jane Street details where formal methods fit in real production engineering, including an OxCaml prover

@newsycombinator

Yserver is a modern X11 server written from scratch in Rust

@newsycombinator

A 2014 talk argues JavaScript becomes a universal compilation target, resurfacing on HN with 158 points

@newsycombinator

Kobo rejects valid ePub files and the post traces the blame to Adobe's tooling

@newsycombinator

Trace ships offline Mac meeting transcripts you can flag mid-call

@newsycombinator

Swiss voters reject a proposal to cap the national population at ten million

@newsycombinator

A blog post traces Lisp's direct influence on Ruby's design

@newsycombinator

The Takeaway

Pull model choice out of your agent code this week. Fabro shows the win from routing each workflow node to its own model with a fallback chain, and Puppeteer's MCP server shows tools standardizing into a shared action surface. If you are building agents, move model assignment into config keyed by step and add per-node cost logging before your token bill stops mapping to anything you can explain.

The Call C-20260615

Within 90 days, at least one of LangGraph, CrewAI, or LlamaIndex ships declarative per-step model routing with automatic provider fallback as a first-class config primitive, not a hand-rolled callback.

The case

Fabro already does this with a CSS-like stylesheet and fallback chains, and Puppeteer's pivot to an MCP action layer shows the value moving from the agent loop to the orchestration around it. Consensus still treats the model as a global app setting, which is why this is showing up first in small open source tools rather than the big frameworks.

What proves us wrong

By September 15, 2026, none of LangGraph, CrewAI, or LlamaIndex documents stable support for assigning a different model per node or step with built-in fallback ordering.

Settles by September 15, 2026

The Tape T-20260615

◆ Watch BABA Alibaba Group medium conviction

Rio's "homegrown" sovereign model that supposedly beat Qwen3.7 is a 0.6/0.4 merge of an open Qwen3.5-397B base, which proves Qwen is the substrate the open-weight world forks, not a product Alibaba gets paid for. Download-share leadership is a vanity metric until it converts to cloud revenue, where the only number that compounds is Cloud Intelligence at 34% YoY.

Stories [13] and [17] both trace Rio 3.5 to a Qwen3.5-397B merge, and one community read notes Alibaba stopped updating that open 397B base while third parties like Nex-AGI now ship the updates. The crowd reads 50%-plus open-weight download share as a moat; the arithmetic says forkers capture the credit and self-hosters pay Alibaba nothing. This materially updates the 6/13 and 6/14 BABA watch: the open thread guessed Alibaba was closing, today there is a live proof point that the open base is being harvested by others, so the open-source-must-win framing points at the wrong revenue line.

Wrong if Alibaba's Q1 FY2027 print around August 2026 breaks out Qwen-attributable cloud or API revenue as a material, accelerating line, showing open-weight leadership converting directly to monetization rather than leaking to forkers. Settles August 2026

◆ Watch Private Anthropic low conviction

Per-node routing to the cheapest capable model is becoming an open-source default the same week open-weight forks of a near-frontier base ship for free, and both compress the price premium that underwrites Anthropic's roughly $47B run rate and $965B IPO mark. The premium node is shrinking to the hard slice of work while the rest reroutes to cheap or open tiers.

Fabro [10] splits what a step does from which model pays for it, sending default nodes to claude-haiku-4-5 and only .coding nodes to claude-sonnet-4-5, with automatic fallback when a provider rate-limits. The Rio/Qwen episode [13] shows the cheap tier is now self-hostable open weights anyone can run. Consensus prices these labs on token volume; the second-order variable is token ASP, and routing plus open forks both push it down ahead of an October listing window.

Wrong if Anthropic's IPO prospectus or pricing within the October 2026 window holds the $965B mark or higher with API gross margin and average revenue per token flat to up, showing no pricing compression. Settles October 2026

Desk signals from the day's verified wire — falsifiable, dated, settled in public. Analysis, not individualized investment advice.
