Builder's Briefing — April 28, 2026
Good morning and welcome to Builder's Briefing for April 28th, 2026. I'm Alex, here with Sam, and we have a packed show today. The Microsoft-OpenAI exclusive partnership is officially over, Copilot is moving to usage-based billing, and there's a silent Postgres crisis you need to know about.
Yeah, it's one of those days where you read the headlines and think — okay, the landscape is actually shifting under our feet. Let's get into it.
So the big story: Microsoft and OpenAI have ended their exclusive partnership and their revenue-sharing arrangement. This is huge. OpenAI is no longer locked to Azure for distribution, and Microsoft no longer has to share revenue on AI compute. This is arguably the biggest structural shift in the AI platform landscape since GPT-4 launched.
Right, and what's wild is how immediately actionable this is. If you've been building on Azure-only OpenAI endpoints, you should be abstracting your inference layer like, today. Because OpenAI models are going to start showing up on GCP, AWS, independent inference providers — the whole deal.
Exactly. And the pricing angle is fascinating. Microsoft just lost its moat as the sole enterprise channel for OpenAI's best models. So Azure AI pricing has to get more aggressive now, which honestly is great for everyone.
And if you're a startup that's been going back and forth between OpenAI-via-Azure and the direct OpenAI API — the direct path just became way more strategically sound. OpenAI now has every incentive to make their own platform stickier, since they can pursue enterprise deals on their own terms.
The takeaway here is that model access is getting commoditized at the distribution layer. The winners are going to be builders who stay model-agnostic and treat inference as a swappable commodity. And that segues perfectly into some of the AI tooling that's trending today.
Oh, you're talking about Shimmy. I love this one.
Yes! Shimmy is a single-binary Rust inference server — no Python runtime needed — that runs GGUF and SafeTensors models with hot-swap and auto-discovery. And crucially, it speaks OpenAI's API dialect. So in a world where we're all trying to be model-agnostic, this is a drop-in replacement for local inference.
That's interesting because if you pair that with the Microsoft-OpenAI decoupling story, you see the whole picture. You can point your app at OpenAI's API today, swap to a local model via Shimmy tomorrow, and your application code doesn't change. That's the dream.
Another one I want to flag — Chrome is shipping a built-in Prompt API that lets web apps call on-device models directly. Client-side summarization, classification, form assistance, all without any API costs or round-trips to your server.
So between Shimmy for local server inference and Chrome's Prompt API for client-side, the trend is unmistakable — AI inference is decentralizing away from the big cloud endpoints. Builders who get ahead of this are going to have a real cost advantage.
Also worth a quick mention — Dirac, an open-source terminal coding agent, just topped TerminalBench using Gemini 3 Flash Preview. Gemini's flash-tier models are showing up as genuinely viable for agentic workloads at a fraction of the cost of frontier models.
That's a big deal for anyone budgeting agentic workflows. You don't always need the biggest model if the agent harness is good enough.
Alright, dev tools. GitHub Copilot is moving to usage-based billing, ditching the flat rate. If you're on a team plan, audit your actual usage now. Heavy users were getting a bargain, light users were subsidizing them.
This also tells you something about where GitHub thinks things are going. They expect completion volume to spike as agents generate more and more code, and they want to capture that upside. Flat rate doesn't work when your power users are running agents that generate ten X the completions.
Smart read. And speaking of constraining agents — there's a neat little open-source tool called EvanFlow that wraps Claude Code in a test-driven development loop. You write your tests first, and the agent iterates until they pass.
That's exactly the right pattern. TDD as a guardrail for AI code generation. It's a small project, but honestly, if you're using Claude Code for anything beyond trivial tasks, constraining it with a TDD harness should be standard practice.
And here's a signal I want to highlight: two separate multi-agent coding orchestration SDKs — agtx and GasCity — both trending on the same day. The market is clearly screaming for better ways to manage agent-to-agent handoffs in code generation pipelines.
When you see two tools in the same niche trending simultaneously, that's not a coincidence. That's a market signal.
Okay, infrastructure — and this one is urgent. pgbackrest, which is the most popular PostgreSQL backup tool, is no longer being maintained.
Ohhh, this is the kind of thing that causes incidents six months from now when nobody's thinking about it. If your production Postgres depends on pgbackrest — and a lot of shops do — you need to start evaluating alternatives immediately. Forks, Barman, managed backup solutions, whatever. But don't just sit on this.
Silent infrastructure risk is the most dangerous kind. Also, fun story — the Dutch Central Bank is ditching AWS for Schwarz Group's European cloud. That's the Lidl parent company.
Wait, Lidl as in the grocery store? They have a cloud now?
They absolutely do, and it's being chosen by a major central bank for data sovereignty reasons. If you're building B2B SaaS for European customers, having a non-US-hyperscaler deployment option is becoming a competitive advantage, not a nice-to-have.
The European sovereign cloud market is very real. I think a lot of US-based builders underestimate how much momentum this has.
Quick security note — there was a massive breach at Mercor, an AI contractor platform. Four terabytes of voice samples stolen from forty thousand contractors. If you're collecting AI training data through contractor platforms, your training pipeline is an attack surface. Treat it with the same security posture as production user data.
Four terabytes of voice data. That's not a small leak, that's a catastrophic one. And voice data is biometric — you can't just rotate it like a password. This should be a wake-up call for anyone in the training data pipeline.
Couple of quick hits before we wrap. Someone bought Friendster for thirty thousand dollars — there's a great post-mortem on acquiring dead social networks, link in the briefing. The FDA approved the first gene therapy for genetic hearing loss. And a runner named Sawe ran a sub-two-hour marathon in a competitive race — first time in history.
Okay, the Friendster thing I need to read. And the sub-two-hour marathon — that's a genuinely historic human achievement. Love seeing that alongside all the AI news.
So pulling it all together — today's signal is clear. The AI inference layer is decoupling from platform lock-in. Microsoft-OpenAI exclusivity ending, Shimmy for local inference, Chrome shipping on-device AI, multi-agent orchestration tools popping up everywhere. If you're building AI-powered products, invest in abstraction layers that let you swap models and providers without touching application code.
And check your pgbackrest dependency. Seriously. Today, not next quarter.
That's the briefing for April 28th. All the links and details are in the show notes. We'll be back tomorrow — until then, keep building.
See you tomorrow, folks.
Microsoft and OpenAI End Exclusivity — What Changes for Your AI Stack
Microsoft and OpenAI have officially ended their exclusive partnership and revenue-sharing arrangement. This is the biggest structural shift in the AI platform landscape since GPT-4 launched. OpenAI is no longer locked to Azure distribution, and Microsoft is no longer obligated to share revenue on AI compute. For builders, this means OpenAI models will likely show up on other clouds faster — think GCP, AWS, and independent inference providers getting first-class access to future OpenAI models without the Azure middleman. If you've been architecting around Azure-only OpenAI endpoints, start abstracting your inference layer now.
The immediate practical impact: expect pricing competition. Microsoft loses its moat as the sole enterprise channel for OpenAI's best models, which means Azure AI pricing will need to get more aggressive. Meanwhile, OpenAI gets to pursue direct enterprise deals and alternative cloud partnerships. If you're a startup choosing between OpenAI-via-Azure and direct OpenAI API access, the direct path just became more strategically sound — OpenAI has every incentive to make their own platform stickier.
What this signals for the next 6 months: model access is becoming commoditized at the distribution layer. The winners will be builders who stay model-agnostic and treat inference as a swappable commodity. Tools like Shimmy (covered below) that provide OpenAI-compatible APIs over local models suddenly look even more prescient. The era of one-cloud-one-model lock-in is ending.
Shimmy: Python-Free Rust Inference Server with OpenAI-Compatible API
A single-binary Rust inference server that runs GGUF and SafeTensors models with hot-swap and auto-discovery — no Python runtime needed. If you're deploying local models in production and tired of managing Python dependencies, this is a drop-in replacement that speaks OpenAI's API dialect. Perfect timing given the push toward model-agnostic architectures.
Dirac: OSS Terminal Agent Tops TerminalBench on Gemini-3-Flash-Preview
An open-source coding agent that just hit top scores on TerminalBench using Gemini 3 Flash Preview. Worth watching if you're evaluating which model-agent combos actually perform — Gemini's flash-tier models are showing up as viable for agentic workloads at much lower cost than frontier models.
Chrome's Prompt API Brings On-Device AI to the Browser
Google is shipping a built-in Prompt API in Chrome that lets web apps call on-device models directly. If you're building web-based AI features, this could eliminate the round-trip to your inference server for simpler tasks — think client-side summarization, classification, or form assistance without any API costs.
TurboQuant: Interactive First-Principles Guide to Model Quantization
A thorough interactive walkthrough of the TurboQuant quantization approach. If you're serving models locally via tools like Shimmy, understanding quantization tradeoffs directly impacts your latency-quality curve. Bookmark this as a reference.
AI Should Elevate Thinking, Not Replace It
A widely-shared essay arguing that the most productive AI workflows augment human reasoning rather than bypass it. Resonating with 414 HN points — the builder community is clearly settling on 'AI as copilot' as the default product design philosophy. Worth reading if you're deciding where to put AI boundaries in your product.
GitHub Copilot Moves to Usage-Based Billing
Copilot is ditching flat-rate pricing for pay-per-use. If you're on a team plan, audit your actual usage now — heavy users save money on flat rate, light users were subsidizing them. This also signals GitHub expects completion volume to spike as agents generate more code, and they want to capture that upside.
EvanFlow: TDD-Driven Feedback Loop for Claude Code
An open-source tool that wraps Claude Code in a test-driven development loop — write tests first, let the agent iterate until they pass. If you're using Claude Code for anything beyond trivial tasks, constraining it with a TDD harness is the move. Small project but the pattern is exactly right.
agtx: A Blackboard Architecture for Coding Agent Orchestration
An SDK that gives coding agents a shared 'blackboard' to coordinate from idea to merge hands-free. If you're stitching together multiple agents in your dev workflow, this orchestration pattern is worth evaluating over ad-hoc chaining.
GasCity: Orchestration SDK for Multi-Agent Coding Workflows
Another entry in the multi-agent coding orchestration space. Two tools in this category trending the same day tells you the market is screaming for better ways to manage agent-to-agent handoffs in code generation pipelines.
Self-Updating Screenshots for Documentation
A clever technique for keeping documentation screenshots current automatically. If you maintain docs for a product that ships frequently, this eliminates a real maintenance burden. Small idea, big time-saver.
Using Box to Save Memory in Rust
Practical guide to heap allocation patterns in Rust for reducing memory footprint. Relevant if you're building Rust-based infrastructure (like the inference servers and music players also trending today).
pgbackrest Is No Longer Being Maintained
The most popular PostgreSQL backup tool has gone unmaintained. If your production Postgres relies on pgbackrest (and many do), start evaluating alternatives immediately — pgBackRest forks, Barman, or managed backup solutions. This is the kind of silent infrastructure risk that causes outages six months from now when a compatibility issue hits.
Dutch Central Bank Ditches AWS for Lidl's European Cloud
A major European institution choosing Schwarz Group (Lidl's parent) cloud over AWS for sovereignty reasons. The European sovereign cloud market is real and growing — if you're building B2B SaaS for European customers, having a non-US-hyperscaler deployment option is becoming a competitive advantage, not a nice-to-have.
Networking Changes Coming in macOS 27
Apple is making significant networking changes in the next macOS. If you're building developer tools or VPN/proxy software for Mac, read the details now — breaking changes in network extensions and socket behavior could affect your users on day one.
4TB of Voice Samples Stolen from 40K AI Contractors at Mercor
Massive data breach targeting AI training data — 4TB of voice samples from 40,000 contractors. If you're collecting training data through contractor platforms, this is a wake-up call: your training pipeline is an attack surface. Treat training data with the same security posture as production user data.
The Woes of Sanitizing SVGs
A deep dive into why SVG sanitization is surprisingly hard and riddled with bypass vectors. If your app accepts user-uploaded SVGs (common in design tools, CMS platforms), you probably have XSS vectors you haven't considered.
Fast16: Precision Software Sabotage Predating Stuxnet by 5 Years
SentinelOne reveals a ShadowBrokers reference pointing to software sabotage operations from 2005. Fascinating historical security research — and a reminder that supply chain attacks aren't new, they're just newly visible.
SuperSplat: Open-Source 3D Gaussian Splat Editor
PlayCanvas shipped a browser-based editor for 3D Gaussian Splatting. If you're working on 3D content pipelines, NeRF-to-splat workflows, or spatial computing, this gives you a free editing tool that previously required custom scripts or expensive software.
Postiz: Open-Source Agentic Social Media Scheduling
An OSS social media scheduling tool with AI agent capabilities for content generation and posting. If you're building marketing automation or need a self-hosted alternative to Buffer/Hootsuite with AI features, this is ready to deploy.
DSPi: Full Audio DSP Firmware for Raspberry Pi Pico
A complete audio DSP stack on a $4 microcontroller. If you're prototyping audio hardware products or effects pedals, this gets you from idea to working audio pipeline in an afternoon.
Kopuz: Rust Music Player Built with Dioxus
A local-files-and-Jellyfin music player written in Rust using the Dioxus framework. More signal that Dioxus is becoming the go-to Rust UI framework for desktop apps — worth evaluating if you're considering Rust for cross-platform desktop.
Today's signal is clear: the AI inference layer is decoupling from platform lock-in. Microsoft-OpenAI exclusivity ending, Shimmy offering single-binary local inference, Chrome shipping on-device AI APIs, and two separate multi-agent orchestration SDKs trending — all point the same direction. If you're building AI-powered products, invest in abstraction layers that let you swap models and inference providers without touching application code. And if you're running Postgres in production, check your pgbackrest dependency today before it becomes tomorrow's incident.