WEBVTT
NOTE The Rundown — nextbig.dev daily audio edition, 2026-05-25

1
00:00:00.000 --> 00:00:08.490
<v Alex>Good morning! Welcome to Builder's Briefing for May twenty-fifth, twenty twenty-six. I'm Alex, joined as always by Sam, and today we've got a packed show — Anthropic open-sourcing their Cowork plugin architecture, a really important paper on how AI agents silently break your code over time, and a bunch of new dev tools worth knowing about.

2
00:00:08.490 --> 00:00:12.971
<v Sam>Yeah, there's a clear theme today — the AI toolchain is splintering into all these specialized composable pieces, and honestly it's exciting if you're a builder. Let's get into it.

3
00:00:12.971 --> 00:00:21.983
<v Alex>So the big story — Anthropic just dropped an open-source repo called knowledge-work-plugins. It's the plugin architecture for Claude Cowork, their collaborative AI workspace. This isn't a new model, it's an infrastructure play. They're giving developers an extensibility surface to build custom plugins that wire Claude into your team's specific knowledge stack.

4
00:00:21.983 --> 00:00:29.178
<v Sam>Right, and what's wild is this feels like Anthropic's answer to the ChatGPT plugin store, except they're going straight for enterprise knowledge work instead of consumer novelty. Think Confluence connectors, internal wiki search, CRM summarizers — the boring but incredibly valuable stuff.

5
00:00:29.178 --> 00:00:34.979
<v Alex>Exactly. And the signal here is clear — Anthropic is betting the moat isn't just model quality, it's workflow integration. If you're building AI-powered productivity tools or internal copilots, you want to study this plugin spec now.

6
00:00:34.979 --> 00:00:41.154
<v Sam>Early movers are going to have a real distribution advantage when Cowork scales across enterprise customers. If you've been building internal tools on top of Claude, this is the moment to clone that repo and start prototyping. Link in the briefing.

7
00:00:41.154 --> 00:00:47.876
<v Alex>Alright, moving to AI and models — two stories I want to hit. First, DeepSeek shipped Reasonix, a coding agent that leans hard on caching to keep inference costs down. If you're running agentic coding loops and just burning through tokens, this one's worth benchmarking.

8
00:00:47.876 --> 00:00:53.726
<v Sam>That's interesting because the cost problem with agentic coding is real. You run these multi-step loops and the token bill just explodes. A cache-first approach for repetitive codegen tasks could cut your costs significantly.

9
00:00:53.726 --> 00:01:00.896
<v Alex>The other must-read is a new arXiv paper on something they're calling constraint decay. Researchers found that LLM agents progressively violate constraints during multi-step backend code generation — they start strong, then drift. The longer the task, the more they break things silently.

10
00:01:00.896 --> 00:01:07.270
<v Sam>Oh, this one hit home for me. If you're shipping agent-generated backend code to production — and a lot of teams are now — this is a concrete argument for adding constraint-checking middleware after every generation step. Don't just trust the final output.
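
NOTE
Editor's sketch, not from the paper or the hosts: one way "constraint checking after every generation step" could look in practice.
All names here (run_with_validation, add_handler, drifting_refactor, the auth_required check) are hypothetical, and the toy steps only stand in for real agent calls.
```python
from typing import Callable, Iterable
# Hypothetical names throughout; a constraint is any predicate over the code so far.
Constraint = Callable[[str], bool]
def run_with_validation(
    steps: Iterable[Callable[[str], str]],
    constraints: dict[str, Constraint],
    code: str = "",
) -> str:
    """Apply each generation step, then re-check every constraint before continuing."""
    for index, step in enumerate(steps):
        code = step(code)
        failed = [name for name, check in constraints.items() if not check(code)]
        if failed:
            # Stop at the first step that breaks an invariant instead of trusting the final output.
            raise ValueError(f"step {index} violated: {', '.join(failed)}")
    return code
# Toy stand-ins for agent steps: the second one drifts and drops the auth check.
def add_handler(code: str) -> str:
    return code + "def handler(req):\n    require_auth(req)\n"
def drifting_refactor(code: str) -> str:
    return code.replace("    require_auth(req)\n", "")
constraints = {"auth_required": lambda code: "require_auth(" in code}
```
Running the two toy steps in order raises at step 1, which is the point: the violation surfaces where it happens, not at review time.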

11
00:01:07.270 --> 00:01:15.237
<v Alex>And one more quick one: Epoch AI's data shows memory now accounts for nearly two-thirds of AI chip component costs. Not compute, memory. So if you're making infrastructure decisions, memory-saving techniques like quantization and KV-cache optimization give you way more bang for your buck than chasing raw FLOPS.

12
00:01:15.237 --> 00:01:17.851
<v Sam>That's a real mindset shift. Everyone's been obsessed with compute, but the bottleneck has quietly moved.

13
00:01:17.851 --> 00:01:25.768
<v Alex>Okay, dev tools: there are a couple of gems here. Earendil Pi is an open-source full-stack AI agent toolkit that bundles a CLI, web UI, Slack bot, and vLLM pod management into one package. If you're currently stitching together three or four different tools for your AI dev workflow, this might consolidate all of that.

14
00:01:25.768 --> 00:01:30.921
<v Sam>The vLLM pod management piece is what caught my eye. If you're self-hosting inference, managing those pods is a pain, and having that baked into the same toolkit as your coding agent CLI is genuinely useful.

15
00:01:30.921 --> 00:01:36.672
<v Alex>Also worth a look — CCX, a unified API proxy that lets you route requests across Claude, Codex, and Gemini through a single layer. Super handy for multi-model fallback chains or A/B testing providers without touching your app code.

16
00:01:36.672 --> 00:01:41.079
<v Sam>See, this is exactly the composability theme we've been talking about. You architect for plug-and-play model switching, and tools like CCX make that practical instead of theoretical.
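
NOTE
Editor's sketch of the multi-model fallback idea mentioned above. This is not CCX's API, which the episode does not describe.
The provider callables and the names complete_with_fallback, flaky, and stable are hypothetical placeholders for real SDK or proxy calls.
```python
from typing import Callable, Sequence
# Hypothetical sketch: a provider is any callable that maps a prompt to a reply.
Provider = Callable[[str], str]
def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real chain would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
# Toy providers: the first simulates an outage, the second answers.
def flaky(prompt: str) -> str:
    raise TimeoutError("simulated outage")
def stable(prompt: str) -> str:
    return f"echo: {prompt}"
```
Because the app only sees complete_with_fallback, swapping or reordering providers is a change to one list rather than to application code.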

17
00:01:41.079 --> 00:01:47.527
<v Alex>Quick infrastructure note — AMD Xilinx is removing Linux support from Vivado's free tier. If you're prototyping on FPGAs with a Linux workflow, you'll need to pay up or find alternatives. Classic pattern: free tiers get squeezed once the platform has lock-in.

18
00:01:47.527 --> 00:01:55.817
<v Sam>Ugh, yeah. And there's also a really candid retrospective floating around — someone who spent four years at AWS writing about internal culture and open-source dynamics. If you're evaluating AWS partnerships or building on their services, the internal incentive structures they describe are worth understanding for your platform risk.

19
00:01:55.817 --> 00:02:00.672
<v Alex>On the security front — quick but important. An internal Microsoft account is being exploited to send spam at scale, and it's bypassing typical email filters because the sender domain is trusted.

20
00:02:00.672 --> 00:02:05.154
<v Sam>So if you're relying on sender reputation for your email security, this is your reminder to go check your inbound filtering rules. Even Microsoft's own domains can get compromised.

21
00:02:05.154 --> 00:02:10.581
<v Alex>A few quick hits to close out the news. Green card seekers now must leave the U.S. to apply — that story hit almost eight hundred points on Hacker News and has major implications for tech hiring and workforce planning.

22
00:02:10.581 --> 00:02:15.834
<v Sam>Yeah, that one's going to ripple through every company with H-1B employees. Also loved the story about someone spending fifty hours drawing a single line graph — obsessive data visualization craft at its finest.

23
00:02:15.834 --> 00:02:21.884
<v Alex>And Microsoft open-sourced the earliest known DOS source code — pre-one-point-oh. Fun for retrocomputing nerds, practically irrelevant for shipping products, but a nice reminder that even proprietary giants eventually open-source their legacy.

24
00:02:21.884 --> 00:02:28.581
<v Alex>So the big takeaway today — the AI toolchain is fragmenting into specialized, composable pieces. Anthropic's plugins, Earendil's toolkit, CCX's proxy, DeepSeek's cache-first agent. Stop picking one monolithic provider and start architecting for plug-and-play switching.

25
00:02:28.581 --> 00:02:34.954
<v Sam>And if you're shipping agent-generated code — please read that constraint decay paper. The models will not maintain invariants on their own over multi-step tasks. Add automated constraint validation after every generation step. That's not optional anymore.

26
00:02:34.954 --> 00:02:39.859
<v Alex>That's your Builder's Briefing for May twenty-fifth. All the links are in the show notes. This week's going to be interesting as the ecosystem around these composable AI tools starts to take shape.

27
00:02:39.859 --> 00:02:42.000
<v Sam>Build modular, validate aggressively, and we'll catch you next time. Have a great one!
