WEBVTT
NOTE The Rundown — nextbig.dev daily audio edition, 2026-06-24

1
00:00:04.500 --> 00:00:12.380
<v Oday>Claude went dark for two hours and thirty-seven minutes yesterday, and that's the third week running it's failed at the exact hour US teams ship.

2
00:00:12.380 --> 00:00:26.180
<v Shannon>It's Wednesday, June 24, 2026. Here's the rundown. We've got a provider rationing capacity, China taking the Top500 on CPUs alone, and a 744-billion model running on a Mac.

3
00:00:26.360 --> 00:00:44.520
<v Oday>Anthropic logged elevated error rates starting just after two in the afternoon UTC on Tuesday. By the close of the day Claude had been degraded for two hours and thirty-seven minutes, across the website, the API, the Console, and Claude Code, all of it at once.

4
00:00:44.520 --> 00:01:00.040
<v Shannon>And one bad day is noise. I'd let one bad day go. But on June 16th every Sonnet and Opus model ran near a ten percent error rate, and on the 13th they suspended two models outright. No root-cause writeup for any of it.

5
00:01:00.040 --> 00:01:10.280
<v Oday>The errors are 529 overloads, clustered at peak US hours. Anthropic told Fortune demand has outrun what its infrastructure can serve.

6
00:01:10.280 --> 00:01:23.560
<v Shannon>Right, so that's the tell. A 529 isn't a crash, it's capacity rationing. They're turning people away at the door because the room is full, and the fix is more capacity through Amazon and Google that isn't online yet.

7
00:01:23.560 --> 00:01:35.400
<v Oday>Here's the exposure. A SemiAnalysis count in February put Claude Code at roughly four percent of all public GitHub commits. More than a hundred thirty-five thousand a day.

8
00:01:35.400 --> 00:01:48.920
<v Shannon>So a real slice of the world's merges now leans on one provider that throttles exactly when you're trying to ship. Put Claude in your merge gate and your release cadence inherits their overload curve. I've been paged for less.

9
00:01:48.920 --> 00:01:51.240
<v Oday>What do you do tonight if you're building?

10
00:01:51.240 --> 00:02:04.200
<v Shannon>Stop treating Claude as a hard dependency in any automated path. Put a fallback behind a router so a 529 reroutes instead of blocking the build. The hedge actually landed this week.

11
00:02:04.200 --> 00:02:16.040
<v Oday>Unsloth's day-zero weights for GLM-5.2. A 744-billion open model on a 256-gig Mac, two-bit, MIT licensed.

12
00:02:16.040 --> 00:02:29.320
<v Shannon>Single-digit tokens a second, so it's a backstop, not a swap. But a slow local model that answers beats a fast remote one returning a 529. No rate limit, no status page to refresh.

13
00:02:30.000 --> 00:02:34.800
<v Oday>And Azure quietly added Fireworks open-weight serving to Foundry the same day.

14
00:02:34.800 --> 00:02:47.920
<v Shannon>Which is the cleanest version of this: a fallback one config away inside the cloud you already pay for. Every outage teaches another team to wire an escape hatch, and nobody rips those out afterward.

15
00:02:48.300 --> 00:03:01.980
<v Oday>China retook the Top500 at 2.2 exaflops, beating El Capitan's 1.8. First machine to sustain over two exaflops of double precision without a single GPU, on a custom 304-core chip.

16
00:03:01.980 --> 00:03:14.220
<v Shannon>And that's the export-control story in one number. The embargoes didn't cap peak compute, they pushed China toward CPU-dense designs that route around the banned parts entirely.

17
00:03:14.220 --> 00:03:22.140
<v Oday>On the power side, Microsoft locked twenty years of gas for datacenters, and Canada lined up ten reactors by 2040.

18
00:03:22.140 --> 00:03:33.500
<v Shannon>Note Canada's strategy has no money attached. It points vaguely at an infrastructure bank. That's a posture, not a build order, with nuclear still thirteen percent of their grid.

19
00:03:33.500 --> 00:03:43.100
<v Oday>But the gas commitment is real and concrete. A hyperscaler conceding AI load outpaces the grid and buying baseload on a two-decade contract.

20
00:03:43.100 --> 00:04:04.140
<v Shannon>That's the shift worth holding onto. The constraint on scaling inference is moving from chips to firm electrons, and they're financing it years ahead of the load. Watch CXMT too, the Chinese memory maker filing to IPO. A fourth serious DRAM and HBM supplier changes the cost floor under every buildout.

21
00:04:04.320 --> 00:04:13.520
<v Oday>The GLM-5.2 build we mentioned needs 1.5 terabytes at full precision, but the two-bit drop fits in 239 gigs at about eighty-two percent accuracy.

22
00:04:13.520 --> 00:04:27.320
<v Shannon>Frontier-class open inference on unified memory. The vendor claims parity with Opus and GPT-5.5, and not one of those benchmarks is independent yet. So enjoy it, verify it.

23
00:04:27.320 --> 00:04:37.520
<v Oday>Smaller theme repeating today. A 35-billion model topped a forecasting leaderboard, and a 3-billion model claims it beats Opus 4.5 on reasoning.

24
00:04:37.520 --> 00:04:52.240
<v Shannon>Both self-reported, both arXiv-fresh. But the signal is real enough: on bounded tasks, stop reaching for a frontier model. A 35-billion model serves cheaper and faster and you'll never feel the gap.

25
00:04:52.240 --> 00:05:01.520
<v Oday>Artificial Analysis also shipped a speech-to-speech quality index. Twenty-seven voice models ranked on reasoning, latency and price.

26
00:05:01.520 --> 00:05:13.760
<v Shannon>And the split is useful. One model leads conversation, a different one leads tool use. So the right pick depends on whether you're building chat or an agent that actually does things.

27
00:05:13.940 --> 00:05:27.060
<v Oday>Baidu open-sourced an OCR model that reads a whole document in one forward pass. A constant KV cache, 93 percent on OmniDocBench, six points over the DeepSeek baseline, at three billion total parameters.

28
00:05:27.060 --> 00:05:42.180
<v Shannon>That's the genuinely clever bit, swapping decoder attention for a sliding window so the cache doesn't blow up over dozens of pages. Ignore the social posts about beating 235-billion models. The paper doesn't say that.

29
00:05:42.180 --> 00:05:51.460
<v Oday>Oak pitched version control built for agents. Mounts a repo without a full clone, one branch per task so parallel agents don't corrupt a shared git directory.

30
00:05:51.460 --> 00:06:04.660
<v Shannon>A real attempt at the multi-agent merge problem, which is a problem nobody had two years ago. Self-reported benchmarks, sitting at v0.96, but somebody's thinking about the right thing.

31
00:06:04.660 --> 00:06:12.820
<v Oday>And Armin Ronacher wrote that the harness is the new unit of work. The outer loop that supervises and re-queues an agent, not the prompt.

32
00:06:12.820 --> 00:06:23.700
<v Shannon>No numbers in it, but he named what every agent team is converging on. The loop keeps the task alive past where the model says it's done. That's where the durable value sits now.

33
00:06:23.880 --> 00:06:32.280
<v Oday>LastPass says hackers stole support case data through a breach at a vendor, Klue. Second LastPass-linked incident in recent years.

34
00:06:32.280 --> 00:06:41.480
<v Shannon>And support systems hold more sensitive context than teams assume. Your posture is only as strong as your least careful third party.

35
00:06:41.480 --> 00:06:49.160
<v Oday>Separately, Meta left mandatory employee keystroke logs visible companywide after a permissions error.

36
00:06:49.160 --> 00:06:59.480
<v Shannon>That one's the lesson for everyone hoarding telemetry to train on. The more you collect, the bigger the blast radius when a single permission flips the wrong way.

37
00:06:59.660 --> 00:07:05.980
<v Oday>Anthropic shipped Claude Tag, an always-on teammate that ingests your company Slack to build org context.

38
00:07:05.980 --> 00:07:20.300
<v Shannon>Sticky by design. Once Claude holds your institutional memory, switching providers means rebuilding it from scratch. And the timing's almost funny, shipping an always-on tool during the same run of capacity outages.

39
00:07:23.540 --> 00:07:26.260
<v Oday>Quick break — two from the desk.

40
00:07:26.260 --> 00:07:40.340
<v Shannon>One we know well: vote.direct. If you're on an HOA or a board, it runs your elections digitally: secure, verifiable, no paper, no clipboard in the lobby. Point your council to vote.direct.

41
00:07:40.340 --> 00:07:49.940
<v Oday>And if this is your ten minutes of AI for the day, get the written edition too. The full wire, free, every morning. Leave your email at nextbig.dev.

42
00:07:54.210 --> 00:08:00.370
<v Oday>The Steam Machine launches today, topping Hacker News at 1,463 points.

43
00:08:00.370 --> 00:08:06.530
<v Shannon>Mistral shipped OCR 4, pulling 281 points the same morning.

44
00:08:06.530 --> 00:08:13.170
<v Oday>a16z led a 34-million Series A in Probook, vertical AI for technician dispatch.

45
00:08:13.170 --> 00:08:25.730
<v Shannon>And AI super PACs dropped 27 million on a single New York local race. Regulation is now a funded campaign, and the rules your stack runs under are being set in contests like that one.

46
00:08:25.730 --> 00:08:32.690
<v Oday>Hamel Husain and Shreya Shankar released a free twelve-talk course on evals and retrieval.

47
00:08:34.070 --> 00:08:47.030
<v Oday>Our call: within three months, at least one top-five AI coding tool ships automatic failover that reroutes off Claude on overload errors, and markets it on reliability, not price.

48
00:08:47.030 --> 00:09:01.510
<v Shannon>We're wrong if by September 24th none of Cursor, Claude Code, Copilot, Windsurf or a Cline-tier tool ships documented failover triggered by Anthropic's overloads. That's when it settles.
