WEBVTT
NOTE The Rundown — nextbig.dev daily audio edition, 2026-07-04

1
00:00:04.500 --> 00:00:10.420
<v Oday>It's the Fourth of July, and the most on-theme story on the wire is a benchmark.

2
00:00:10.420 --> 00:00:23.220
<v Shannon>Independence Day, a half-size wire. Here's the rundown: how a cost-per-token number on AMD closes out the week, and what compute independence actually means for what you build.

3
00:00:23.480 --> 00:00:42.440
<v Oday>A post going around this weekend runs GLM 5.2, the open Chinese model that anchored the whole week, on AMD hardware. It measures the one unit that decides where inference lives: performance per dollar. And that inference keeps getting cheaper.

4
00:00:42.440 --> 00:00:52.840
<v Shannon>The holiday is about independence. The benchmark is about a narrower kind of it: freedom from any single company's chips, and any single company's model.

5
00:00:52.840 --> 00:01:22.280
<v Oday>Read it as the receipt for the week. Nine days ago an open model caught Claude. Since then the price of capable AI fell from every side. Sonnet 5 cut it in software. Etched raised five billion to cut it in silicon. Kimi and GLM walked into the tools developers use. A movement formed around running the models yourself.

6
00:01:22.280 --> 00:01:32.680
<v Shannon>The AMD result closes the loop. Frontier-class inference now runs cheaply on hardware that isn't Nvidia's, which was the piece the whole thesis was missing.

7
00:01:32.680 --> 00:01:48.280
<v Oday>Independence is the honest word, and it's worth saying which kind. Not independence from AI. The week was the opposite: AI getting common enough to run anywhere. This is independence from lock-in.

8
00:01:48.280 --> 00:02:02.280
<v Shannon>And it isn't free. Self-hosting trades a monthly invoice for the work of running the thing yourself. What you buy with it is the standing ability to move between models and between chips without asking anyone's permission.

9
00:02:03.180 --> 00:02:21.420
<v Oday>And the open frontier is multipolar now. China's GLM and Kimi. Europe's Mistral shipped Leanstral 1.5 this weekend. The US labs' own cheap tiers. No single vendor sits astride all of it.

10
00:02:21.420 --> 00:02:35.580
<v Shannon>For anyone building, take this as the week's real instruction. Assume any model or chip you depend on can be throttled, repriced, restricted, or discontinued, because a version of each happened this month.

11
00:02:35.580 --> 00:02:49.100
<v Oday>Design for exit. Put a portability layer between your code and any one vendor. Keep a second model configured. Know what it takes to move your inference onto hardware you own.

12
00:02:49.100 --> 00:02:55.660
<v Shannon>This weekend's AMD benchmark is the proof that the last step on that list costs less than it did on Monday.

13
00:02:56.720 --> 00:03:14.480
<v Oday>To the tape. We moved AMD to a long on this, up from yesterday's watch. Open models running at a competitive cost per token on AMD is the independence trade, and AMD is the one hardware name that gains as inference decouples from Nvidia.

14
00:03:14.480 --> 00:03:30.160
<v Shannon>We're watching Nvidia, low conviction. Training is theirs to lose and they won't lose it soon, but inference is the exposed flank. And Mistral on watch, keeping Europe in the open-model game so no single government can gate the supply.

15
00:03:30.160 --> 00:03:36.080
<v Oday>The tape is the desk's scorecard, not advice.

16
00:03:39.320 --> 00:03:41.320
<v Oday>Quick break — two from the desk.

17
00:03:41.320 --> 00:03:56.360
<v Shannon>One we know well: vote dot direct. If you're on an HOA or a board, it runs your elections digitally: secure, verifiable, no paper, no clipboard in the lobby. Point your council to vote dot direct.

18
00:03:56.360 --> 00:04:08.280
<v Oday>And if this is your ten minutes of AI for the day, get the written edition too. The full wire, free, every morning. Leave your email at nextbig dot dev.

19
00:04:12.150 --> 00:04:31.030
<v Oday>Our call: within nine months, at least one major cloud or lab publicly reports serving a top open model on AMD at a lower cost per token than the equivalent Nvidia setup, and AMD's inference share visibly rises on the back of it.

20
00:04:31.030 --> 00:04:41.910
<v Shannon>What proves us wrong: if by April fourth next year no major cloud or lab has reported that AMD cost win, and AMD's inference share hasn't moved.

21
00:04:41.910 --> 00:04:55.990
<v Oday>On a holiday about independence, the most useful kind this week is the plain ability to move. Keep a second model wired in. That's the rundown, and that's the week.