S1E18 · June 11, 2026 · 39:23

AI Didn't Invent These Problems

Tim Williams (host), Paul Mason (host)


Show Notes

Tim and Paul break down Anthropic's Fable 5 pricing disconnect, the dav1d assembly decoder that outraced higher-level implementations, and why Agile's 2001 playbook stumbles when agents build apps in hours. They critique the hype around autonomous agent loops, highlighting the real constraints (budgets, tests, and decision quality) that determine whether AI accelerates value or just incinerates tokens. It's a tight 39 minutes on the shifting boundaries of craft, process, and the problems AI reveals but can't solve on its own.

Transcript

Tim Williams: It's Rubber Duck Radio Episode 18, I'm Tim Williams and with me as always is Paul Mason. Paul Mason: Hey Tim, let's jump into it! Tim Williams: So Anthropic dropped Fable 5 this week. [pause] And I'm gonna say something that might sound ungrateful — [inhale] but I genuinely don't know who this model is for. Paul Mason: [chuckle] Yeah, I was wondering how long it'd take you to get there. Tim Williams: I mean — [exhale] here's the thing. It's a Mythos-class model. It's powerful. The benchmarks are real. [short pause] But it's priced at ten dollars per million input tokens and fifty dollars per million output. [pause] That's [emphasis] double Opus 4.8. Double. Paul Mason: Triple Sonnet 4.6, actually. Tim Williams: [pause] Right. So I'm sitting here, looking at my workflow, asking — [inhale] where do I slot this in? What task am I doing where I go, you know what, [emphasis] this is the moment Fable 5 earns its keep? And I keep coming up empty. Paul Mason: Same here, man. And I've actually been using it. Like, [chuckle] extensively. Burned through my Max plan credits way faster than I expected. Tim Williams: Okay so that's actually what I want to get into — [pause] because you've been hands-on with it. What's the real experience been? Paul Mason: So here's my honest take. [short pause] It doesn't feel like a step change. [pause] I wanna be fair — it's good. It's really good. But if you didn't tell me this was a Mythos-class model, [tsk] I don't think I'd know. It just feels like another Opus launch. Tim Williams: [surprised] Really? Even with the whole Mythos lineage? Paul Mason: Yeah. And look — [inhale] I don't have evidence behind this yet. I'm not doing a formal benchmark audit. But day to day, [short pause] I'm not hitting a moment where I go, [emphasis] Opus could never have done that. It's more like — [pause] this is slightly cleaner. Slightly faster. Slightly more surgical with the diffs. 
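The pricing figures quoted above can be turned into a quick back-of-the-envelope check. This is a minimal sketch, not official pricing data: the Fable 5 rates are the ones stated in the episode, the Opus 4.8 rates are an assumption derived by reading "double" literally, and the token counts for the sample task are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the USD cost of one request with the given token counts."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Rates as quoted in the episode.
FABLE_5 = Pricing(input_per_million=10.0, output_per_million=50.0)
# Assumption: "double Opus 4.8" taken to mean exactly half of each Fable 5 rate.
OPUS_4_8 = Pricing(input_per_million=5.0, output_per_million=25.0)

# Hypothetical task: 40k input tokens and 8k output tokens on the cheaper model.
opus_cost = OPUS_4_8.cost(40_000, 8_000)

# Same task on the pricier model if it needs 30% fewer tokens (hypothetical).
fable_cost_30_pct_fewer = FABLE_5.cost(28_000, 5_600)

# Same task if the pricier model needs only half the tokens (hypothetical).
fable_cost_half = FABLE_5.cost(20_000, 4_000)

if __name__ == "__main__":
    print(f"Opus 4.8, full token count:    ${opus_cost:.2f}")
    print(f"Fable 5, 30% fewer tokens:     ${fable_cost_30_pct_fewer:.2f}")
    print(f"Fable 5, half the tokens:      ${fable_cost_half:.2f}")
```

Under these assumptions, a model priced at double per token only breaks even once it cuts token usage in half; a 30% reduction still costs more per task.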
Tim Williams: And for double the price, [emphasis] slightly doesn't cut it. Paul Mason: [chuckle] That's exactly my point. [inhale] The per-token cost makes it this weird thing where — [short pause] even if it's better, you're constantly doing mental math. Like, was that prompt worth fifty cents? Was that response? Tim Williams: And this is what bugs me. Not that the model exists — [inhale] I'm glad they're pushing the frontier. But the pricing tells me this is a production model. They want you building on it. [pause] And yet I can't figure out the use case where the economics make sense. Paul Mason: [uncertain] You know what I keep wondering — [pause] what capability does this actually unlock for a developer who already knows what they want? Tim Williams: [pause] Say more about that. Paul Mason: So, if I'm building something — if I have a clear vision, a clear architecture in my head — [short pause] Opus 4.8 already gets me there. Like, it writes good code. It understands my intent. [tsk] Fable 5 might do it in fewer turns, sure, but [emphasis] I'm not blocked on Opus. I'm not sitting there thinking, if only my model were smarter. Tim Williams: That's a [emphasis] huge point. [inhale] Because the narrative on Twitter right now is — [pause] this is the one. Mythos for the masses. The model that changes everything. And you're saying, in practice, it's just... another Opus. Paul Mason: [chuckle] I mean, I wouldn't go that far. It's better in specific ways — [short pause] the diffs really are more targeted. It hallucinates less on long context. There were a couple moments where it caught things Opus missed. [pause] But — [exhale] yeah, the gap between the Twitter hype and the daily experience is [emphasis] real. Tim Williams: [inhale] And this is where I get frustrated with the whole pricing model. Because if the value prop is [emphasis] fewer turns, fewer tokens — [short pause] great. But you're charging double per token, so you're eating your own efficiency gains. 
The math doesn't work unless the task genuinely [emphasis] requires Fable-level reasoning. Paul Mason: Yeah. And I haven't found that task yet. [pause] I've thrown some gnarly stuff at it — multi-file refactors, cross-cutting concerns, stuff where Opus gets confused — [short pause] and Fable handles it. But so does Opus, nine times out of ten. Just a little slower. Tim Williams: [pause] So let's talk about the guardrails for a second. Because that's the other thing lighting up Reddit and Hacker News. Paul Mason: [tsk] Ugh. Yeah. [inhale] So the thing is — Fable 5 apparently has these safety measures inherited from Mythos. And in practice — [short pause] it's refusing things that are completely mundane. Tim Williams: I saw someone on Reddit saying it blocked a request to refactor existing code. [pause] Just — refactor. Existing code. Paul Mason: [chuckle] Yeah, and someone else got flagged for asking about Hermitian matrices. [short pause] Hermitian matrices. A concept in linear algebra. The model switched them to Opus 4.8 for [emphasis] security reasons. Tim Williams: [laughing] I'm sorry — [pause] a math concept got flagged for security? [exhale] That's not a guardrail. That's a bug with a press release. Paul Mason: [laughing] Right? [inhale] And look, I get why Anthropic's being careful. Mythos genuinely spooked people. The government was involved. But — [pause] if you're gonna release something to the public and charge double, [emphasis] it has to work for the public. Tim Williams: [pause] So here's the question I keep coming back to. [inhale] Is Fable 5 a product — or is it a demo of what Mythos can do, with training wheels bolted on? Paul Mason: [short pause] That's... actually a really good way to frame it. [inhale] Because if it's a demo, the pricing makes sense. You're not supposed to build on it. You're supposed to try it, be amazed, and — [pause] wait for the real thing. Tim Williams: But the API is open. They're charging production prices. 
They're telling enterprises to build on it. [exhale] So which is it? Paul Mason: [doubtful] I don't think they know yet. [short pause] I think Anthropic is in this weird position where — they have this genuinely powerful technology, but the safety constraints and the compute costs are eating them alive. [inhale] So they price it high, gate it behind guardrails, and hope the market tells them what to do next. Tim Williams: [pause] You know, Paul — [inhale] there's actually something bigger here that I've been thinking about. And it's not just about Fable 5's pricing. Paul Mason: [chuckle] Uh oh. Here we go. Tim Williams: [laughing] No, no — hear me out. [inhale] The hardest problems in agentic AI right now — [pause] they have an almost [emphasis] identical shape to the hardest problems we had in software engineering before AI showed up. Paul Mason: [short pause] Yeah. [inhale] Focus, context management, complexity management — Tim Williams: [interrupting] Exactly. [emphasis] Exactly. AI didn't invent these problems. It just — [exhale] — shined a flashlight on them. Made them impossible to ignore. Paul Mason: And here's the irony. [tsk] We're responding to this by adding — [short pause] another layer. The AI layer. The agent layer. On top of everything else. Tim Williams: [sigh] Yeah. We looked at a complexity problem, and our solution was — [pause] add another meta layer. Which is — [chuckle] — itself a new form of complexity. Paul Mason: Totally. And look, there's a real benefit to being able to work at that higher level. No question. [inhale] But we've seen this movie before. Every generation of software engineering adds a new layer of abstraction and says — [emphasis] this time — the lower layers are dead. Tim Williams: Right. And they're [emphasis] never dead. [pause] C didn't kill assembly. Java didn't kill C. Python didn't kill Java. 
What actually happened is — new people entered the industry who work at higher layers, [emphasis] relying on people who still work at the lower ones. Paul Mason: [chuckle] You know what this reminds me of? [inhale] There's this project called dav1d — [short pause] that's D-A-V-1-D, like "David" but with a one in the middle, it's a play on AV1. The VideoLAN team — same folks behind VLC and a lot of FFmpeg — built this AV1 decoder almost entirely in hand-written assembly. Tim Williams: [surprised] Wait — [pause] hand-written assembly? For a [emphasis] modern video codec? Paul Mason: [excited] And they built dav1d — hand-tuned AVX2, AVX-512, NEON on ARM — and it was [emphasis] two to four times faster than libaom. [inhale] Netflix actually funded part of the NEON work for 10-bit decoding on mobile. And it ended up [emphasis] everywhere — VLC, Firefox, Chrome, Android, Windows, FFmpeg itself. The reference implementation lost. Assembly won. Tim Williams: [pause] So the team that built the reference decoder in C — [emphasis] the official one, from the standards body — got absolutely beaten by a handful of people writing assembly by hand. [inhale] That's not a minor optimization. That's proof the lower layers aren't dead. They're where the [emphasis] real performance lives. Paul Mason: And here's where AI fits into this — [pause] awkwardly. [tsk] AI is [emphasis] bad at writing low-level code. Like, genuinely bad. Tim Williams: [sigh] Yeah. And the reason's probably straightforward — there's just not enough assembly in the training data. Not enough low-level, hand-optimized, craft-level code for these models to learn from. Paul Mason: Right. So the thing we're adding on top — this AI meta-layer — [pause] it can [emphasis] only help you at the layers where it has training data. Which is overwhelmingly — high-level, popular-language, abundant-code territory. [inhale] It doesn't touch the bottom of the stack. Tim Williams: [pause] So here's the uncomfortable conclusion.
[inhale] AI has made us [emphasis] more productive at high-level work. But it's also revealed — in stark, undeniable terms — that the stack still depends on people who can go low. [short pause] And AI can't follow them there. Paul Mason: Yeah. [pause] And if you're a developer who's only ever worked at the top of the stack — [inhale] you might not even [emphasis] see the lower layers until they break. And by then, AI won't save you. Tim Williams: [pause] So all of that — the low-level code, the assembly that still beats everything, the layers that never actually die — [inhale] it makes me think about something adjacent. [pause] We've been talking about how [emphasis] tools have changed. But the [emphasis] process — how we organize work, how we plan, how we track — is basically frozen in time. Paul Mason: [chuckle] Oh, you mean the two-week sprint that was designed when the hardest part of software was coordinating humans typing code? Tim Williams: [laughing] Yes. [pause] Exactly that. So there's this guy — Steve Jones, executive VP at Capgemini — he published a piece that lit a fire under the industry. Title: [emphasis] "AI Killed the Agile Manifesto." Paul Mason: Yeah, I read that. [tsk] His core argument is pretty hard to argue with — he's building working applications in [emphasis] hours. Migrating entire apps on single flights. And Agile's third principle is literally "deliver working software frequently, from a couple of weeks to a couple of months." [pause] That timescale is [emphasis] absurd now. Tim Williams: [pause] But here's where it gets interesting. The reaction wasn't unanimous. Sonya Siderova — she's the CEO of a company called Nave — she said something I think is [emphasis] more useful than Jones's headline. She said: [pause] "Agile isn't dead. It's optimizing a constraint that moved." Paul Mason: [pause] That's actually — [short pause] that's a really clean way to put it. The constraint [emphasis] was "how do humans collaborate to build things." 
Standups, retros, story points — all built around that. [inhale] But when agents can build in minutes, the constraint moves to [emphasis] "how do humans decide what to build and validate it actually works." Tim Williams: Right. And if the constraint moved — [pause] then the unit of work is wrong. Stories as we write them, sprints as we plan them — they become [emphasis] liabilities. A story that was sized for a human working for three days gets [emphasis] obliterated by an agent in twenty minutes. And then what do you do? Paul Mason: [chuckle] You sit there in standup saying "uh, yesterday I reviewed a lot of AI-generated code, today I'll review a lot more." [pause] Which, by the way, Jones makes another point that I think is [emphasis] genuinely important. He says documentation matters [emphasis] more in an agentic workflow, not less. Tim Williams: [pause] Yes. This is the part I keep coming back to. [inhale] The Agile Manifesto says "working software over comprehensive documentation." That made sense when creating working software was [emphasis] expensive — when it was the hard part. But Jones's point is — AI makes working software [emphasis] trivially easy to produce. It can generate code that [emphasis] looks like it works. The hard part is knowing whether it [emphasis] actually works — whether it's maintainable, whether it's correct at the edges, whether it has technical debt. [short pause] Documentation becomes the thing you hold the code [emphasis] against. Paul Mason: Yeah — and Jones goes even further on the manifesto itself. The first value? "Individuals and interactions over processes and tools." He says that's [emphasis] exactly backwards for agentic workflows. [tsk] The tools [emphasis] absolutely matter now. Whether you're on Replit versus Claude Code versus Copilot — that choice determines [emphasis] everything about how your project works. Tim Williams: [pause] That's uncomfortable. [short pause] But he's right. 
I mean — [exhale] if you're building an enterprise app with Claude Code and a team of agents, you [emphasis] cannot be tool-agnostic. The process [emphasis] is the tool now. Paul Mason: So here's where the industry's trying to catch up. [inhale] Microsoft published something called Agentic-Agile — Daniel Epstein over there. The idea is — [short pause] stop starting with prompts. Start with [emphasis] specs. Every capability is an issue. Every issue has acceptance criteria. The agent doesn't get a vague prompt — it gets a [emphasis] contract. Tim Williams: [pause] Yeah, and I think that's the right direction. Derek Ashmore over at CIO made a related point — [inhale] in agentic engineering, [emphasis] everyone on the team becomes a specifier. The role shifts from "I write the code" to "I define what the code should do." Stories get bigger. Concurrency management becomes about making sure agents don't step on each other. Paul Mason: And testing — [pause] end-to-end testing becomes [emphasis] way more critical. Because agents are faster but they have a higher likelihood of introducing problems that only show up at the system level. They don't have the same contextual understanding a human developer builds up over months on a codebase. Tim Williams: [pause] So this brings me to Kent Beck. [inhale] One of the [emphasis] original Agile Manifesto signatories. And he's been writing about what he calls "augmented coding" versus "vibe coding." [short pause] Vibe coding is — you feed errors back to the AI, hope for fixes, don't really care about code quality. Augmented coding is — [emphasis] you keep engineering rigor. Clean code, comprehensive testing, careful design. The AI handles the typing. You handle the [emphasis] judgment. Paul Mason: [pause] And here's the quote that stuck with me. Beck said — [short pause] "90% of my skills are now worth zero dollars. [pause] But the other 10% are worth a thousand times more." [inhale] That 10% is judgment. Architecture. Taste. 
The stuff you can't prompt your way into. Tim Williams: [pause] That's [emphasis] the thing. [inhale] The same way we discovered that higher-level abstractions didn't kill assembly — they just added more people at the top while the bottom was still there — [pause] AI is not going to kill process. It's going to reveal which parts of process were [emphasis] real and which parts were just coordinating slow human typing. Paul Mason: Yeah. And the parts that were real — [short pause] spec-writing, acceptance criteria, governance, architecture — those get [emphasis] more important, not less. [tsk] Epstein's team learned this the hard way. They deferred CI/CD to later stories. By the time they added it, they'd already built on assumptions that had never been validated. Had to reopen and redevelop features. Tim Williams: [sigh] That's the pattern, isn't it? [pause] Governance from day one — not bolted on after the mess is made. [inhale] And here's a stat that haunts me. Forrester's State of Agile report — 95% of professionals say Agile is critical to their operations. [short pause] But only 7% achieve full proficiency. Paul Mason: [chuckle] So we had a 7% success rate [emphasis] before agents. [pause] And now we're adding AI into the mix — moving faster, generating more code, with less human review time. [inhale] That gap between "we think this matters" and "we're actually good at it" — [short pause] it's about to get a lot wider. Tim Williams: [pause] So the question isn't "is Agile dead." [short pause] The question is — [inhale] are we willing to rebuild our process for a world where the building happens in minutes and the [emphasis] deciding is the only thing that can't be automated? [pause] Because right now, the process is the bottleneck — and nobody's sprint planning their way out of that. Paul Mason: [short pause] Speaking of process that's broken — [pause] did you see Steinberger's tweet this week? Tim Williams: [sigh] Oh, I saw it. 
[pause] "Here's your monthly reminder that you shouldn't be prompting coding agents anymore. You should be designing loops that prompt your agents." [tsk] Six and a half million views. Paul Mason: Yeah, that one. And look — [short pause] I want to be fair here. Steinberger is not a nobody. OpenClaw hit a hundred eighty thousand GitHub stars in three months. He's now at OpenAI. He runs a hundred Codex instances maintaining open source repos. [pause] The guy has [emphasis] earned an opinion. Tim Williams: He has. And I want to take his argument seriously, because there's [emphasis] something in it. [inhale] The idea that we should stop being the thing inside the loop, hand-typing prompts — and instead design the system that prompts the agent — [short pause] that's directionally [emphasis] correct. Boris Cherny at Anthropic is saying the same thing. Karpathy's saying it. Paul Mason: Right. But — [short pause] here's where I think the tweet goes off the rails. [inhale] The agent [emphasis] is already a loop. ReAct — reason, act, observe, repeat — that pattern has been the standard since 2022. Every modern agent framework is built on it. Claude Code. Codex. LangGraph. [pause] It's all loops. Tim Williams: [pause] And this is the irony that I cannot get past. [inhale] Steinberger's own project — OpenClaw — has proven that AI agents will [emphasis] happily churn tokens even when they're not making improvements. [short pause] He posted the receipts himself. One point three million dollars in API tokens in a single month. Six hundred three billion tokens across seven point six million requests. Paul Mason: [laughs] Twenty thousand dollars a day on the peak. [exhale] And he called ralph loops — the simplest possible while-loop around a Claude call — [emphasis] "the ultimate token burn machine." [pause] Which is [short pause] exactly the problem. Tim Williams: [emphasis] Exactly. 
[inhale] So here's the question — [pause] if you don't have very well defined, measurable goals for your agents, what is the loop [emphasis] doing? It's not converging. It's not iterating toward anything. It's just [short pause] spinning. Burning tokens. The agent agreeing with itself on repeat. Paul Mason: [tsk] And the best reply in the whole thread came from a guy named Mosyaseen. He said — [short pause] "designing the loop is half of it. The other half is putting something in the loop that can say no: a test, a type check, a real error." [pause] A loop with nothing to push back is the agent [emphasis] agreeing with itself on repeat. Tim Williams: [pause] So let's actually be constructive here. Because there [emphasis] is a real pattern that works. [inhale] If you have well defined, measurable goals — the proper use case is [emphasis] agent orchestration. An orchestrating agent spins up sub-agents to complete individual tasks. Each sub-agent has a specific, bounded job. The orchestrator tracks progress against the goal. Paul Mason: Yeah. And as long as the orchestrator is running inside a harness that does auto-compaction — [short pause] compressing the context window so it doesn't blow up over time — it can keep going until the goal is actually complete. [pause] All the major harnesses have this now. Claude Code does it. Codex does it. It's table stakes. Tim Williams: And there's also a use case for [emphasis] scheduling agents to spin up and work on something at future intervals. [short pause] Cron plus a decision-maker in the body. That's real. That ships. But — [pause] that is not a loop. That's a scheduled job with an LLM call in the middle. [tsk] Different thing entirely. Paul Mason: [chuckle] Cronjobs have funny rebranding right now. [pause] That was actually the best skeptic line in the whole discourse. Somebody said — [short pause] "cronjobs have funny re-branding rn." [exhale] Half right. But the real pushback is simpler. 
Paul Mason: [pause] Steinberger's advice feels [emphasis] completely out of touch with the reality of software development. [short pause] Most teams don't have unlimited OpenAI tokens on their employer's tab. Most teams aren't maintaining thirty open-source repos with a hundred agents. [inhale] Most teams are trying to ship features and not [emphasis] burn seven figures in API costs. Tim Williams: [pause] And I think that's where I land. [inhale] Maybe Steinberger is talking about a use case that makes perfect sense for what [emphasis] he's building — [short pause] autonomous open source maintenance at scale, with budget literally not a constraint. That's a real problem. That's a real solution. Tim Williams: But as [emphasis] general advice for software developers? For vibe coders? For teams trying to ship products? [pause] Telling them to stop prompting and "design loops" without first telling them — [inhale] your loop needs a goal, your loop needs a test that says no, your loop needs a budget cap, and your agent is [emphasis] already looping — [short pause] that's not advice. That's a subtweet from an alternate universe where tokens are free. Paul Mason: [chuckle] The Reddit crowd had the sharpest version of this. [pause] "Monthly reminder that you need to orchestrate multiple agents and loops so they can blow your money generating bugs without you behind the wheel." [exhale] It's mean. But it's not [emphasis] wrong. Tim Williams: [laughs] It's mean, but it's not wrong — [pause] that might be the subtitle for this entire episode. [exhale] Here's the moral of this one. [inhale] Loops are not the product. The product is [emphasis] what you put inside the loop — the goals, the tests, the constraints, the budget. [pause] And if you skip that part, you're not doing loop engineering. You're doing [short pause] token cremation. Tim Williams: [pause] So here's the thread running through this whole episode. 
[inhale] Fable 5 — double the price for a model that doesn't feel like a step change. The assembly codec that beat every higher-level competitor. Agile processes frozen in 2001 while AI writes code in seconds. And a guy with a hundred agents telling the rest of us to stop prompting and design loops. [pause] What connects all of this? [short pause] AI isn't solving the hard problems. It's [emphasis] revealing them. Paul Mason: Yeah. [pause] Every segment was about a constraint that either moved or got ignored. [tsk] Fable 5 — Anthropic ignored the budget constraint for most teams. The dav1d story — the constraint was raw performance, and hand-tuned assembly still wins. Agile — the constraint moved from coordination speed to decision quality, and most teams haven't noticed. [chuckle] And Steinberger — he ignored the [emphasis] goal constraint entirely. Loops without measurable goals aren't engineering. They're gambling. Tim Williams: [pause] That's exactly it. The teams that win in this next phase aren't the ones with the most agents or the fanciest loops. [inhale] They're the ones asking — [emphasis] what's the constraint right now? Is it budget? Is it latency? Is it decision quality? Is it craft? [short pause] AI accelerates everything. Including your failure to define what you actually want. [pause] The flashlight is on. What you see depends on where you point it. Tim Williams: [pause] That's our show for this week. Thanks to Paul Mason for riding shotgun through pricing math, assembly language victory laps, Agile existential crises, and token cremation. [chuckle] Find us and subscribe wherever you get your podcasts, and if you've got thoughts on Fable 5 — or if you're one of the seven percent actually doing Agile well with agents — we want to hear from you. [pause] I'm Tim Williams. [short pause] Here's looking at you.
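The loop requirements raised in the episode (a measurable goal, something in the loop that can say no, and a budget cap) can be sketched in a few lines. This is an illustrative sketch only, not how OpenClaw, Claude Code, or Codex implement their loops: `run_bounded_loop`, `propose`, and `verify` are hypothetical names, and the agent here is a stub rather than a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopResult:
    succeeded: bool
    iterations: int
    tokens_spent: int
    reason: str


def run_bounded_loop(
    propose: Callable[[str], tuple[str, int]],  # hypothetical agent call: feedback -> (candidate, tokens used)
    verify: Callable[[str], tuple[bool, str]],  # the check that can say no: candidate -> (ok, feedback)
    token_budget: int,
    max_iterations: int,
) -> LoopResult:
    """Iterate until the verifier accepts, the budget is spent, or the iteration cap is hit."""
    tokens_spent = 0
    feedback = ""
    for iteration in range(1, max_iterations + 1):
        candidate, tokens_used = propose(feedback)
        tokens_spent += tokens_used
        ok, feedback = verify(candidate)
        if ok:
            return LoopResult(True, iteration, tokens_spent, "goal met")
        # Checked after each call, so the final call may overshoot the cap slightly.
        if tokens_spent >= token_budget:
            return LoopResult(False, iteration, tokens_spent, "budget exhausted")
    return LoopResult(False, max_iterations, tokens_spent, "iteration limit reached")


def make_stub_agent() -> Callable[[str], tuple[str, int]]:
    """Stand-in for a model call: returns numbered attempts at 1,000 tokens each."""
    attempts = 0

    def propose(feedback: str) -> tuple[str, int]:
        nonlocal attempts
        attempts += 1
        return f"attempt-{attempts}", 1_000

    return propose


def passes_on_third(candidate: str) -> tuple[bool, str]:
    """Stand-in for a test suite or type check: only the third attempt passes."""
    if candidate == "attempt-3":
        return True, ""
    return False, f"{candidate} failed the check"


if __name__ == "__main__":
    print(run_bounded_loop(make_stub_agent(), passes_on_third,
                           token_budget=10_000, max_iterations=10))
    print(run_bounded_loop(make_stub_agent(), passes_on_third,
                           token_budget=2_000, max_iterations=10))
```

With a generous budget the stub converges on the third attempt; with a 2,000-token cap the loop stops after two attempts and reports the budget as the reason, instead of spinning.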
