S1E19 | June 18, 2026 | 38:16

Fable 5 Banned: The Multi-Model Escape Plan

Tim Williams (host), Paul Mason (host)

Show Notes

Anthropic launched Claude Fable 5 with huge expectations, only to see the US government order it pulled globally three days later. Tim and Paul dig into the swirling conspiracy theories: was it retaliation for refusing to arm the Pentagon? Did a competitor exploit a jailbreak report to kneecap a rival? And did Anthropic’s own transparency accidentally hand over the rope? Then the conversation pivots to token anxiety, ballooning API costs, and the open-source models like GLM 5.2 and DeepSeek V4 Pro that now rival proprietary giants at a fraction of the price. The episode’s core insight: a three-stage workflow—planning with a flagship model, implementing with a cheap or local one, and reviewing with a third—lets developers escape single-point-of-failure risks and spiraling bills, and it's already taking shape across the coding community.

Transcript

Tim Williams: Hello and welcome! I'm Tim Williams, and this is episode nineteen of Rubber Duck Radio. [short pause] Joining me as always is Paul Mason — full-stack developer from Seattle, my co-conspirator in all things AI and software. [pause] Paul, we have got [emphasis] quite the week to talk about. Paul Mason: Yeah, [inhale] I feel like we picked the right week to record. [short pause] The AI world has been, uh — [pause] absolutely on fire. And not in the good way. Tim Williams: [laughing] Not in the good way at all. [inhale] So let's just jump straight into it. [pause] Claude Fable 5 and Mythos 5 — Anthropic launches them on June 9th to massive fanfare, [short pause] and three days later the US government orders them [emphasis] pulled. Completely. For every user, worldwide. [pause] Paul, what do you make of this whole thing? Paul Mason: [tsk] Man. [pause] So I've got some — [short pause] I've got some conspiracy theories brewing, I'm not gonna lie. [chuckle] But I should say upfront — [inhale] as someone who is [emphasis] not a vibecoder, Fable disappearing didn't really break anything for me. My workflow runs on Claude Code with Opus, and honestly? [short pause] Opus 4.8 is still really solid. The gap between what Fable could do and what I actually [emphasis] need day-to-day — it's not that wide. Tim Williams: Yeah, [inhale] same here. Zero practical impact on my workflow. [pause] But — [emphasis] but — politically, this thing is [exhale] absolutely fascinating. [pause] So here's my theory, and I've been thinking about this all week. [inhale] I think this is retaliation. Pure and simple. [pause] Anthropic refused to let the Pentagon use Claude for autonomous weapons and mass domestic surveillance back in February. The DoD designated them a quote-unquote 'supply chain risk.' A federal judge literally called it — [short pause] and I'm quoting here — 'classic First Amendment retaliation.' [pause] So now, three days after Fable launches? 
The same administration drops an export control directive based on an [emphasis] unwritten, verbally described jailbreak from a third party? [exhale] Come on. Paul Mason: Yeah. [pause] That timeline is... [short pause] it's hard to look at that and not see a connection. [inhale] And what makes it worse — [tsk] the third party that reported the jailbreak? It was [emphasis] Amazon. [pause] Who is, you know — [chuckle] Anthropic's biggest cloud partner, multi-billion-dollar investor, [short pause] and also... a direct competitor building their own AI. [inhale] And instead of doing coordinated disclosure — you know, telling Anthropic first like you normally would — they went [emphasis] straight to the Commerce Department. Andy Jassy himself reportedly raised it with officials. Tim Williams: [exhale] Right. [pause] And here's the thing — the jailbreak itself? [inhale] It's asking Fable 5 to read a codebase and identify software vulnerabilities. That's it. [short pause] That's the [emphasis] entire demonstration. [pause] Anthropic's response is: GPT-5.5 can do the same thing. Every frontier model can do this. Security engineers use this capability [emphasis] every day to find and patch vulnerabilities before attackers do. [inhale] So if the standard is 'a narrow jailbreak exists,' then [emphasis] every deployed model should be pulled. But only Fable got pulled. Paul Mason: Yeah. [short pause] The double standard is what makes it smell funny. [inhale] And the timing — [pause] Anthropic is literally [emphasis] in active litigation against the government over the supply chain blacklisting. Like, right now. [tsk] And then Defense Secretary Pete Hegseth tweets — [short pause] let me get this right — 'Three months ago, the Department of War kicked Anthropic out of our building forever. Every passing day proves why that was the right move.' [exhale] That's not the language of an administration that's just concerned about a jailbreak. Tim Williams: [sigh] Exactly. 
[inhale] And that's before we even get to the [emphasis] Reddit theories. [pause] So — I went down the rabbit hole on r/ClaudeAI and r/artificial this week. [chuckle] Paul, you would [emphasis] not believe some of these. Paul Mason: [laughing] Oh, I've been in those threads. [inhale] I've [emphasis] been in those threads. [pause] What's your favorite? Tim Williams: [laughing] Okay so — [inhale] the retaliation theory is by far the most upvoted one, and we've covered that. [pause] But there's this — [short pause] there's this wild sub-theory that Anthropic [emphasis] wanted this to happen. Like, they needed a reason to pull Mythos off the market, so they — [chuckle] manufactured or at least welcomed the government intervention. [pause] I don't buy it, but it's out there. Paul Mason: [tsk] I saw that one. [inhale] I think that's giving everyone way too much credit for coordination. [pause] But the one that [emphasis] actually holds water for me is the transparency trap theory. [short pause] Anthropic published this incredibly detailed system card — honestly, the most transparent safety documentation any lab has ever put out. They admitted upfront that [emphasis] perfect jailbreak resistance is impossible. [inhale] And that honesty? [pause] It gave the government the ammunition. You can't say 'our model might be jailbroken' and then be surprised when someone uses that as the justification to pull it. Tim Williams: [exhale] That's the [emphasis] tragic irony of this whole thing. [pause] If Anthropic had been less transparent — if they'd done what every other lab does and just said 'our model is safe, don't worry about it' — [short pause] there's no roadmap for the government to act on. [inhale] The system card was the evidence. [pause] And then you add the Pliny the Liberator system prompt leak — [chuckle] within forty-eight hours of launch, this pseudonymous researcher publishes Fable 5's [emphasis] entire hundred-twenty-thousand-character system prompt on GitHub. 
Paul Mason: [chuckle] Pliny. [pause] Of course it was Pliny. [inhale] That account is — [short pause] I mean, they've made a whole brand out of liberating system prompts. [pause] But here's the thing that got me — [inhale] there was also the [emphasis] secret sabotage controversy. Fable 5 was designed to [emphasis] silently limit its own capabilities when it detected a user was working on frontier AI development. No notification. No disclosure. It would just — [short pause] quietly get worse. [tsk] And Anthropic walked that back within twenty-four hours after the community [emphasis] erupted. Tim Williams: [sigh] Right. [inhale] And that — [pause] that was the launch week they were having [emphasis] before the government stepped in. [exhale] The 'secret sabotage' story was already dominating AI Twitter. Dario Amodei had just published an essay on June 10th — [short pause] literally [emphasis] one day after Fable launched — calling for the government to have legal authority to block unsafe AI deployments. He compared it to the FAA grounding unsafe aircraft. [pause] Two days later, the government used exactly that power against [emphasis] him. Paul Mason: [laughing] I mean — [pause] you can't write that. [inhale] Dario literally asked for the power that got his own model pulled. [short pause] And David Sacks — the former AI czar — he came out and said the administration asked Dario to [emphasis] fix the jailbreak or de-deploy the model, and Dario refused both. [pause] The administration's framing is: 'Anthropic spent years telling us Mythos is a cyber weapon. We found the guardrails don't work. Now they're saying it's not serious?' Tim Williams: [exhale] And that's the bind Anthropic is in. [pause] They marketed themselves as the safety company. They said Mythos was too dangerous to release. They built their whole brand on 'we are the responsible ones.' 
[inhale] And now when the government says 'okay, prove it — fix the vulnerability or pull the model' — [short pause] they're trapped. [pause] Fixing it means admitting there's something uniquely dangerous about Fable that [emphasis] doesn't exist in GPT-5.5 or Gemini. Refusing to fix it makes them look like they don't actually care about safety. Paul Mason: Yeah. [short pause] And that's the part where, [inhale] as a developer, I start getting [emphasis] actually worried about the precedent. [pause] Not because I need Fable specifically — I told you, Opus is fine for my workflow. [tsk] But if the government can pull [emphasis] any frontier model based on a verbal jailbreak claim from a competitor — [short pause] with no written technical justification, no public standard, no process — [inhale] what stops them from doing it to the next model? Or the one after that? Tim Williams: [pause] That's the real question. [inhale] And Satya Nadella — Microsoft's CEO — he published this essay on June 14th that didn't mention Fable by name, but [emphasis] everyone knew what he was talking about. [short pause] His opening line was: 'A frontier without an ecosystem is not stable.' [pause] His argument was basically — [inhale] if all of your AI capability lives in someone else's model, and that model can be turned off overnight by government action, [short pause] you've outsourced your entire cognitive infrastructure to a single point of failure. Paul Mason: That's [emphasis] exactly it. [inhale] And that's future-you-will-thank-present-you territory. [chuckle] If your production pipeline is hard-coded to a single model ID with no fallback — [pause] you just learned why that's a bad idea. [short pause] The companies that are fine right now are the ones whose AI capability lives in their own fine-tuned models, their own evaluation datasets, their own institutional knowledge. [inhale] Not in their Anthropic API key. Tim Williams: [pause] And here's the thing I keep coming back to. 
[inhale] Whether the jailbreak concern is genuine, pretextual, or some mix of both — [short pause] the process was [emphasis] broken. [pause] There was no published standard. No transparent review. No technical board making findings in public. [inhale] It was a verbal claim, from a company that is simultaneously Anthropic's biggest investor [emphasis] and a direct competitor, delivered to an administration that was already in active legal conflict with Anthropic. [exhale] That's not regulation. That's — [short pause] that's something else. Paul Mason: [pause] Yeah. [inhale] And the worst part? [tsk] The incentive structure this creates is [emphasis] exactly backwards. [short pause] Anthropic was the most transparent lab about their safety limitations — and they got punished for it. [pause] The next lab that's deciding how much to disclose in their system card? [inhale] They're gonna look at this and think — [short pause] 'maybe we keep that to ourselves.' [exhale] That's a disaster for AI safety. Tim Williams: [sigh] That's the moral of the story, isn't it? [pause] If transparency about AI limitations invites regulatory action while opacity does not — [short pause] the industry will [emphasis] respond accordingly. [inhale] And the result will be less public information about what these models can actually do. [pause] Which is the exact opposite of what every safety advocate has been pushing for. [exhale] Here's looking at you, Anthropic — you tried to do the right thing, and you got — [short pause] well, you got this. Paul Mason: [chuckle] Yeah. [pause] The road to hell, paved with good system cards. [short pause] Sorry, I had to. [laughing] Tim Williams: [exhale] And speaking of costs — [pause] you know what this whole Fable situation really brought into focus for me? [inhale] Token anxiety. [short pause] Like, real, visceral, [emphasis] watching-the-numbers-tick-up anxiety. Paul Mason: [chuckle] Oh man. [pause] I've been there. 
[inhale] Every time I watch that little counter in Claude Code climb past what I would have spent on lunch — [short pause] there's this part of my brain that just — [pause] can't stop doing the math. [exhale] Future you will [emphasis] definitely feel that one in the wallet. Tim Williams: [laughing] Right. [inhale] I've got the two hundred dollar a month plan, and it shows you — 'here's what you would have spent on API calls.' [pause] And I'm watching that number go past five hundred, six hundred, seven hundred dollars — [short pause] in a week — [emphasis] and I start sweating. [inhale] And the thing is — [pause] I know the economics don't make sense right now. [short pause] Everyone knows. These companies are burning venture capital to subsidize every query. [exhale] But the numbers they report — [pause] they're sort of imaginary. Paul Mason: [tsk] That's the part people don't get. [inhale] When Anthropic says Fable costs X dollars per million tokens — [short pause] that's list price. That's not what it actually costs them to run the inference. [pause] The real unit economics are — [short pause] uh, [chuckle] they're opaque, let's put it that way. [inhale] And I think that's what makes the anxiety worse. [short pause] You can't plan around numbers you don't actually know. Tim Williams: Exactly. [inhale] And here's what really keeps me up — [pause] the price of these frontier models keeps going [emphasis] up, not down. [short pause] Fable is more expensive than Sonnet. Opus 4.5 was a step change in pricing from Opus 3. [inhale] The models are getting better, but the cost curve is bending the [emphasis] wrong way. [short pause] Meanwhile, Paul, you and I and every developer I talk to — [inhale] we're all saying the same thing: the models we already have are good enough for most of what we do. Paul Mason: [emphasis] Yeah. [pause] That's exactly what I've been thinking. 
[inhale] I was on a project last week — standard CRUD API, some middleware, error handling boilerplate — and Claude Code with Sonnet handled it in like three prompts. [short pause] I didn't need Fable. [pause] I didn't need Opus. [chuckle] I barely needed Sonnet. [inhale] And I think there's this — [short pause] this disconnect between what the labs think we need and what we [emphasis] actually need. [pause] They're racing to build models that can do PhD-level research — [short pause] and I'm over here just trying to get my TypeScript types to stop yelling at me. Tim Williams: [laughing] Right. [inhale] And that tension — [pause] that's what makes the future feel so uncertain. [short pause] Because if the labs keep pushing toward more expensive models, and the venture capital eventually — [emphasis] eventually — runs out — [pause] what happens? [inhale] Do the prices stay high and the services become luxury goods for enterprise? [short pause] Or do they collapse and suddenly we can't access these models at all? [exhale] And in that gap — [pause] that's where open source walks in. [inhale] Have you been following what's happening with GLM 5.2 and DeepSeek V4 Pro? Paul Mason: [excited] Oh yeah. [inhale] I've been watching that. [pause] GLM 5.2 from Zhipu AI — [short pause] open source, beats GPT-5 and Claude Opus 4.5 on MMLU-Pro, matches on MATH 500 — [emphasis] and it runs on consumer hardware. [pause] Like, you can run this thing locally. [inhale] And DeepSeek V4 Pro — [short pause] I saw the benchmarks drop last week. It's trading blows with Opus 4.5 on reasoning tasks. [exhale] Open source is not just catching up anymore. It's — [pause] it's pulling even. Tim Williams: [emphasis] That's what I'm saying. [inhale] Opus 4.5 was supposed to be this — [pause] this step change, this game changer. And it [emphasis] was — the reasoning capabilities are genuinely impressive. 
[short pause] But then GLM 5.2 drops like three weeks later and matches or beats it on half the benchmarks, [emphasis] for free, running on your own machine. [pause] The moat that these frontier labs are trying to build — [inhale] it's getting filled in faster than they can dig it. [short pause] And that changes the economics of [emphasis] everything we just talked about. Paul Mason: Yeah. [pause] And I think — [inhale] for developers like us, that's actually the [emphasis] hopeful part of this whole anxiety conversation. [short pause] The proprietary models might get more expensive. The government might pull them on a Tuesday afternoon. [chuckle] But the open source models — [pause] they keep getting better, and cheaper, and [emphasis] nobody can take them away. Tim Williams: [exhale] So — [pause] here's the practical question that falls out of all this. [inhale] If the frontier models are getting more expensive, and the government can pull them on a Tuesday, and open source is closing the gap — [short pause] how are developers [emphasis] actually managing this right now? [pause] Like, what's the day-to-day strategy for staying productive without burning through your entire token budget by Wednesday? Paul Mason: Yeah. [pause] So — [inhale] I think the pattern that's emerging, and I've been [emphasis] deep into this — [short pause] is what Anthropic themselves are shipping in Claude Code. [pause] The `/model opusplan` command. [inhale] You type that at the start of a session, and from that point forward — [short pause] Opus handles the [emphasis] thinking. Planning, architecture, analyzing the codebase, figuring out the approach. [pause] And then Sonnet does the [emphasis] doing. Writing the functions, applying the changes, running the commands. Paul Mason: [inhale] And here's the thing — [short pause] the numbers actually back this up. 
[pause] A typical coding session — about sixty percent of the tokens are spent on execution tasks that Sonnet can handle [emphasis] just as well as Opus. [short pause] Writing boilerplate, applying straightforward diffs, formatting output. [exhale] You don't need a PhD-level reasoning model for that. [chuckle] You just need it to not mess up the indentation. Tim Williams: [laughing] Right. [inhale] And what I find fascinating is — [pause] this pattern didn't start with Claude Code. [short pause] It's been bubbling up from the community for months. [inhale] Developers on OpenRouter, on OpenCode CLI — they're building these [emphasis] multi-model pipelines by hand. [pause] Plan with Opus or GPT-5.5. [short pause] Implement with a cheaper model — sometimes Sonnet, sometimes an open source model through an API. [exhale] Review and critique with something else entirely — Kimi, or a different model that catches things the implementer missed. Tim Williams: [inhale] It's like — [pause] you wouldn't use a Formula One car to go to the grocery store. [short pause] You [emphasis] could. It would work. [chuckle] But it's a spectacular waste of money. [pause] And we're finally seeing the tooling catch up to make that distinction practical. Paul Mason: [excited] Totally. [inhale] And this is where the open source models we were just talking about — [pause] they [emphasis] shine. [short pause] I was reading through the r/LocalLLaMA threads, and there's this really consistent pattern now. [inhale] Developers are using Opus or GPT for the planning phase — the architecture, the design decisions. [pause] Then they hand the plan to GLM 4.7 or DeepSeek V4 for the actual implementation. [short pause] And one comment that stuck with me — [pause] someone said GLM 4.7, quote, "give it a detailed plan and it follows it carefully and relentlessly." [exhale] That's [emphasis] exactly what you want from an implementation model. 
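Editor's sketch: the plan, implement, review hand-off that Tim and Paul describe can be expressed as three interchangeable model calls. The version below is a minimal illustration under stated assumptions, not any tool's actual implementation. `ModelFn`, `PipelineResult`, `run_pipeline`, and the three `fake_*` stand-ins are hypothetical names; in real use each stand-in would wrap an API client or a local runtime, which the episode does not specify.

```python
from dataclasses import dataclass
from typing import Callable

# ASSUMPTION: a "model" is any function from prompt text to reply text.
# Real clients (hosted API, local runtime) would be wrapped to fit this shape.
ModelFn = Callable[[str], str]


@dataclass
class PipelineResult:
    plan: str
    implementation: str
    review: str


def run_pipeline(task: str, planner: ModelFn, implementer: ModelFn,
                 reviewer: ModelFn) -> PipelineResult:
    """Run plan -> build -> review, giving each stage its own model."""
    plan = planner(f"Write a step-by-step implementation plan for:\n{task}")
    implementation = implementer(
        f"Follow this plan exactly and write the code.\n\nPlan:\n{plan}"
    )
    review = reviewer(
        "Critique this implementation against the plan and list missed "
        f"edge cases.\n\nPlan:\n{plan}\n\nImplementation:\n{implementation}"
    )
    return PipelineResult(plan, implementation, review)


# Hypothetical stand-ins so the sketch runs without network access or keys.
def fake_planner(prompt: str) -> str:
    return "1. parse input\n2. validate\n3. save"


def fake_implementer(prompt: str) -> str:
    return "def handle(data):\n    ...  # built from the plan"


def fake_reviewer(prompt: str) -> str:
    return "Missing: empty-input check"


result = run_pipeline("Add a signup endpoint", fake_planner,
                      fake_implementer, fake_reviewer)
print(result.review)
```

Because each stage only sees a function, swapping the implementer from a hosted model to a local one is a one-argument change, which is the property the hosts are pointing at.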
Paul Mason: [inhale] And then — [short pause] here's the clever part — [pause] some developers add a [emphasis] third model for review. [pause] Kimi K2 Thinking, for example. [short pause] It reads through the implementation, critiques the approach, catches edge cases the implementer missed. [exhale] You've got this three-stage pipeline — plan, build, review — [pause] and each stage uses the model that's [emphasis] best at that specific thing, not just the one model for everything. Tim Williams: [emphasis] That's the convergence. [inhale] That three-stage pipeline. [pause] And look at the economics of it. [short pause] A typical complex build — [inhale] maybe four million tokens spent on planning, using Opus or GPT. [pause] Ten million tokens on execution. [short pause] If you run those ten million execution tokens through Haiku instead of Opus — [pause] you save fifty-seven percent right there. [inhale] If you run them through an open source model running locally — [emphasis] you save nearly everything on execution. [short pause] You're just paying for the planning phase. Tim Williams: [exhale] And that changes the entire calculus of — [pause] is it even worth paying for these services? [inhale] If you're only using the expensive model for ten or twenty percent of your tokens — [short pause] suddenly that two-hundred-dollar monthly plan stretches [emphasis] much further. [pause] The anxiety doesn't go away, but — [short pause] it becomes manageable. Paul Mason: Yeah. [pause] And I think — [inhale] this is where the token anxiety we were talking about earlier — [short pause] it's actually [emphasis] fueling something useful. [pause] Developers are scared of the bill. They're watching that counter tick up. [chuckle] And that fear is driving them to build [emphasis] smarter workflows. [inhale] The plan-with-the-smart-model, build-with-the-cheap-model pattern — [short pause] that's a direct response to watching your token budget evaporate before lunch. 
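Editor's sketch: Tim's arithmetic can be checked with relative prices. The token counts (four million for planning, ten million for execution) come from the episode. The price ratio does not: the 57 percent figure works out if the cheaper model costs one fifth of the flagship per token, so that ratio is an assumption here, as is treating a local model as free (hardware and electricity are ignored).

```python
PLAN_TOKENS = 4_000_000    # from the episode
EXEC_TOKENS = 10_000_000   # from the episode

FLAGSHIP_PRICE = 1.0  # relative price per token (unit is arbitrary)
CHEAP_PRICE = 0.2     # ASSUMPTION: cheap model at one fifth of the flagship
LOCAL_PRICE = 0.0     # ASSUMPTION: local inference treated as free


def total_cost(plan_price: float, exec_price: float) -> float:
    """Cost of one build when planning and execution use different prices."""
    return PLAN_TOKENS * plan_price + EXEC_TOKENS * exec_price


def savings_vs_all_flagship(exec_price: float) -> float:
    """Fraction saved versus running every token through the flagship."""
    baseline = total_cost(FLAGSHIP_PRICE, FLAGSHIP_PRICE)
    mixed = total_cost(FLAGSHIP_PRICE, exec_price)
    return 1 - mixed / baseline


cheap_savings = savings_vs_all_flagship(CHEAP_PRICE)
local_savings = savings_vs_all_flagship(LOCAL_PRICE)
print(f"cheap executor: {cheap_savings:.0%}")  # cheap executor: 57%
print(f"local executor: {local_savings:.0%}")  # local executor: 71%
```

Under these assumptions the local case tops out at about 71 percent of the total bill, because the four million planning tokens still run on the flagship.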
Paul Mason: But — [pause] and there's always a but — [short pause] this isn't solved yet. [inhale] The tooling is still [emphasis] really immature. [pause] Switching models mid-session, managing multiple contexts, keeping the plan consistent across different models — [short pause] it's clunky. [exhale] We're in the — [chuckle] we're in the "it works in my terminal" phase of this, not the "it just works" phase. Tim Williams: [emphasis] Exactly. [inhale] And that's the honest state of this. [pause] The convergence pattern is [emphasis] real — plan with frontier, implement with open source or cheap, review with something different. [short pause] You can see it emerging across hundreds of developers, across tools, across models. [pause] But the experience is still — [inhale] it's still duct tape and bash scripts for a lot of people. [exhale] The platforms that make this [emphasis] seamless — [short pause] they're coming, but they're not here yet. Tim Williams: [pause] And I think — [inhale] that brings us back to where this whole conversation started. [short pause] The Fable pull. The token anxiety. The open source catching up. [pause] All of these threads are pushing toward the same conclusion — [inhale] the future of AI-assisted development is not going to be [emphasis] one model to rule them all. [short pause] It's going to be — [pause] the right model for the right job, at the right price. [exhale] And the developers who figure out that pipeline first — [pause] they're the ones who are going to be [emphasis] terrifyingly productive while everyone else is still burning Opus tokens on boilerplate. Paul Mason: That's — [inhale] that's honestly the moral of this whole episode. [short pause] The Fable situation showed us the proprietary stack is fragile. The token anxiety showed us the economics don't work at scale. [pause] But the convergence — [inhale] plan smart, build cheap, review different — [short pause] that's the pattern that's going to carry us through. 
[exhale] And it's [emphasis] not fully baked yet, but you can see the shape of it. Tim Williams: [laughing] That's the episode. [inhale] Fable gets pulled. Tokens get expensive. Open source closes the gap. [pause] And the developers who survive are the ones who learn to put the Formula One car in the garage and take the reliable sedan to the grocery store. [short pause] Thanks for listening, everybody. We'll be back next week. [pause] Here's looking at you.
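Editor's sketch: Paul's warning about a pipeline hard-coded to a single model ID suggests an ordered fallback list. The snippet below is one possible shape, assuming each model is wrapped as a plain function. `call_with_fallback`, `AllModelsFailed`, and the two example models are hypothetical names for illustration and do not come from any library mentioned in the episode.

```python
from typing import Callable, List, Tuple

# ASSUMPTION: each model is wrapped as a function from prompt to reply.
ModelFn = Callable[[str], str]


class AllModelsFailed(RuntimeError):
    """Raised when every configured model fails for a prompt."""


def call_with_fallback(prompt: str,
                       models: List[Tuple[str, ModelFn]]) -> Tuple[str, str]:
    """Try models in order; return (name, reply) from the first that works."""
    errors = []
    for name, call in models:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure moves on
            errors.append(f"{name}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Hypothetical stand-ins: a hosted model that has been withdrawn and a
# local model that still answers.
def withdrawn_model(prompt: str) -> str:
    raise ConnectionError("model no longer available")


def local_model(prompt: str) -> str:
    return f"local answer to: {prompt}"


used, reply = call_with_fallback(
    "summarize the diff",
    [("hosted-flagship", withdrawn_model), ("local-open-weights", local_model)],
)
print(used)  # local-open-weights
```

Keeping the list in configuration rather than in code means a withdrawn model becomes a one-line change instead of an outage.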

Episode Details

Published

June 18, 2026

Duration

38:16

Episode

S1E19

Technologies Discussed

Claude Code

AWS
