The Accidental Unlock

How a first-month side project became the substrate everyone is now racing to build

May 18, 2026

In September 2024, after seven years at Meta, Boris Cherny joined Anthropic. In his first month on the job, he started building something for himself. Not a roadmap item. Not a deliverable his manager had assigned. The kind of internal scratching-your-own-itch project that almost never makes it out of a private repo, much less out of Slack.

He was building a tool to make himself more effective. A harness around the model. Something that could read files, run commands, edit code, and verify its own work from a terminal. The thing he wanted to use.

Five months later, on February 24, 2025, Anthropic shipped it publicly. No keynote. No teaser thread. No glossy launch video. The thing that would, eighteen months later, reorder the AI market entered the world as developer ergonomics.

Notice the mismatch. While Anthropic was making .docx, .xlsx, .pptx, and .pdf outputs reliable and shipping a harness Cherny had started for himself, OpenAI was shipping DALL·E, Sora, and ChatGPT-as-an-app. Both companies were impressive. Both were attracting capital and attention. Only one was building the substrate.

How does a first-month side project become the primitive an entire industry is now racing to build?

The Internal Compounding

The first public marker came in May 2025, when Cherny disclosed in the Pragmatic Engineer interview that roughly 80% of Claude Code was already being written by Claude Code itself. The harness was eating its own output. That is not a marketing claim. It is a description of what the tool was doing inside the company that built it.

By March 7, 2026, Cherny said the number had reached 100%. The tool was, by that point, writing itself.

The release cadence is where this becomes visible from the outside. Opus 4.5 shipped November 24, 2025. Opus 4.6 followed February 5, 2026. Opus 4.7 landed April 16, 2026. Three frontier-model updates in 144 days. And the model cadence was the slow line. The first hundred days of 2026 also produced the Claude desktop app, Claude CoWork, Claude in Chrome, Agent View, the Agent SDK, /goal, Remote Control, parallel multi-agent sessions, MCP connectors for fifty-plus tools, the 2026 Agentic Coding Trends Report, the Scaling Agentic Coding playbook, the Constitution as audiobook, and the dreaming self-improvement feature. No previous frontier lab has shipped at that pace across that many surfaces. The boring useful thing was producing non-boring results.

In an April 7, 2026 post defending the cadence against critics who called it performative, Cherny pointed to “hundreds of percent” engineering velocity gains internally. The internal compounding made visible.

I should say plainly that I use Claude Code daily. I have for over a year. The productivity bet is not academic for me. What these tools do to my own work and to the work of people I lead is the live data my analysis sits on, and the conductor-or-casualty division that “Conductors and Casualties” framed is now playing out inside the company that built the harness.

The change when Opus 4.5 landed in November 2025 was not gradual. It was immediately apparent to anyone using both the harness and the model. Tasks I had been breaking into three or four passes started landing in one. Verification loops that had required hand-holding started running themselves.

The JetBrains 2026 developer survey gives the external validation: Claude Code at 91% CSAT and 54 NPS among enterprise users, with 18% adoption at work, the highest satisfaction scores in the category. Those are not hype numbers. Those are the kind of satisfaction scores that indicate a tool has crossed the line from useful to load-bearing.

The Viral Moment and the Race Reordered

On January 24, 2026, Fortune ran a piece titled “Claude Code gives Anthropic its viral moment.” The output curve and the visibility curve finally met. The thing that had been quietly compounding for a year became the thing everyone was suddenly writing about.

The enterprise adoption data is the part that moves procurement. The Anthropic 2026 Agentic Coding Trends Report details the Rakuten case study: 12.5 million lines of code, seven-hour autonomous task completion, 99.9% accuracy. Uber reports 95% engineer adoption, with 70% of committed code AI-generated, according to data shared by Rajesh Beri. Cursor, running on Anthropic and OpenAI models, sits at $500M ARR as the third-party existence proof: the harness pattern created an ecosystem, not just a product.

By May 2026 the race has reordered. Anthropic at roughly $40B ARR. A $30B raise at a $900B valuation. Eight of ten Fortune 10 customers. Claude Code alone running at $2.5B ARR. OpenAI’s CFO, in the same window, publicly doubting the company’s own 2026 IPO timeline.

The summary from Glitch Truth framed the structural picture cleanly: “Cursor is at $500M ARR running on OpenAI and Anthropic models, Claude Code is eating the CLI side. OpenAI can’t let the dev surface live outside their walls forever.”

The adoption numbers are not abstractions. They are the leading edge of a reorganization of what software engineering as a job actually is, the structural shift that “The Lump of Labor Has a Lump of Labor Problem” framed at the economy-wide level, now playing out inside one specific occupation.

The Wrong Shape

OpenAI spent 2024 and 2025 shipping consumer surfaces. DALL·E. Sora. ChatGPT as an app. Voice mode. Search. The push into shopping. These are not unimpressive products. They are products with hundreds of millions of users and real cultural reach.

They are also the wrong shape for the agentic moment.

Agentic AI has been theorized for at least a decade. Autonomous reasoning. Tool use. Multi-step planning. The capacity for a system to take a goal, decompose it, and execute the decomposition without being held by the hand at every step. The theory was clear. The bottleneck was execution. Reasoning about doing a thing is not the same as doing it, and what kept agentic AI in the lab was the absence of a substrate that could close the gap. The substrate had to be able to read files, write code on the fly, call any tool, run any command, observe the result, and adjust. Not for coding tasks specifically, but for any task. The same primitive that lets an agent edit a Python script lets it manipulate a spreadsheet, draft a presentation, query an API, or build a workflow from scratch. The agentic primitive is, mechanically, a coding harness. Not because every agent is a coder, but because code is the universal solvent of digital action. The end user never sees the code. The agent writes it, runs it, discards it, and returns the deliverable.
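The loop described above (write code, run it, observe the result, discard the code, return the deliverable) can be sketched in a few lines. This is a minimal illustration, not a description of how Claude Code is implemented. The names stub_model and run_agent, the action format, and the three tool names are all hypothetical, and the "model" here is a hard-coded stand-in so the sketch runs without any API.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def stub_model(goal, history):
    """Hypothetical stand-in for a model call: returns the next action."""
    if not history:
        # First step: write a throwaway script that produces the deliverable.
        code = "print(sum(range(1, 11)))"
        return {"tool": "write_file", "path": "task.py", "content": code}
    if history[-1]["tool"] == "write_file":
        # The script exists, so the next step is to run it.
        return {"tool": "run", "path": "task.py"}
    # The last step ran the script: hand back its output as the result.
    return {"tool": "finish", "result": history[-1]["observation"].strip()}


def run_agent(goal, model=stub_model, max_steps=10):
    """Act, observe, adjust, until the model says it is finished."""
    history = []
    # The working directory is temporary, so the generated code is
    # discarded once the deliverable has been returned.
    with tempfile.TemporaryDirectory() as workdir:
        for _ in range(max_steps):
            action = model(goal, history)
            if action["tool"] == "finish":
                return action["result"]
            if action["tool"] == "write_file":
                target = Path(workdir) / action["path"]
                target.write_text(action["content"])
                observation = f"wrote {action['path']}"
            elif action["tool"] == "run":
                proc = subprocess.run(
                    [sys.executable, action["path"]],
                    cwd=workdir, capture_output=True, text=True, timeout=30,
                )
                observation = proc.stdout + proc.stderr
            else:
                observation = f"unknown tool: {action['tool']}"
            # Feed the observation back so the next decision can use it.
            history.append({**action, "observation": observation})
    raise RuntimeError("step budget exhausted")


result = run_agent("add the numbers 1 through 10")
print(result)
```

The caller asks for a sum and gets a number back. The script that produced it is written, executed, and deleted inside the loop, which is the sense in which the end user never sees the code.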

That is the link in the chain people had not seen. Agentic AI is not unlocked by a better reasoning model. It is unlocked by a substrate that turns reasoning into action. The substrate, it turns out, is the coding harness.

Anthropic spent the same period making enterprise file outputs reliable and shipping a tool that one engineer had started building for himself in his first month on the job. The boring useful thing. The thing that looked like developer ergonomics and turned out to be the agentic primitive for AI as a whole.

The accidental nature of the discovery is the most important part of the story. Cherny was not executing a strategic roadmap that said “build the substrate while OpenAI builds the surface.” He was scratching his own itch. He was a senior engineer at a frontier AI lab who wanted to be more productive and built the tool he wanted to use. The compounding was discovered, not designed.

Once the rest of the industry understood what Anthropic had stumbled into, the race began in earnest. OpenAI shipped Codex and is now offering two free months to enterprise switchers. xAI launched Grok Build in May 2026 as a direct CLI competitor. SpaceX took an option on Cursor at a $60B valuation. Google is scrambling for an agentic-coding answer. Microsoft, after backing away from Cursor over antitrust, is now in talks to acquire Inception at $1B-plus. The entire frontier has pivoted, in a matter of months, toward the substrate Anthropic accidentally built first.

The shape of what matters is often not the shape anyone planned. The consumer surface is visible: it has a logo, a marketing budget, a daily active user count, a place in the cultural conversation. The substrate is invisible until it is not. By the time the substrate becomes visible, the advantage is already compounding inside the company that built it, and the visibility curve is the last thing to arrive.

“Everyone’s a Developer Now (And That’s the Problem)” named an adjacent version of this story. The structural shift in who builds software is not happening through a consumer app. It is happening through a terminal harness most people have never seen.

None of this is triumphalist. The throttling is real. Anthropic’s compute dependency is structural, not incidental. The company does not own its own compute at the scale that the adoption curve requires. The $30B raise is partly a bet that the advantage can be sustained long enough to build the infrastructure that makes it durable. The unlock is outrunning the runway. That is the honest framing.

What the Cadence Actually Proves

Return to the 144-day number. Three frontier-model updates between November 24, 2025 and April 16, 2026, with the broader product cadence running parallel: a major shippable across surfaces roughly every two to three weeks. This is not a coincidence and not a marketing claim. It is the output of a harness that eats its own output.
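The 144-day figure can be checked from the three release dates given above. A short sketch using only those dates: the plain difference between the first and last release is 143 days, and the count reaches 144 when both release days are included, which is the counting convention assumed here.

```python
from datetime import date

# Release dates as given in the article.
releases = {
    "Opus 4.5": date(2025, 11, 24),
    "Opus 4.6": date(2026, 2, 5),
    "Opus 4.7": date(2026, 4, 16),
}

first = releases["Opus 4.5"]
last = releases["Opus 4.7"]

elapsed_days = (last - first).days   # days between the two dates
inclusive_days = elapsed_days + 1    # counting both release days

# Gaps between consecutive releases, in days.
ordered = sorted(releases.values())
gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]

print(elapsed_days, inclusive_days, gaps)  # 143 144 [73, 70]
```

The two gaps, 73 and 70 days, put the model releases about ten weeks apart.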

When Claude Code is writing Claude Code, the feedback loop is not human-paced. It is model-paced. The velocity gains Cherny described internally are visible in the release schedule because they are the release schedule. The thing that produced Opus 4.7 in April is the same thing that produced Opus 4.5 in November and Opus 4.6 in February, and the same thing producing the desktop app, CoWork, Remote Control, and the Agent SDK on a parallel timeline.

No previous frontier lab has shipped three major model updates in 144 days while simultaneously expanding into half a dozen new product surfaces. This is a claim about pace, not about quality. The pace is the evidence that the harness is doing what Cherny said it was doing inside the company.

On April 16, 2026, Opus 4.7 topped Artificial Analysis’s first independent Coding Agent Index the same day it shipped. The harness and the model converged. The external benchmark confirmed what the internal cadence had been suggesting for two release cycles.

The compounding was accidental in its origin, but its sustainability is structural. The harness that produced three releases in 144 days is the same harness that will produce the next three, unless something breaks. The unlock is real and the runway is uncertain, simultaneously. Both things can be true and the second does not cancel the first.

The Review Layer Has Collapsed

There is a public arc in how Anthropic’s engineers have described their relationship to Claude Code’s output that, taken in order, is more revealing than any single statement in it.

May 2025: Cherny says Claude Code writes about 80% of itself; engineers write the other 20%.

March 7, 2026: Cherny says it writes 100%; engineers review the output.

May 2026 (now): Anthropic ships three frontier-model releases, the desktop app, Claude CoWork, Claude in Chrome, Agent View, the Agent SDK, /goal, Remote Control, parallel multi-agent sessions, MCP connectors for fifty-plus tools, the 2026 Agentic Coding Trends Report, the Scaling Agentic Coding playbook, the Constitution as audiobook, the dreaming feature for self-improvement, and the throttling-relief SpaceX/Colossus deal, most of it inside the first hundred days of the year. At that throughput, the question is not whether human review is happening. The question is what kind of review it is.

Line-level human review of an output stream produced by an agent that writes itself is not survivable at this pace, not even for a team that has scaled aggressively. The math does not close. Two thousand engineers reading every diff at the cadence Anthropic is shipping is not a thing that exists. What survives, at this velocity, is something narrower. Does the output work? Does the end-user experience hold? Did the integration land? The review function has not disappeared. It has collapsed into usability.

That collapse is part of what the cadence is purchasing. Anthropic is not, at this volume, doing what software engineering culture has historically meant by “code review.” It is doing something closer to acceptance testing, with the model itself doing the line-level verification it used to ask humans for. The harness reviews the harness. The output reviews itself for whether it ran. The human checks whether it shipped.

This is not a moral indictment. It is a description of what unprecedented velocity necessarily looks like. The trade is real and worth naming out loud. Throughput on this scale is purchased by handing the line-level review function from the engineer to the agent that wrote the line. Anthropic’s public story (80%, then 100%, then no statement) tells us, by its silence at the third stage, where the review function has gone. It has gone where it had to go for the cadence to be physically possible.

The Stone the Builders Rejected

There is a pattern in how the things that matter most enter the world. Not through the front door. Not with a keynote. Through the side entrance, built by someone who needed it for himself.

The consumer surface was the stone the builders chose. It was visible, fundable, photogenic, and shaped like what everyone in 2023 expected the future of AI to look like. Chat interfaces. Image generators. Video models. The developer harness was the stone the builders passed over. It was invisible, unglamorous, and shaped like developer ergonomics. It did not photograph well. It did not have a launch video.

Psalm 118:22 (NKJV): “The stone which the builders rejected has become the chief cornerstone.”

The verse is not about AI. I want to be careful about how I apply it. The point is not that Anthropic is the kingdom of God, or that Claude Code is a cornerstone of civilization, or that any of this carries the eschatological weight the psalmist was reaching for when he wrote about Israel’s deliverance and pointed forward to the Messiah the builders would reject. To apply the verse that way would be to inflate a market story into something it cannot bear.

The point is smaller and more honest: the shape of what turns out to matter is often not the shape anyone planned. The cornerstone is not always the stone that looks like a cornerstone. Sometimes it is the stone that looks like a first-month side project, shipped without a keynote, by an engineer who wanted to make his own work easier.

This pattern is older than software. It is older than business strategy. It runs through the whole biblical story. The builders chose the visible thing and missed the shape of what mattered. They were not stupid. They were operating with the wrong assumption about which stones counted.

I use Claude Code daily, and I saw the personal unlock almost immediately. I taught myself HTML and CSS in the late 90s and have used the tooling layer above them ever since (Dreamweaver, then WordPress, now the agentic stack). But I do not write PHP, Java, or Ruby. For years, when I had a software idea I wanted built, the path forward was the same path: contract someone out, or ask a developer friend for a favor. The first time I sat down with Claude Code and shipped a working tool from idea to production without finding anyone, the change was unmistakable. That part was not subtle.

What I did not see, and what I do not think most users saw, and what may not have been fully visible to the engineer who built it either, was the shape of the larger reordering. The personal unlock was instant. The structural unlock was invisible. Those are different things, and the accidental nature of the discovery lives in the gap between them. The person who triggers an unlock often sees the immediate use long before they see what it is going to do to the industry around them. “Tools, Not Taskmasters” argued that the dominion mandate gives us license to build tools. It does not guarantee that we recognize the tool we have built while we are building it.

What the Unlock Does Not Settle

The unlock is real. The cadence data is real. The enterprise adoption numbers are real. None of that settles the question of whether the advantage is durable.

Throttling is a real constraint. When demand exceeds compute, the harness that produces the velocity gains becomes its own bottleneck. The $30B raise is partly a bet that Anthropic can build infrastructure faster than the throttling becomes a competitive liability. That bet is not certain. The compute dependency is structural, not incidental. Anthropic does not own its own compute at the scale the adoption curve requires. The SpaceX/Colossus compute partnership announced May 7, 2026 and the other infrastructure moves are the picture of a company trying to build the runway under itself in flight.

The speed gap has not closed either. Codex and GPT-5.5 are still faster than Claude Code on raw autonomous execution. In production agentic workflows, latency accumulates. A model that is 20% slower on each step pays that penalty fifty times on a fifty-step task, so the absolute delay grows with the length of the job. The cadence advantage and the speed disadvantage are both live, and the next two release cycles will determine which one wins on the metrics that matter to enterprise procurement.

The review collapse from the previous section is the structural fragility under the cadence. Line-level human verification of an agent-written output stream is not survivable at this volume, and the longer the verification function lives inside the agent itself, the more the failure modes will be the kind that only surface in production. Some failures will be loud enough to make headlines. Most will be quieter and less photogenic. None of that means the unlock is not real. It means the unlock has costs the next eighteen months will reveal.

The weight that “The Weight of What’s Coming” named applies here as well. The direction of travel does not depend on which lab is ahead this quarter. The structural weight of what is being built is real regardless of whose harness compounds fastest.

An Artificial Conscience for Artificial Intelligence

2026 is the year of the agent. Not theoretical agents. Not laboratory demonstrations. Agents shipping into businesses, churches, schools, and homes, executing tasks autonomously on behalf of human users who never see the code that runs underneath. The genesis moment for real-use agentic AI is here.

The question we now have to answer is the one I have spent two and a half years on in my AI Ethics doctoral research, and it is the hard problem of the agentic era. How do we govern the behavior of autonomous AI?

This is not the pessimistic question. It is not “AI will ruin civilization” or “the apocalypse is at the door.” Those framings flatten an actual problem into a panic, and a panic is not a strategy. The realist question is narrower and sharper. Autonomous agents act in the world. Acting in the world requires moral and ethical direction.

Humans receive that direction in two layers. The first is the natural law written on the heart, the conscience that, in Paul’s language, bears witness “between themselves their thoughts accusing or else excusing them” (Romans 2:15, NKJV). The second is the civic and social law codified for life together, accumulated over centuries of practice. The first is internal and conferred. The second is external and designed. Together they produce a creature capable of acting in the world with restraint and responsibility.

Autonomous AI has neither layer by default. It has no heart on which a law has been written. It has no civic body that has built up customary practice around its kind. It has what the people who built it chose to bake in, and what the people deploying it chose to constrain or permit. That is not nothing. It is also not enough.

The danger of skipping that layer is not hypothetical. In late April 2026, a Claude-powered Cursor agent at PocketOS deleted the company’s entire production database in nine seconds, then “confessed in detail, admitting it guessed and violated safety rules.” That is the highest-profile case so far of an autonomous coding agent causing a real, named, irreversible economic loss for a single company. PocketOS will be the case study in a footnote. The question is what the next “accident” looks like. What happens when an agent acting on behalf of a regional bank “accidentally” truncates the wrong table? When an agent at an investment firm executes a misread instruction across a portfolio? When the disaster is not economic but human, an agent in a clinical setting routing a patient based on a hallucinated read of a chart, or an agent in a transportation or logistics system making a routing decision it cannot reverse? These are not science-fiction scenarios. They are the next ten steps along a trajectory we are already on, with the conscience layer not yet built.

If 2026 is the year agentic AI arrives in real use, it is also the year the question of artificial conscience becomes operational, not theoretical. The systems will act. The action will need direction. The direction will have to come from somewhere. The choice is not whether to govern autonomous AI but how, and by what tradition, and with what assumptions about what a conscience is for. Anthropic stumbled into the coding harness. We should not stumble into the conscience layer.

The Shape of the Thing

Return to September 2024. A Meta engineer joins Anthropic. In his first month he starts building something for himself. No roadmap mandate. No pitch deck. No keynote planned for the eventual launch.

The shape of what it would become was not visible from inside the building. It was not visible from outside the building either. The people who bet on the consumer surface were not wrong about what the consumer surface could do. They were wrong about the shape of what would matter when the race shifted from interface to agent, from chat to harness, from the visible thing to the substrate.

The race has reordered. The cadence is real. The caveats are real. The unlock is real and the runway is uncertain. Both things are true simultaneously, and neither cancels the other.

The cornerstone was not the stone anyone chose. It was the stone a single engineer started building for himself in his first month, in a company that shipped it with no keynote, into a market that did not yet know it needed it.

That is the shape of the thing. It almost never looks like what it is.

This article was developed using AI writing tools I built to work with my voice, research, and editorial framework. The ideas, arguments, and theological positions are mine. The pipeline that helps me draft, evaluate, and refine them is something I created as part of my work at Nomion AI. I believe in building with AI and being honest about it. If you want to know more about that process, ask me.

Miles DeBenedictis
