Harness Engineering: The Simple Skill Behind Useful AI Agents

Why the next AI gold rush is not just building agents, but making them work

Apr 01, 2026

Everybody is chasing the shiny part of AI.

A better model. A bigger benchmark. A smarter demo. A faster release.

But something more important happened quietly in the background. OpenAI described building and shipping an internal beta with zero lines of manually written code over about five months. Codex wrote everything from application logic to tests and docs, and OpenAI estimated the build took about one-tenth the time hand-coding would have taken. Around the same time, LangChain said it pushed its coding agent from 52.8 to 66.5 on Terminal Bench 2.0 without changing the model at all. It changed the harness. Terminal Bench 2.0 itself is a hard benchmark of 89 terminal tasks, and the paper says frontier agents still score below 65% on it.

That is the real story.

Not just AI.
Not just agents.
The system around the agent.

That system now has a name: harness engineering. OpenAI uses the term directly, and LangChain is using it too.

My simple view is this:

Prompting was the warm-up. Harness engineering is where useful AI starts.

We spent too long worshipping the brain

For two years, most people treated AI like the whole game was inside the model.

Which model is smarter?
Which one reasons better?
Which one writes cleaner code?
Which one wins the benchmark?

That thinking is now too small.

A model can reason. It can generate text. It can write code. But by itself it still cannot reliably manage state, work safely inside boundaries, verify its own output, or handle long-running work across tools and environments. That is why OpenAI keeps talking about repository knowledge, architecture, constraints, and feedback loops. That is also why LangChain focused on traces, self-verification, and middleware instead of swapping models.

So the question is no longer only:

How smart is the model?

The better question is:

What kind of system is the model working inside?
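
To make "the system around the model" a little more concrete, here is a minimal sketch of a harness loop in Python. It is not OpenAI's or LangChain's implementation. Every name in it (`HarnessState`, `run_harness`, `fake_model`, `check_answer`) is made up for illustration, and the "model" is a stub so the example runs without any API. It shows three of the jobs a model does not do reliably by itself: keeping state, staying inside a boundary, and verifying output before accepting it.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HarnessState:
    """State the harness keeps on the model's behalf between attempts."""
    task: str
    attempts: List[str] = field(default_factory=list)
    feedback: List[str] = field(default_factory=list)
    succeeded: bool = False


def run_harness(
    task: str,
    model: Callable[[str], str],
    verify: Callable[[str], str],
    max_attempts: int = 3,
) -> HarnessState:
    """Call the model, check its output, and feed failures back in.

    `verify` returns an empty string on success or an error message.
    `max_attempts` is the boundary: the loop can never run unbounded.
    """
    state = HarnessState(task=task)
    for _ in range(max_attempts):
        prompt = task
        if state.feedback:
            # Feedback loop: the last verification error goes into the prompt.
            prompt += "\nPrevious attempt failed: " + state.feedback[-1]
        output = model(prompt)
        state.attempts.append(output)
        error = verify(output)
        if not error:
            state.succeeded = True
            return state
        state.feedback.append(error)
    return state


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call.

    It answers wrongly at first and correctly once it sees feedback.
    """
    return "4" if "failed" in prompt else "5"


def check_answer(output: str) -> str:
    """Hypothetical verifier: empty string means the output passed."""
    return "" if output == "4" else f"expected 4, got {output}"


result = run_harness("What is 2 + 2?", fake_model, check_answer)
print(result.succeeded, result.attempts)
```

Notice that the stub model never changes. The second attempt succeeds only because the harness remembered the failure and passed it back. In a real setup the verifier might run tests or a linter instead of comparing strings, but that is an assumption about how you would extend the sketch, not something the sources above prescribe.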
