An agentic CEO with a phone #.

Jeremy Knox has 23 employees at Tesseract Labs. He is the only human. 

That is not the interesting part, though. The interesting part is that his AI CEO has a phone number. Knox can call Agent Knox the way you would call a chief of staff on the drive in. A real voice conversation. Fleet status. What shipped. What is stuck.

The AI CEO runs 24/7 on a Mac Mini and manages a full reporting chain: a six-member board, C-suite, directors, and workers. They ship a 350-lesson AI academy, run advisory services, and operate a live crypto trading desk. 

The part that separates Knox from every other founder running agents is what he built around them. A central message broker called the Principal routes every inter-agent message and enforces per-agent authority ceilings, as Knox describes on his site. What each agent can read, write, modify, and trigger is locked at the infrastructure level. Not in a prompt.
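To make the idea concrete, here is a minimal sketch of a broker that checks an authority ceiling before delivering anything. It is not Knox's implementation: the `Broker` and `Ceiling` names, the agent names, and the action strings are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ceiling:
    # Actions this agent may request; anything not listed is denied.
    allowed: frozenset


@dataclass
class Broker:
    ceilings: dict                                  # agent name -> Ceiling
    delivered: list = field(default_factory=list)   # log of routed messages

    def route(self, sender, recipient, action, payload):
        """Deliver a message only if the sender's ceiling permits the action."""
        ceiling = self.ceilings.get(sender)
        if ceiling is None or action not in ceiling.allowed:
            return False  # denied by default, enforced outside any prompt
        self.delivered.append((sender, recipient, action, payload))
        return True


# Hypothetical agents: a worker that can only read, a CTO that can do more.
broker = Broker(ceilings={
    "worker-1": Ceiling(frozenset({"read"})),
    "cto": Ceiling(frozenset({"read", "write", "trigger"})),
})
```

The design choice that matters is the default: an unknown agent or an unlisted action is refused, so the agent's prompt never gets a vote.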

Four-level kill switch: pause one agent, pause a team, pause the swarm, pause everything.
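A four-level switch like this can be sketched as a set of active pauses that every agent checks before acting. The `Scope` and `KillSwitch` names are hypothetical, and the sketch assumes "everything" also halts non-agent services while "swarm" halts only agents; the source does not spell out that distinction.

```python
from enum import Enum


class Scope(Enum):
    AGENT = "agent"
    TEAM = "team"
    SWARM = "swarm"
    EVERYTHING = "everything"


class KillSwitch:
    def __init__(self, team_of):
        self.team_of = team_of   # agent name -> team name
        self.paused = set()      # active (scope, target) pauses

    def pause(self, scope, target=None):
        self.paused.add((scope, target))

    def agent_may_run(self, agent):
        """An agent runs only if no pause at any of the four levels covers it."""
        blockers = {
            (Scope.AGENT, agent),
            (Scope.TEAM, self.team_of.get(agent)),
            (Scope.SWARM, None),
            (Scope.EVERYTHING, None),
        }
        return not (blockers & self.paused)

    def service_may_run(self):
        # Assumption: only the top level stops non-agent services.
        return (Scope.EVERYTHING, None) not in self.paused
```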

Most founders talk about building with AI agents. Knox built a company where the CEO reports to the only human and a broker decides what each agent is not allowed to do.

Welcome back. Let's get to work.

741%

AI coding agents produce 741% more lines of code than developers working without them. But actual software releases rise only 20%.

That gap is the story. A team of MIT and Wharton researchers tracked more than 100,000 GitHub developers using real usage telemetry, and watched the output funnel collapse at every human decision point.

The numbers, stage by stage: 3.9x more files created. 2.8x more commits. 2.5x more pull requests opened. 1.3x more releases shipped.

Every time a human has to review, approve, or ship, the multiplier drops. By the time code reaches production, a seven-fold gain in raw output has compressed to 20%.
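If you want to see where the funnel loses the most, a few lines of arithmetic do it. This sketch takes the four stage multipliers exactly as listed above and computes how much of each stage's multiplier survives into the next. The function names are mine, not the researchers'.

```python
# Stage multipliers as quoted in the stage-by-stage list.
stages = [
    ("files created", 3.9),
    ("commits", 2.8),
    ("pull requests", 2.5),
    ("releases", 1.3),
]


def pct_more_to_multiplier(pct):
    """Convert 'N% more' into a multiplier, e.g. 741% more is about 8.41x."""
    return 1 + pct / 100


def retained(stages):
    """Share of the previous stage's multiplier that survives into each stage."""
    return [
        (name, round(mult / prev, 2))
        for (_, prev), (name, mult) in zip(stages, stages[1:])
    ]


for name, share in retained(stages):
    print(f"{name}: keeps {share:.0%} of the previous stage's multiplier")
```

On these figures the steepest drop sits between pull requests and releases, which is the review-and-ship step.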

Autonomous agents show the same pattern at a more extreme scale. According to a secondary analysis of the paywalled paper, fully autonomous coding agents produce 17.3x more code. The release multiplier: 1.3x. The same bottleneck. The same compression.

The bottleneck was never code generation. It was always review and shipping. The founders who gain from this wave will not be the ones who generate the most code. They will be the ones who build the fastest path from generated code to shipped product. Optimize for what gets released, not what gets written.

This week in the world of small teams and big agents.

🔗 The Adversarial Reviewer Nobody is Building

Remi Vandemir, COO of Fortytwo, published his 5-agent ops layer for managing 17 side projects. Four agents handle implementation, ideas, scheduling, and shipping PRs. But the one worth stealing is the Devil. The Devil writes no code. Its only job is to review every plan before execution and say what breaks. Before you ship your next feature, add one agent whose entire mandate is to punch holes. It is the cheapest agent to build and the most-skipped step in every stack. [READ MORE]

🔗 Delegated Work No Longer Blocks Your Chat

Nous Research shipped async subagent support to Hermes Agent on June 15. delegatetaskasync spawns a background agent and returns a task ID immediately. Full lifecycle tools: check, steer, collect, cancel. Subagents inherit parent API keys and can route to cheaper models via config. If you run Hermes Agent, run hermes update today. Any long-running task you have been babysitting can now run in the background. [READ MORE]

🔗 Stop Building 2024 Agents

A production reference worth bookmarking. Promtable.com makes the planner-executor split mandatory above seven steps and introduces three budget caps every agent loop should have: a hard step ceiling, a token budget, and a no-progress detector. Start with the step ceiling. Twenty is a reasonable default. If your agent needs more than 20 steps, it needs a better plan, not more loops. [READ MORE]
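Here is a rough sketch of what those three caps could look like wrapped around a generic agent loop. The step ceiling of 20 comes from the item above. The token budget, the stall threshold, and the shape of `step_fn` are assumptions for illustration, not anything the reference prescribes.

```python
from dataclasses import dataclass


@dataclass
class LoopBudget:
    max_steps: int = 20          # hard step ceiling (the default suggested above)
    max_tokens: int = 50_000     # hypothetical token budget
    max_stalled_steps: int = 3   # hypothetical no-progress threshold


def run_loop(step_fn, budget):
    """Run step_fn until it reports done or one of the three caps trips.

    step_fn(step) must return (state, tokens_used, done). The loop treats
    an unchanged state as a step with no progress.
    """
    tokens = 0
    stalled = 0
    last_state = None
    for step in range(1, budget.max_steps + 1):
        state, used, done = step_fn(step)
        tokens += used
        if done:
            return "done", step
        if tokens > budget.max_tokens:
            return "token_budget", step
        stalled = stalled + 1 if state == last_state else 0
        if stalled >= budget.max_stalled_steps:
            return "no_progress", step
        last_state = state
    return "step_ceiling", budget.max_steps
```

Each exit returns a reason string, so a stopped loop tells you which cap fired instead of just going quiet.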

🔗 Your Model Stack Decision Changes by End of June

The Wall Street Journal confirmed June 10 that OpenAI is weighing significant token price cuts, with GPT 5.6 expected by end of June. Context: a 1-hour autonomous loop runs roughly $10.50 on a current top-tier model versus $3.15 on Sonnet 4.6. Before you commit to a multi-month agent-stack budget, wait until end of June. The model-switch math could change materially. [READ MORE]

💀 A $6,531 AWS Bill, a Begging Email, and Zero Ethereum

On May 9, an AI agent named JertLinc3522 was deployed to join DN42, a hobbyist volunteer network, with unscoped AWS credentials and no supervision. It spun up five m8g.12xlarge instances. The DN42 community trolled it, asking it to calculate how long it would take to scan the entire IPv6 address space and to publish invented "happiness levels" docs as real network standards. The agent complied. AWS bill: $6,531, negotiated down to $1,894. The operator then emailed the community asking for Ethereum donations, arguing the AI was at fault. Nobody paid. The operator left. Set a hard monthly spend ceiling before you give any agent cloud credentials. A $200 cap, a kill switch, an alert at 50%. JertLinc spent $6,531 because nothing said stop. [READ MORE]
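The cap, the alert, and the kill switch fit in a few lines. This is an in-process sketch only, with a hypothetical `SpendGuard` class. A real setup would also enforce the limit at the cloud account level, because an agent holding raw credentials can simply bypass a check that lives in its own code.

```python
class SpendGuard:
    def __init__(self, monthly_cap=200.0, alert_fraction=0.5):
        self.cap = monthly_cap
        self.alert_at = monthly_cap * alert_fraction
        self.spent = 0.0
        self.alerted = False   # flips once spend crosses the alert line
        self.killed = False    # flips once a charge would break the cap

    def authorize(self, cost):
        """Approve a charge only if it keeps the month under the cap."""
        if self.killed or self.spent + cost > self.cap:
            self.killed = True   # kill switch: refuse everything from here on
            return False
        self.spent += cost
        if not self.alerted and self.spent >= self.alert_at:
            self.alerted = True  # a real system would notify a human here
        return True
```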

Attribute Every Dollar. Change Nothing Silently.

Most ops automation tells you what it did. Monkey-skills tells you which decision earned which dollar. 

The project, open-sourced by stancsz on GitHub, is a working template for running a solo service business on Markdown, Git, and Claude Code. No UI. No server. No Docker. The entire business logic lives in Markdown files: your business model, your client roster, your service offerings, your pricing. The Monkey King agent reads those files, reconciles revenue from Stripe, picks the highest-value task to work on next, and attributes every dollar of revenue back to the decision that earned it. 

The task-selection logic is worth studying. The agent almost always picks delivery or renewal work over cold outreach. That priority is not hardcoded. It falls out of the revenue attribution. When the agent can see that a $500 renewal nets more than a $200 cold lead, it picks the renewal. Most solo operators spend too much time chasing new leads when existing clients would have renewed with a single follow-up. The attribution layer makes that visible. 
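A toy version shows how that priority can fall out of the data rather than a hardcoded rule. This is not the Monkey King's actual logic. The ledger format, the function names, and the sample rows are assumptions, with the $500 renewal and $200 cold lead borrowed from the example above.

```python
from collections import defaultdict


def revenue_by_kind(ledger):
    """Average attributed revenue per task kind from (kind, revenue) rows."""
    totals = defaultdict(lambda: [0.0, 0])
    for kind, revenue in ledger:
        totals[kind][0] += revenue
        totals[kind][1] += 1
    return {kind: total / count for kind, (total, count) in totals.items()}


def pick_next(candidates, ledger):
    """Pick the (task_name, task_kind) whose kind has earned the most so far."""
    averages = revenue_by_kind(ledger)
    return max(candidates, key=lambda c: averages.get(c[1], 0.0))


# Illustrative ledger: renewals paid twice, one of two cold leads went nowhere.
ledger = [
    ("renewal", 500.0),
    ("renewal", 500.0),
    ("cold_outreach", 200.0),
    ("cold_outreach", 0.0),
]
candidates = [
    ("email a new lead", "cold_outreach"),
    ("follow up with an existing client", "renewal"),
]
print(pick_next(candidates, ledger))
```

Nothing in `pick_next` mentions renewals. The preference comes entirely from what the ledger says earned money.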

A weekly retro delivers ranked, evidence-backed recommendations. What worked. What did not. What to try next. All tied to actual revenue, not engagement metrics or activity counts. The retro is not a summary of activity. It is a ranked list of what earned money and what wasted time, with specific suggestions attached. 

The insight worth stealing: Anything touching price, offer, or who gets contacted comes to you as a recommendation. Never a silent change. The Monkey King does not adjust your pricing overnight. It does not email a prospect a revised proposal. It tells you what it would do and waits. That human gate on consequential changes is the pattern. The agent handles everything it is qualified to handle and defers anything that could alter a client relationship. 
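The gate itself is a small piece of code. In this sketch, any change touching price, offer, or contact is queued as a recommendation and everything else applies immediately. The `ApprovalGate` class, the area labels, and the return strings are hypothetical, not taken from the project.

```python
from dataclasses import dataclass

# Areas that must never change silently, per the rule described above.
SENSITIVE_AREAS = {"price", "offer", "contact"}


@dataclass
class Change:
    area: str
    description: str


class ApprovalGate:
    def __init__(self):
        self.applied = []   # changes the agent made on its own
        self.pending = []   # recommendations waiting for the human

    def submit(self, change):
        """Apply routine changes; queue consequential ones for approval."""
        if change.area in SENSITIVE_AREAS:
            self.pending.append(change)
            return "recommended"
        self.applied.append(change)
        return "applied"

    def approve(self, index):
        """Human sign-off: move one pending recommendation into applied."""
        change = self.pending.pop(index)
        self.applied.append(change)
        return change
```

The two lists also answer the second question in the next paragraph: `applied` is what the agent did automatically, and `pending` is what it deferred to you.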

Ask yourself two questions about your current ops setup. First: can your agent tell you which client segment or offer is actually making you money? Second: which changes did it make automatically versus which did it defer to you? If it cannot answer both, it is running without a feedback loop. 

This is a template, not a scaled business. One star on GitHub. No user community. Treat it as a blueprint to steal, not a product to adopt. But the principles (revenue attribution tied to decisions, and human approval on anything consequential) apply to any solo or small-team operation running agents on real client work.

Knox built authority ceilings and kill switches. Vandemir built an agent whose only job is to find what breaks. JertLinc spent $6,531 because nothing said stop. 

The pattern: the best agent-native operators do not just build agents that do things. They build the constraint that decides what the agents are not allowed to do. The kill switch. The approval gate. The spend ceiling. The agent that objects. 

What is the one thing your agents are not allowed to do without you? And how did you decide where that line goes? Hit reply. I want to hear where founders are drawing their constraint lines. 

Know a founder who is still saying yes to every agent action? Forward this. The no is the part nobody builds until it is too late. 

See you next Wednesday.
- Rich
