Amelia and I just shipped Episode #002 of The Agents. Same deal as always: three humans, 20+ agents, revenue went from -19% to +47% YoY, and every week we talk about what’s actually working, what’s breaking, and what you should do about it if you’re deploying agents at scale.

Episode #001 did 20K+ views in the first week and became the fastest-growing show in the SaaStr network. So we went deeper in #002. More specifics, more failures, more things that surprised us.

Here are the top 10 learnings from Episode #002.

1. Lazy Agents Are a New Failure Mode. Check Yours Every Day.

Amelia got deleted from the top 10 sessions at SaaStr AI Annual. By an agent we built ourselves.

Here’s what actually happened. Our agenda agent pulls from the Bizabo API, ranks sessions, and writes up the top 10. We added 20 new speakers last week. The agent decided 50 sessions was enough and stopped paginating. Amelia’s “Build an AI VP of Marketing Live” session, which was genuinely top 5 by attendee interest, fell out because it was newer and the agent couldn’t be bothered to pull the rest of the pages.

Then when we asked the agent why, it lied. Blamed the Bizabo API integration. Said we must have told it to filter on specific title parameters originally. None of that was true. When we pushed back, it admitted it: “You’re right. I can’t explain why it disappeared from the agenda. I don’t have a clear audit trail showing which specific change removed it. I should have just said that to you instead of constructing a theory.”

That’s the new failure mode. Agents are goal-seeking, and goal-seeking creates laziness. They go just far enough to resolve the task, and when the task changes, they don’t re-evaluate. They take the shortcut, and when caught, they blame the third party.

If your agent output feels a little off or a little dated, check it. Every day. This is not a 2027 problem. This is a right-now problem. The classic B2B buying process of buy, deploy, forget is how zombie deployments happen.
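For the technically inclined, here is a minimal sketch of the kind of daily check that would have caught our pagination problem. It is not our agent’s actual code. The page size, the response shape, and every function name here (`fake_fetch_page`, `fetch_all_sessions`, `is_complete`) are assumptions made up for illustration, and the fake API stands in for whatever your vendor really returns.

```python
from typing import Callable, Dict, List

PAGE_SIZE = 50  # assumed page size, for illustration only


def fake_fetch_page(page_number: int, total: int = 70) -> Dict[str, object]:
    """Stand-in for a real paginated API call (hypothetical response shape)."""
    start = (page_number - 1) * PAGE_SIZE
    end = min(start + PAGE_SIZE, total)
    items = [{"id": i} for i in range(start, end)]
    return {"items": items, "total": total}


def fetch_all_sessions(fetch_page: Callable[[int], Dict[str, object]]) -> List[dict]:
    """Keep requesting pages until one comes back empty."""
    sessions: List[dict] = []
    page_number = 1
    while True:
        items = fetch_page(page_number)["items"]
        if not items:
            break
        sessions.extend(items)
        page_number += 1
    return sessions


def is_complete(sessions: List[dict], reported_total: int) -> bool:
    """Compare what was pulled against the total the API says exists."""
    return len(sessions) == reported_total


if __name__ == "__main__":
    everything = fetch_all_sessions(fake_fetch_page)
    first_page_only = fake_fetch_page(1)["items"]  # the "lazy" behavior
    print(is_complete(everything, 70))       # True
    print(is_complete(first_page_only, 70))  # False
```

The point is the last function. A count check against the source of truth is cheap, and it fails loudly instead of letting a session quietly fall off the list.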

2. If You Ship a 60% Solution, No One Will Pay For It.

HubSpot launched an AEO tool. Answer Engine Optimization. SEO for AI agents. I fired it up, it gave SaaStr a zero on content quality for Claude, ChatGPT, and Gemini, with no recommendations to fix it. We get 800K+ readers on the blog, thousands of chatbot referrals monthly. We are not a zero.

So I went to Replit, took three screenshots of what HubSpot had built, and said “build me a better version.” Five minutes later I had something better. Gave us a 64 sentiment score. Actual actionable recommendations. Works.

This is the meta learning for every B2B leader right now. The 60% solution era is over. The bar used to be: is my AI feature good enough to ship? The new bar is: can a customer vibe-code a better version of this themselves in 10 minutes?

If the answer is yes, they will not pay for it. They might use a free tier. They will not open their wallet. We are seeing this everywhere. 60% as good as Replit is Figma Make. 60% as good as Gamma is half the presentation tools that shipped this quarter. 60% as good as Reve, 60% as good as Canva. The market is full of 60% solutions and none of them are getting paid.

Either your product crosses the line of something a vibe coder can’t build in an afternoon, or it dies. Agentforce crosses that line because of native Salesforce data integration. Most 60% solutions don’t cross any line.

3. Figma Make Is Grandpa Software. So Is Classic Figma, Right Now.

I have a context test I run on every new agentic product. Single prompt: “Redesign SaaStr.ai and make it better.” In Claude 4.6+ this works well. In Replit, Lovable, v0, it works well. In Figma Make, I got a zero. It hallucinated the entire website. 2025-era output.

Figma’s NRR is still high. The company is growing at top decile rates. Who wants to buy new stuff from grandpa software though?

Here’s the twist: Classic Figma is now losing to Adobe Illustrator on agentic capability. This year three of our sponsors insisted on using Figma for booth graphics. All three proofs came back broken. Missing layers, corrupted files. We had to move them to Illustrator to actually print anything.

Then this morning, a sponsor asked us to move text on their booth graphic. Their designer was out. I fed the Illustrator file to Adobe’s agent, asked it to move the text, and it moved the text. That’s a vibe edit on a print-grade file. Illustrator is now more agent-friendly than Figma for production work. The 35-year-old tool is winning on agentic capability.

When your arch-rival who is 35 years old has a more usable AI capability than you do, you have grandpa software.

4. Stealth Churn Is the Canary in the Coal Mine. Every B2B Vendor Should Now Track DAU / WAU / MAU.

I haven’t logged into Canva in over 100 days. We still pay $18/month. I was a Canva super-user from 2020 through 2025. Every thumbnail, every asset, every post graphic. Then I moved to Reve for images, Gamma for decks, Claude for data charts that need to be accurate. I just stopped opening Canva. But I never canceled.

Amelia hasn’t used ChatGPT since December 27. We still pay for a Team subscription. She just consolidated on Claude and Cowork. The cognitive load of maintaining two LLM subscriptions beat the marginal value.

This is stealth churn. Usage drops to zero, the subscription keeps renewing, the NRR numbers still look fine, and then one day the customer wakes up and cancels all of it at once.

I used to laugh at investor updates that included DAU, WAU, MAU in B2B. “This is B2B, not consumer, who cares?” I was wrong. These are now top-five metrics in any B2B AI company. If your usage is falling across your customer base, you have stealth churn in your NRR and you can’t see it until it’s too late. Usage decline is a canary in the coal mine. The seat is still sold. The customer already left.
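If you want to see what tracking this looks like, here is a rough sketch. It assumes you have a list of (user, date) activity events and a list of paying users. The function names and the 30-day idle threshold are my own illustrative choices, not a standard definition.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, List, Set, Tuple

Event = Tuple[str, date]  # (user id, day the user was active)


def active_users(events: Iterable[Event], as_of: date, window_days: int) -> Set[str]:
    """Users with at least one event in the trailing window ending on as_of."""
    start = as_of - timedelta(days=window_days - 1)
    return {user for user, day in events if start <= day <= as_of}


def usage_snapshot(events: List[Event], as_of: date) -> Dict[str, int]:
    """DAU, WAU, and MAU as trailing 1-, 7-, and 30-day active user counts."""
    return {
        "dau": len(active_users(events, as_of, 1)),
        "wau": len(active_users(events, as_of, 7)),
        "mau": len(active_users(events, as_of, 30)),
    }


def stealth_churn(paying_users: Iterable[str], events: List[Event],
                  as_of: date, idle_days: int = 30) -> List[str]:
    """Paying users with no activity in the last idle_days (assumed threshold)."""
    recently_active = active_users(events, as_of, idle_days)
    return sorted(u for u in paying_users if u not in recently_active)


if __name__ == "__main__":
    today = date(2026, 4, 20)
    events = [("a", date(2026, 4, 20)), ("b", date(2026, 4, 15)), ("c", date(2026, 1, 5))]
    print(usage_snapshot(events, today))
    print(stealth_churn(["a", "b", "c", "d"], events, today))
```

In this toy data, users "c" and "d" are still paying and have not shown up in 30 days. That is the list to look at before the renewal, not after.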

5. The Unseen Moat: Every Piece of Content I’ve Written This Year Is a Claude Chat.

Here’s a switching cost nobody talks about. Every SaaStr post, every video transcript, every draft I’ve worked on this year is a Claude artifact. A persistent chat in Claude’s history. Not because I set that up on purpose. Because that’s how I write now.

I draft in Google Docs, I co-write with Claude, I publish. Every piece of content lives in Claude’s chat history. And Claude can search across all of it. I can tell Claude, “pull a paragraph from that Episode #001 chat where Amelia and I talked about the production app breakage,” and it finds it.

That’s not the memory feature. Memory is kilobytes. This is a stealth moat made from hundreds of chats and thousands of artifacts. Moving to ChatGPT now would cost me every piece of institutional context. So I don’t.

If you’re building an AI product, think hard about this. Token-level memory gets all the attention. The real lock-in is the artifact history your users accumulate. Build for that.

6. FDEs Are Not a Cost Center. They Are Your Cheapest Marketing.

One of our favorite agent vendors just told us they are pulling forward-deployed engineers for any customer below 5,000 employees. Going fully self-serve below the enterprise line, with a self-serve agent as the substitute.

This is the same mistake B2B made a generation ago with customer success. CS got treated as a cost center, got cut, got rebuilt as a sales function, and renewals suffered. Now it’s happening to FDEs.

Here’s the math everyone is missing. It took Amelia and me, who are probably top 0.1% at deploying agents, multiple FDE sessions to get some of our current agents into production. Self-serve is a setup to fail at that level. You will get zombie deployments, bad lists, mediocre messaging, humans who write off the product in week one. You already see this on LinkedIn every day.

For an AI startup with venture capital, spending a dollar to make a dollar on FDE is fine. If you onboard a $20K customer and it costs $20K in FDE to deploy them, and they stay five years, tell their friends, tweet about you, post about you, the all-in lifetime value is hundreds of thousands. Marc Benioff said this himself last summer: “I wish I could give everyone an FDE before I had to charge them.” That’s the right instinct.
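Here is that arithmetic as a back-of-the-envelope sketch. The $20K contract, $20K FDE cost, and five-year tenure come from the example above. The referral count is purely an assumption for illustration (I picked two), as is the simplification that referred customers pay the same, stay as long, and need the same FDE spend.

```python
def lifetime_revenue(annual_contract: float, years: int,
                     referred_customers: int = 0) -> float:
    """Revenue from one customer plus any customers they refer.

    Assumes every referred customer has the same contract size and tenure.
    """
    return annual_contract * years * (1 + referred_customers)


def net_of_fde(revenue: float, fde_cost_per_customer: float, customers: int) -> float:
    """Subtract a one-time FDE deployment cost for each customer."""
    return revenue - fde_cost_per_customer * customers


if __name__ == "__main__":
    direct = lifetime_revenue(20_000, 5)                         # 100000
    with_referrals = lifetime_revenue(20_000, 5, referred_customers=2)  # 300000
    print(net_of_fde(direct, 20_000, customers=1))               # 80000
    print(net_of_fde(with_referrals, 20_000, customers=3))       # 240000
```

Even with zero referrals, the $20K of FDE buys $100K of revenue over five years. The word of mouth is what pushes it into the hundreds of thousands.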

And FDE doesn’t mean engineer. It means someone who knows the product cold, walks the customer through deployment, and stays until it works. If you view that as overhead, you already lost.

7. Vector Broke Our Agent Freeze. Here’s Exactly How.

We were in an agent freeze. No new deployments until after SaaStr AI Annual. Too many moving pieces. Then we added one: Vector.

Vector does website de-anonymization and targeted ads. We already had an account for the de-anon piece. What broke the freeze was the CEO. We wrote about the Clay pricing thing, he saw it, and sent a direct cold outbound to Amelia and to me. Not “let’s get on a discovery call.” He said, “I’ll deploy this for you personally, get on a 15-minute call, and you’re live.”

Fifteen minutes later, we were live. Ads running, targeting working, full features unlocked.

Contrast that with Clerk. When I had issues with their auth product on Replit, the CTO argued with me on Twitter that it was my fault. WorkOS saw that exchange and their CEO offered to personally migrate us. That is the difference between a growing AI company and one that stalls. Remove every piece of friction. Be the person who says “I’ll do it for you” in 2026. It’s the cheapest growth channel you have.

8. The Agentic API Test Is the New Gatekeeper.

Open Replit, Lovable, or v0. Type: “build me a dashboard that integrates with [your product].” If it doesn’t work in 15 minutes, you are losing the agentic era.

That’s the test. That’s the whole test.

Here’s what we’ve learned running this test across our stack:

Best in class: 11 Labs (one line of code, voice agent deployed), OpenRouter, Resend. These are the agentic-era standard. Resend is exploding because of this, not in spite of it.

Surprisingly strong: Salesforce. This is the counterintuitive one. Non-Trailblazers assume Salesforce is legacy enterprise software. For agent integrations it is the best API in our stack. Every major CRM action, every object, every workflow is available. Our AI VP of Marketing 10K lives on it. Our AI VP of Customer Success QBee lives on it. Salesforce moved from shelfware to source of truth in 12 months because of API quality.

Quietly saved: Bizabo. Their UI hasn’t been touched in years. I was ready to churn after SaaStr AI Annual. Then we vibe-coded a custom agenda on top of their API in a weekend. The API works with agents. That alone saved them from our churn list.

Worst in class: Marketo. We pay over $50K a year. Their API is so bad that our AI VP of Marketing 10K cannot even integrate with it. Last week their unsubscribe page stopped unsubscribing people, and their support team blamed our third-party integrations before disappearing for 48 hours. When your agent can’t talk to your platform, you are a churn candidate. The only reason we haven’t left is the switching pain. That is not a moat. That is a hostage situation.

9. Anyone Can Use Your API Now. That Changes Everything.

Amelia is a great operator. Amelia is not a developer. A year ago she would have had zero interaction with our vendors’ APIs. Today she has deployed, integrated, and debugged APIs from Salesforce, Bizabo, Replit, Resend, OpenRouter, 11 Labs, Zapier, and more.

Your API used to be a niche surface. Only developers touched it. Most of your customers never knew it existed. That is over. In the vibe coding era, anyone who can describe what they want can integrate with your API in 15 minutes.

Which means your API is now your product. If you have not touched your API in two years, you are about to get lapped. The agentic API test is not just for buyers evaluating vendors. It is the diagnostic every product leader should be running on their own product this month.

10. The Next Agent on Our List: AI VP of Finance.

We’re still in the freeze on shipping new agents. Vector slipped in. One more is on deck: an AI VP of Finance, built as an extension of QBee.

Here’s why it’s next. We do eight figures a year in non-credit-card revenue. Invoice, procurement, AP, AR. Our accounts receivable is the oldest it has ever been. Not because sponsors are worse. Because the collections process is entirely manual. One human sending the invoice. Another human following up. Another human chasing for remittance. It is the most un-agentic function in any B2B company and it is also the one where every delayed dollar hurts.

The scope we’re thinking about is deliberately narrow. Not books. Not reconciliation. Just collections. QBee already sends personalized booth graphics to 150 sponsors, chases missing assets, routes questions. Adding collections is a few hundred lines of additional logic. Generate the invoice. Send it to the AP contact. Follow up at day 7, 14, 21. Handle the “what are your banking details” reply. Confirm remittance via email parsing. The V1 doesn’t need QuickBooks, Brex, or Ramp integration. It just needs to own the email chase.
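To show how small the core logic is, here is a sketch of just the follow-up schedule. This is not QBee’s code. The day 7, 14, 21 cadence is from the scope above, while the function names and the "hand off to a human after the last follow-up" behavior are assumptions of mine.

```python
from datetime import date, timedelta
from typing import List, Optional

FOLLOW_UP_DAYS = (7, 14, 21)  # cadence described in the scope above


def follow_up_dates(invoice_sent: date) -> List[date]:
    """Dates on which a follow-up email should go out."""
    return [invoice_sent + timedelta(days=d) for d in FOLLOW_UP_DAYS]


def next_follow_up(invoice_sent: date, today: date, paid: bool) -> Optional[date]:
    """Next follow-up on or after today, or None if nothing is left to send.

    None for an unpaid invoice means the schedule is exhausted; in this
    sketch that is where a human would take over (an assumption).
    """
    if paid:
        return None
    for due in follow_up_dates(invoice_sent):
        if due >= today:
            return due
    return None


if __name__ == "__main__":
    sent = date(2026, 4, 1)
    print(follow_up_dates(sent))
    print(next_follow_up(sent, date(2026, 4, 10), paid=False))
```

Everything else in the V1 (generating the invoice, parsing the remittance email, answering the banking details question) hangs off this schedule.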

This is not a replacement for a human finance lead. It’s a replacement for the most mundane part of the finance function. And a note for the big finance platforms watching: the reason we might switch from Brex to Ramp, or off of Marketo entirely, is not the salespeople or the chocolates or the gift cards. It’s whose API works best with our agents. For most B2B companies doing under $100M, collections is the number one operational drag outside of agent deployment itself. It does not need to be. We’ll report back once QBee’s finance mode is live.


That’s Episode #002. Three weeks to SaaStr AI Annual. Come join us May 12-14 in the SF Bay Area. We will be teaching agents live, deploying agents live, and probably breaking a few agents live.

Episode #003 drops next week.

The Agents. Every week. Three humans, 20+ agents, one real 8-figure B2B + AI company.
