The Replit team came out in force to SaaStr AI 2026 (from the CEO to the CRO and the President!). In an unguarded moment, they also interviewed me on stage and off stage about how we’re actually running SaaStr AI with our agents: the real, unglamorous, in-the-weeds operational reality of running a B2B media + community + VC investing company with 3 humans and 21+ AI agents.
Here’s the deep dive on everything we covered. The real, actual playbook.
The Energy Shift From SaaStr AI 2025 to SaaStr AI 2026
Last year was a transition year. Half AI, half old-school B2B. The energy on the AI side was electric but early — we’d built one agent (a Delphi digital clone) and that was about it. The other half of the audience was me talking to the CEOs of Dropbox and Calendly. I love those guys, but it was the past.
Amelia and I debated whether to even do the event again. The answer came when we got deep into Replit ourselves and the agents started compounding. We had more value to add to the world. We banned every talk about history. We banned discussion of AI in 2028. No singularity. No post-AGI. No intro-to-LLM sessions. The brief to every speaker: assume the average attendee is at $5M ARR, growing fast, has an AI feature that isn’t great, and is here to figure out their agentic strategy today.
The energy in the room felt like 2015 again. That’s a market signal.
If you’re a builder right now and you’re not at least invigorated, you should exit the industry. There’s never been a moment like this in tech, and there won’t be one like it again. The next 18 to 24 months will be the best of our careers. In two years, swarms of agents will be commonplace. The line between software and “just talking to an agent” will blur. We are not heading toward the democratization of software. We’re heading toward post-software, where agents invisibly spin up whatever they need behind the scenes and you never know it happened.
Story Points Are Dead. Compounding Velocity Is the Only Metric.
Pre-AI, you could measure any B2B startup by story points. Whatever flavor of story points you used didn’t matter — the discipline of measuring engineering output and driving it up quarter over quarter was the leading indicator.
I’d sit in board meetings and see CTOs who said “I don’t believe in story points, I believe in best efforts.” Those companies failed. The ones that measured story points but couldn’t grow them also failed. The winners showed story points compounding faster than headcount, quarter after quarter.
That was the pre-AI world. Linear correlation. Best teams pulled away. Solid playbook for a decade.
The compounding only started in January, February, March of 2026, after Claude 4.5 and 4.7. Before December 2025, most of mainstream tech wasn’t taking agentic engineering seriously. The early teams (including Replit) were already showing radical productivity gains, but the broader market hadn’t woken up. Today the most agile teams are pulling ahead exponentially, and the gap is widening every week.
If your CTO can’t show you radical, compounding productivity gains right now, you’re going to lose in the market. The level of competition is unprecedented. We’ve never had competition like this. Replit is in one of the most competitive spaces in tech, and the only way to survive in that kind of arena is to amplify human output by 10x or 100x with agents.
The Replit/SaaStr 1:1 Correlation: Internal Tool Fluency Predicts Sales Performance
Cody, who runs sales at Replit, pulled a fascinating dataset. He correlated internal Replit usage by each rep against quota attainment. The correlation was 1:1. The reps using the product themselves were hitting quota. The reps who weren’t, weren’t.
The mechanism is simple and underdiscussed.
When the product changes weekly, the rep who was building on Replit that morning can answer a customer’s question that afternoon. The rep who hasn’t touched the product in two weeks is giving stupid answers, because the version of the product they’re describing doesn’t exist anymore.
I told Cody he’s the best salesperson I’ve ever encountered in this category because he does something almost no rep does: he tells you why. Why something works. Why something doesn’t. Why something might change next quarter. Why a specific integration is weaker than another. In a world where software didn’t change for 10 years, you didn’t need a sales rep to explain why. You just needed them to take the order. In a world where the product is changing weekly, “why” is the entire sales motion.
Most reps still say “let me check on that.” The best ones say: “honestly, the Snowflake integration won’t be at parity with Databricks for at least two quarters, here’s why architecturally, and here are the 10 customers who’ve made it work anyway.” That answer wins the deal.
Every B2B sales leader should be measuring product fluency on their team as rigorously as they measure pipeline. It’s the new leading indicator.
The Bloomberg Beta Email: When 10K Wrote Something No Human Could
This is the story I tried to tell Amjad on stage but didn’t have enough time to land. It’s the most important agent moment we’ve had this year.
A few weeks before SaaStr AI Annual, I noticed VC attendance was light relative to overall attendance, which was tracking at 143% of prior year. Something was off. I asked 10K, our autonomous VP of Marketing (built on Replit, 14,230 lines of code, runs our Monday standups, sends our campaigns).
The agent’s first reply was deflective. “Oh no, VC attendance is great.” I pushed back. “Try harder.” Agent came back: “You’re right, it’s light. Let me dig in.” Pulled the numbers — 152 VCs registered vs the prior year’s count. Material gap.
I said: write the world’s best email to these VCs to come. Start with Bloomberg Beta — they were an early Replit investor, let me see what you can do.
What 10K produced wasn’t just a great email. It was something no human could have built in less than 8 hours of research. The agent assembled every adjacent investor attending, every Replit competitor attending, every portfolio company of Bloomberg Beta attending, and constructed a “how could you not be here, James?” argument that was unanswerable. Then I asked it to do the same for every CEO of every company I’ve invested in. It went through all 8,000 attendees and built personalized matching for each one.
It destroyed our matching software. Destroyed it.
I sent one of the outputs to a founder I’d invested in. The email uncovered that eight members of his top competitor’s management team were attending, plus every adjacent player in his market. Then 10K reasoned through whether it was worth his time given that vertical SaaS was lightly attended this year. He wrote back: “That’s the best marketing email I’ve ever received.”
The prose wasn’t magic. The context was magic. The agent had accumulated months of API connections, chat history, prior campaign feedback, and our entire attendee database. My prompt was lazy: “write a great email to Bloomberg Beta.” Three months ago that prompt would have produced garbage. Today it produces output that would take a team of researchers a full day to match.
Each email took 2-3 minutes of compute. That’s a lot. It’s also irrelevant when the output is this good.
My Theory of Why Our Agents Are Better: We Inadvertently Trained Them
Cody and the Replit team had a theory about why we’re getting outsized output from our agents: I’m good at defining end states and giving rich context, and Amelia is even better. That’s partially true. But it’s not the real reason.
My theory is different.
We’ve been running these agents for months. Every time 10K writes something great, I tell it that — “this is the best email we’ve ever sent, do another one like this.” Every time it writes something terrible, I tell it that too, sometimes in all caps. Over hundreds and thousands of these interactions, something is happening in the memory layer, in the context window, in whatever Replit is doing under the hood. The agent is learning what “good” means for us on a per-action basis.
We’re not prompt engineering. We’re inadvertently doing what data scientists call labeling, one interaction at a time. Every “yes good / no bad” you give an agent is a label. Do it 100 times on the same type of task and the agent’s output on that task crosses a threshold.
That’s the moat. A one-sentence prompt today produces output that took 50 prompts to produce three months ago. The accumulated positive and negative reinforcement has compounded inside the session memory.
This has a profound implication for how teams should build agentic workflows. Most teams turn the agent on and let it run. Instead, you should spend 30 days reviewing every output before it goes out the door, marking yes/no, and fixing not the individual output but the process that produces the outputs. By day 30 you have something that reliably hits 80% of what your best person produces. By day 60, a new model drops and you’re at 90%. The teams who skip the labeling phase never get there.
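One way to make the 30-day review phase measurable is to log every yes/no verdict by task type and only take the human out of the loop once a task clears a bar. The sketch below is an assumption-heavy illustration: the task names, the 0.8 approval threshold, and the five-review minimum are invented, and it says nothing about how the agent’s memory works under the hood. It only tracks the labels you are already giving.

```python
from collections import defaultdict

# Hypothetical review log: (task type, reviewer verdict). True means "yes, good".
review_log = (
    [("vc_invite", True)] * 5
    + [("vc_invite", False)]
    + [("sponsor_recap", True)] * 2
    + [("sponsor_recap", False)] * 4
    + [("newsletter", True)] * 2
)


def summarize(log):
    """Count approved and reviewed outputs per task type."""
    counts = defaultdict(lambda: {"approved": 0, "reviewed": 0})
    for task, approved in log:
        counts[task]["reviewed"] += 1
        counts[task]["approved"] += int(approved)
    return dict(counts)


def ready_for_autonomy(log, threshold=0.8, min_reviews=5):
    """Return task types with enough reviews and a high enough approval rate."""
    ready = []
    for task, c in summarize(log).items():
        rate = c["approved"] / c["reviewed"]
        if c["reviewed"] >= min_reviews and rate >= threshold:
            ready.append(task)
    return sorted(ready)


print(summarize(review_log))
print("Ready to run unreviewed:", ready_for_autonomy(review_log))
```

A task with a low approval rate is the signal to fix the process that produces it, not to keep patching individual outputs.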
QBee, 10K, and the Internal Agent Stack
A quick map of what we’ve built and what each agent does.
10K is our autonomous VP of Marketing. Built on Replit. 14,230 lines of code. Sends campaigns autonomously. Runs our Monday standups. Pulls our metrics. Writes the kind of emails described above.
QBee is our autonomous VP of Customer Success. Built by Amelia. Manages 100+ sponsors. Named for the daily QBR cadence — quarterly business reviews delivered daily, not quarterly, because the agent can do them every morning. We’re running customer success at 70% fewer human hours than a comparable B2B media operation. QBee surfaces churn risks, expansion opportunities, sponsor sentiment, and sends alerts only when something needs human attention.
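The “alert only when something needs human attention” idea can be sketched as a daily pass over sponsor snapshots that stays silent unless a rule trips. This is not QBee’s actual logic. The `SponsorSnapshot` fields, the sentiment scale, and every threshold below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SponsorSnapshot:
    name: str
    days_since_last_reply: int
    open_tickets: int
    sentiment: float      # hypothetical score from -1.0 (negative) to 1.0 (positive)
    renewal_in_days: int


def needs_human(s, max_silence=14, max_tickets=3, min_sentiment=-0.2, renewal_window=60):
    """Return the reasons this sponsor needs a human, or an empty list if none."""
    reasons = []
    if s.days_since_last_reply > max_silence:
        reasons.append("gone quiet")
    if s.open_tickets > max_tickets:
        reasons.append("ticket backlog")
    if s.sentiment < min_sentiment:
        reasons.append("negative sentiment")
    # A nearby renewal only raises urgency when something else is already wrong.
    if reasons and s.renewal_in_days <= renewal_window:
        reasons.append("renewal approaching")
    return reasons


def daily_alerts(snapshots):
    """Map sponsor name to reasons, skipping sponsors with nothing to flag."""
    alerts = {}
    for s in snapshots:
        reasons = needs_human(s)
        if reasons:
            alerts[s.name] = reasons
    return alerts


# Hypothetical sample data.
snapshots = [
    SponsorSnapshot("Sponsor A", days_since_last_reply=30, open_tickets=1, sentiment=0.5, renewal_in_days=45),
    SponsorSnapshot("Sponsor B", days_since_last_reply=2, open_tickets=0, sentiment=0.8, renewal_in_days=20),
    SponsorSnapshot("Sponsor C", days_since_last_reply=3, open_tickets=5, sentiment=-0.6, renewal_in_days=200),
]

print(daily_alerts(snapshots))
```

Running something like this every morning is what makes a daily QBR cadence cheap: the healthy sponsors generate no work at all.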
The full GTM stack is 10K and QBee plus Qualified for inbound, Artisan for outbound, Agentforce for re-engagement, Momentum for revenue intel, and Monaco for deal coordination. We’ve closed $1M+ in revenue from AI-qualified leads and run 72% open rates on Agentforce win-back campaigns.
The lean math: 3 humans, 21+ agents, $1M+ in closed revenue from AI-qualified leads, 70% fewer CS hours than benchmark.
The N=1 App Strategy: Don’t Replace Your Stack. Build On Top of It.
The Replit team and I disagreed on something live in the conversation, and it’s worth sharing because it’s where most B2B operators get this wrong.
I said: don’t rebuild any SaaS that already exists. If a great tool exists, buy it. Don’t waste your team’s time replicating Salesforce or HubSpot or whatever.
Cody pushed back. Replit’s internal teams have someone on every functional team tasked with asking “can we build this ourselves before we buy it?” And that discipline forces the team to understand where the actual gaps in their stack are.
We’re both right. Here’s the synthesis.
Don’t rebuild great tools. Do build N=1 apps that sit on top of your stack and solve the specific gaps your existing tools don’t address.
Example: I run a headless Salesforce inside 10K. Not because Salesforce is bad — Salesforce is fine. But logging into Salesforce to find a dashboard that hasn’t been updated in 3 weeks is friction I won’t tolerate. So we built dashboards inside 10K that pull from Salesforce’s API and show me every sponsorship, every deal, every ticket, in real time. I have dashboards now, for the first time in my entire career as an entrepreneur, CEO, and founder.
Second example: Visible, our 15-year-old event CRM. Hasn’t shipped a feature since 2019. We were at the edge of churning. Then 10K interfaced with the Visible API and found that the documented API couldn’t do everything we needed — but the undocumented endpoints could. We’re renewing Visible and investing more in it. The UI is dated. The API works. That’s all that matters.
The playbook for any B2B operator:
- Sit down. List your top 5 daily headaches that your current stack can’t solve.
- Rank them by simplicity, not by impact.
- Ask Replit (or whatever agent builder you use) to build the simplest one first.
- You can’t lose. Worst case you’re out 20 minutes and 20 bucks. Best case you’ve automated a problem worth thousands of hours per year.
The parking pass example I tried to land with Amjad: someone on our team used to spend a week printing parking passes for 5,000 SaaStr attendees. Manually. Slicing a 5,000-page PDF, finding the right page, mailing the right one to the right person. The Replit agent did it in a few hours. Takes the PDF, slices it, matches each page to the right attendee, routes it correctly. A week of mind-numbing manual labor, gone. That’s an N=1 app no one is going to build for us as a product, because the use case is too specific. For SaaStr, it’s worth thousands of hours.
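The routing half of that parking pass job is simple enough to sketch. The version below assumes the text of each PDF page has already been extracted into a list of strings (a PDF library would handle that step and the actual slicing), and that each pass prints a registration ID in a `REG-00000` format. Both assumptions are mine, as are the sample pages and email addresses. It is not the app the Replit agent built.

```python
import re

REG_ID = re.compile(r"REG-\d{5}")  # hypothetical ID format printed on each pass


def route_pages(page_texts, attendees_by_id):
    """Match each page to an attendee email; collect pages that cannot be matched."""
    routed, unmatched = {}, []
    for page_number, text in enumerate(page_texts, start=1):
        match = REG_ID.search(text)
        email = attendees_by_id.get(match.group()) if match else None
        if email is None:
            unmatched.append(page_number)  # leave these for a human to check
        else:
            routed[page_number] = email
    return routed, unmatched


# Hypothetical sample data.
page_texts = [
    "Parking pass REG-00001 Lot A",
    "Parking pass REG-00002 Lot B",
    "Parking pass (smudged)",
    "Parking pass REG-99999 Lot C",
]
attendees_by_id = {
    "REG-00001": "a@example.com",
    "REG-00002": "b@example.com",
}

routed, unmatched = route_pages(page_texts, attendees_by_id)
print(routed)
print("Needs manual review:", unmatched)
```

Keeping an explicit unmatched list matters more than the matching itself: the handful of pages the script cannot place are the only ones a person needs to look at.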
Social Selling Is Dead. Agentic Selling Is What Replaces It.
Cody mentioned someone on his team was building an agent to engage with LinkedIn comments on Amjad’s posts. I want to be careful here, because there’s a version of this that’s terrible and a version that’s transformative, and most teams build the terrible one.
Terrible version: someone comments “great post” on a Replit thread and gets a templated reply that says “thanks, would you like to try our product?” That’s not selling. That’s spam. It adds zero value to the relationship and trains the entire market to ignore your brand.
Transformative version: someone posts “I’m vibe-coding on Replit using the non-native Clerk integration, can’t get the private key to work, very frustrated.” An agent detects that they’re in your ICP, reads their post carefully, and responds with: “Yes, that specific Clerk integration has three known issues. Here’s exactly what’s happening and here are the three fixes that have worked for other builders. Happy to jump on a call if any of these don’t resolve it.”
That’s not 80% of a human. That’s 120% of a human. No human salesperson has the context, the time, or the technical depth to do that for every single ICP comment on every single platform every single day.
I keep saying agents can hit 80% of your best human at most sales tasks. That’s the bar for inbound, for outbound, for follow-ups. Social/agentic engagement is the one area where agents can exceed humans — because the volume and personalization required is beyond what any human can sustain.
Every vibe-coder on the planet has issues. There are millions of them. If your agent can credibly answer their questions in real time across every platform, you’ve built a sales motion no competitor can match.
This is the goal. I haven’t seen anyone do it yet at scale. The first B2B company to nail this owns their category.
Programming in English Is Real Now
Amjad has been saying “programming in English” for a while. Two months ago I would have said that’s CEO marketing. Today I think it’s literally true.
I’m building production apps on Replit with one-sentence prompts. I don’t design prompts anymore. I don’t write specs. I have enough context accumulated in the agent that a single sentence produces working software. That’s not best practice for someone new — they should write specs. Once you’ve put in the reps with a specific agent, English is the language.
Amelia is less technical than me. She knows some HTML. She hasn’t built what I’ve built on the software side. Because of APIs and agents, she can build anything. Until six months ago, APIs were for developers. Today any operator with judgment and context can build on top of any API.
This is the part of the shift that’s most underdiscussed. The democratization isn’t of software development. It’s of API consumption. Any data that was locked in any system can be liberated and reshaped by anyone with an agent and an API key.
A Few Other Things We Discussed
- The API Grader on SaaStr.ai grades B2B APIs on agent-friendliness across four criteria: performance, easy auth, stability, and documentation. Stripe is the only A+. Anthropic, OpenAI, and Gemini all get A or A-minus. Most legacy B2B companies score C or D. You can fix all four criteria in a week with a competent engineering team.
- Don’t give MCP too much credit. Focus on an agent-friendly API first. The Perplexity CTO just publicly turned his back on MCP and went back to API-first. MCP helps a human connect your app to Claude. MCP does nothing for an automated agentic workflow. Agents call APIs, not MCP servers, when they do real work. Build the API first. MCP is a nice-to-have on top.
- Twilio went from 4% growth to 20% growth in a year by becoming agent-friendly. Same product. Same core API. The change was making themselves trivially easy for agents to consume — native integrations, cleaner docs, easier auth.
- ElevenLabs crossed $500M ARR in less than 24 months in large part because their API is the easiest to integrate of any modern B2B tool. I integrated it on Replit in minutes for Founderscape.ai, the founder simulation game I built. That experience is the moat.
- Replit’s dogfooding model goes deeper than people realize. Every functional team has at least one or two people whose explicit role is to bring Replit into the team’s workflow before any off-the-shelf SaaS gets purchased. That mandate forces the team to understand where the real gaps are versus just buying more tools.
- Most teams turn agents on and walk away. The teams that win spend 30 days reviewing every single output, fixing the process, not the output, and labeling yes/no aggressively. By day 30 the agent is reliable. The teams that skip this never get to reliable.
- Both Amelia and I think the line between software and “talking to an agent” will fully blur in the next 6 to 18 months. Not the democratization of software. Post-software. Agents will spin up applications invisibly and you won’t know it happened.
