How I Run an AI Agency Solo (No Employees, $40k MRR)

I run an agency that does $40k MRR

No employees. No standing contractors. No standups. No payroll. No "let me check with my team and get back to you"

Just me, a laptop, and a delivery system that does the work 5 people used to do

Total cost to run the whole thing: about $750/month, and the model doing the bulk of the production work accounts for under $300 of that

That's not a typo. The agency clears more than $39k of that $40k every single month, and I take home a margin that no staffed agency on earth can touch

For a long time I believed the same thing you probably believe right now: that scaling an agency means hiring. More clients, more people. More people, more management. More management, less actual work, and thinner margins, until you've built yourself a job you hate

I don't believe that anymore. The math changed and almost nobody noticed

This article is the full breakdown. The offer, the delivery engine, the exact model stack, the unit economics, the acquisition system, and a 90-day plan to your first $10k MRR. Every stage, how I actually solve it, no theory

After reading this, you'll have:

A complete blueprint for running a $40k MRR agency with ZERO employees: the exact offer, delivery system, and stack

An AI delivery engine that does the work a 5-person team used to do, with the core production model costing under $300/month

The unit economics that let a solo operator keep 90%+ margin instead of the 30% a staffed agency survives on

A client acquisition system that fills your pipeline without a sales team or an ad budget

A 90-day rollout plan to your first $10k MRR, with specific actions for each phase

Quick context so the numbers make sense: I run an AI automation agency. I build automations and AI systems for small businesses and lean teams. Lead routing, customer support agents, content pipelines, internal ops tooling, data workflows. Boring, high-value stuff that businesses gladly pay monthly to never think about again

My $40k MRR is 14 clients:

4 anchor clients on $5,000/mo retainers = $20,000

6 core clients on $2,500/mo retainers = $15,000

4 lite clients on $1,250/mo maintenance = $5,000

Keep that mix in your head. Every number in this article ties back to it
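If you want to sanity-check the mix, here is a minimal sketch of the arithmetic. The tier counts and prices are the ones listed above; the function names are just illustrative

```python
# Client mix as listed above: (tier, number of clients, monthly retainer in USD)
CLIENT_MIX = [
    ("anchor", 4, 5_000),
    ("core", 6, 2_500),
    ("lite", 4, 1_250),
]


def total_clients(mix):
    """Number of clients across all tiers."""
    return sum(count for _, count, _ in mix)


def mrr(mix):
    """Monthly recurring revenue: clients times retainer, summed over tiers."""
    return sum(count * price for _, count, price in mix)


def revenue_share(mix):
    """Fraction of MRR contributed by each tier."""
    total = mrr(mix)
    return {tier: count * price / total for tier, count, price in mix}


print(total_clients(CLIENT_MIX), mrr(CLIENT_MIX))
print(revenue_share(CLIENT_MIX))
```

Worth noticing: the 4 anchor clients are half the revenue, which is why the high-stakes work later in the article gets treated differently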

[ Let's break it down ] ↓↓↓

Why the Solo Agency Is Suddenly Possible (The Headcount Trap)

Here's the part nobody running an agency wants to say out loud

The bottleneck was never clients. It was delivery

You can fill a pipeline. Marketing is solvable, outreach is solvable, referrals are solvable. The thing that actually breaks agencies is what happens AFTER you sign the client. Someone has to do the work

In the old model, doing the work meant people. Sign 3 clients, hire a junior. Sign 5 more, hire two mid-levels and someone to manage them. Each hire is salary, onboarding, mistakes, sick days, churn, and the slow drift where you stop building and start managing humans who build worse than you do

And here's the trap inside the trap: every person you hire eats your margin. A staffed agency at $40k MRR is keeping maybe $10-12k after payroll, software, and overhead. The other $28-30k walks out the door, most of it in salaries. You're running a $40k business to take home what a senior contractor makes, except now you also have HR problems

So agencies plateau. Not because they can't get clients. Because every new client makes the operation heavier, slower, and thinner

The reason the solo agency works in 2026 is simple:

The production layer, the actual labor of delivering client work, can now be done by one model instead of a team. For me, that model is Kimi 2.6

Not the strategy. Not the client relationship. Not the judgment. Those stay with you. But the execution, the part that used to require a team of people grinding through tickets, that's the part that got automated

The agencies that win from here won't be the ones with the biggest teams

They'll be the ones who figured out that headcount went from being leverage to being a liability

That's the whole thesis. Everything below is how you actually do it

[ Start with the offer ] ↓↓↓

The Offer: What a Solo Agency Should Actually Sell

This is where most people screw it up before they even start

They sell custom work. "We do whatever you need." Every client is a unique snowflake, every project is built from scratch, and there's no way to systemize any of it. That's a recipe for a job, not an agency

If you want to run solo, your offer has to obey one rule:

The work has to be deliverable by a system, not by heroics

That means productized, repeatable, and scoped tight. Here's how I think about it

Pick work that is high-value to the client but mechanical to produce

Businesses pay shocking amounts of money for things that are genuinely repetitive once you've built the pattern once. A customer support agent that handles tier-1 tickets. A lead-routing system that pipes form fills into their CRM and books calls. A content engine that turns one podcast into 30 posts. These feel like magic to the client. To me they're a workflow I've run 40 times

Scope it so the deliverable is defined, not open-ended

"We'll automate your customer support" is a trap. "We'll build and maintain an AI support agent that resolves your top 20 ticket types, with a monthly retainer for tuning and new flows" is a product. The retainer is the key. It turns one-time builds into recurring revenue, which is the only way solo math works

Charge for outcome, deliver with system

Clients pay $2,500-$5,000/month because of what the automation saves them: a support hire they didn't make, hours their team got back, leads they stopped dropping. They are not paying for my time. They're paying for the result. That gap between perceived value and actual delivery cost is the entire business, and we'll get very specific about it in the economics section

The constraint that makes all of this work: every offer I sell, I've designed around what AI can reliably deliver. I don't sell things that require a human grinding for 40 hours. I sell things where I architect the solution once, and a model does the production. If a prospect wants something that can't route cleanly to a system, I either reshape the scope or I pass

Your offer is not "what can I do." It's "what can I systemize and sell on repeat."

[ Now the engine that makes it possible ] ↓↓↓

The Delivery Engine: How You Deliver Without a Team

This is the heart of the entire thing. If you only read one section, read this one

When a client request comes in, here's what it does NOT do: land on a person's desk for them to grind through manually

Here's what actually happens. Every piece of client work flows through a four-stage pipeline, and I'm only personally touching two of the four stages

Stage 1: Intake (me, 10 minutes)

I take the request (a new flow, a bug, a feature, a content batch) and translate it into a clear spec. This is judgment work: what the client actually needs, the cleanest way to build it, and the edge cases. This is the part that's still human, because it's where the value lives

Stage 2: Production (the model, not me)

The spec goes to my delivery stack, and this is where the actual labor happens. Writing the automation logic, generating the code, building the content, wiring the integrations, drafting the configs. The grunt work that used to need a team of juniors. This runs on AI, and the bulk of it runs on Kimi 2.6

Stage 3: QA (me, 15-20 minutes)

The model produces, I review. I'm not building from scratch, I'm checking output against the spec. Does it work, does it match what the client needs, does it handle the edge cases. Reviewing finished work is 10x faster than producing it, which is exactly why one person can carry the load of five

Stage 4: Handoff (mostly automated)

Deploy, document, notify the client. Templated, scripted, mostly hands-off

Notice the shape of this. The two stages that need a human, intake and QA, are the FAST stages. The slow, labor-heavy stage in the middle, production, is the one I removed myself from entirely. I went from being the laborer to being the orchestrator
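To make the shape concrete, here is a rough sketch of the four stages as data, counting only the minutes that need me. The intake and QA times are the ones stated above. Treating production and handoff as zero human minutes, and the 6 focused hours per day in the usage line, are assumptions for illustration

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    owner: str            # "human" or "system"
    human_minutes: tuple  # (low, high) minutes of my time per task


PIPELINE = [
    Stage("intake", "human", (10, 10)),
    Stage("production", "system", (0, 0)),  # assumption: no hands-on time
    Stage("qa", "human", (15, 20)),
    Stage("handoff", "system", (0, 0)),     # assumption: "mostly automated" counted as 0
]


def human_minutes_per_task(pipeline):
    """(low, high) minutes of human attention one task needs end to end."""
    low = sum(stage.human_minutes[0] for stage in pipeline)
    high = sum(stage.human_minutes[1] for stage in pipeline)
    return low, high


def tasks_per_day(pipeline, human_hours_per_day):
    """(pessimistic, optimistic) tasks that fit in a day of human attention."""
    low, high = human_minutes_per_task(pipeline)
    budget = human_hours_per_day * 60
    return budget // high, budget // low


print(human_minutes_per_task(PIPELINE))
print(tasks_per_day(PIPELINE, human_hours_per_day=6))  # 6 hours is a hypothetical budget
```

The point of the sketch is that capacity is set by the two human stages, not by how long production takes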

Now here's where it gets specific, because "AI does the work" is a meaningless sentence without the actual mechanics

Why Kimi 2.6 is the engine

When I say the production layer runs on a model, I mean it runs, overwhelmingly, on Kimi 2.6. It is the single most important decision in my entire operation, and I want to be precise about why

Kimi 2.6 does the bulk of my delivery work, the code, the automation logic, the content generation, the integration wiring, at roughly $0.50 per million input tokens and $2 per million output. For the kind of production work an agency runs all day, that's about 6x cheaper than defaulting to a Sonnet-class model and 20-30x cheaper than running everything on a frontier model like Opus or GPT-5

And here's the part most people still haven't updated their mental model on: the shipped quality on this kind of work is indistinguishable from what the pricier models produce. I'm not running cheap-and-worse. The automations Kimi 2.6 produces pass the same tests, ship to the same clients, and hold up in production exactly like output from a model costing 6x more. The "Kimi is the budget option" framing from 2025 is dead. In 2026 it's the default for serious production work, not the compromise

But cost is only half of why it's the engine. The other half is throughput

I run delivery for 14 clients. On a heavy day that's dozens of production jobs firing in parallel: a support agent rebuild for one client, a content batch for another, three bug fixes, a new integration. If I were running that volume on a frontier model, I'd hit rate limits by mid-morning and spend the rest of the day waiting. A model you can't access is a 0/10 model no matter how smart it is. Moonshot's rate limits are dramatically more generous, which means I can run my entire client load concurrently without getting throttled into a queue. The cheap model is also, in practice, the model that's actually there when I need it

Think about what that replaces. The production work Kimi 2.6 does for me is the work I would otherwise be paying 3-5 junior and mid-level people to do. That's $25-30k/month in salary, plus the management overhead, plus the mistakes, plus the churn. I replaced all of it with a model that costs me a couple hundred dollars a month and never has a bad week

That's not "AI helps me work faster." That's "Kimi 2.6 is my production team." Five people's worth of output for about $240 a month. That one line is the whole article

[ Meet the rest of the roster ] ↓↓↓

The Model Stack & Routing (Your "Team Roster")

I don't actually run everything on one model, and you shouldn't either. The right way to think about your stack is as a team roster. You're the founder, and you have different "hires" for different kinds of work, each priced to the job

Here's my roster

The senior workhorse: Kimi 2.6 (90% of delivery)

This is my lead engineer, my content team, and my automation builder rolled into one. Every routine production task defaults here: building flows, generating code, writing client content, wiring integrations, debugging, refactoring. It carries the overwhelming majority of the actual labor. ~$0.50/$2 per million. This is the hire that makes the whole agency profitable

The specialist: premium tier, Opus 4.6 or GPT-5 (the 10% that compounds)

Some decisions are too expensive to get wrong. Architecting a complex multi-system integration for an anchor client. A security-sensitive review before something touches a client's production data. A genuinely novel problem I haven't solved before. For that 10%, I route to a premium model and happily pay 20-30x more per token, because the cost of a wrong answer here is a blown client relationship, not a $0.04 retry. You pay the specialist for the decisions that compound

The intern: cheap/local tier (cleanup)

Formatting, simple renames, boilerplate, first-draft scaffolding, trivial single-step tasks. Runs on a cheap utility model or a local model on my own machine for $0. No reason to pay real money for work that doesn't require thinking

The routing logic is your org chart

Here's roughly how a task gets assigned:

Is this a high-stakes architecture or security decision for an anchor client? → premium tier

Is this real production work (building, coding, content, automations, debugging)? → Kimi 2.6

Is this a long multi-step agentic job running many iterations? → Kimi 2.6 (the per-step cost advantage compounds hard across iterations)

Is this cleanup, formatting, or boilerplate? → cheap/local tier
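The four questions above can be sketched as a small routing function. This is one possible encoding, not a fixed recipe: the task kinds, the Task fields, and the tier labels are hypothetical names, and the tier values are labels rather than real API model identifiers

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PREMIUM = "premium"        # Opus / GPT-5 class
    WORKHORSE = "kimi-2.6"     # tier label, not an API model id
    CHEAP = "cheap-or-local"


@dataclass
class Task:
    kind: str                  # e.g. "architecture", "security", "build", "format"
    anchor_client: bool = False
    agentic_loop: bool = False  # long multi-step job with many iterations


HIGH_STAKES = {"architecture", "security"}
CLEANUP = {"format", "rename", "boilerplate"}


def route(task: Task) -> Tier:
    """Apply the routing questions in order; the workhorse tier is the default."""
    if task.kind in HIGH_STAKES and task.anchor_client:
        return Tier.PREMIUM
    if task.kind in CLEANUP and not task.agentic_loop:
        return Tier.CHEAP
    # Real production work and long agentic loops both land here
    return Tier.WORKHORSE


print(route(Task("security", anchor_client=True)))
print(route(Task("build")))
print(route(Task("format")))
```

Making the workhorse the fall-through case is deliberate in this sketch: anything you forgot to classify goes to the cheap-but-capable tier instead of the expensive one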

Cost per real client task

Here's what the economics look like per real client task, illustratively (your numbers will vary by task and codebase, but the SHAPE is the point):
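As a rough way to see that shape for yourself, the sketch below prices one task using only the figures quoted earlier: $0.50/$2 per million tokens for Kimi 2.6, about 6x for a Sonnet-class default, and 20-30x for a frontier model. The token counts are a made-up example task, and applying one flat multiplier to input and output alike is a simplifying assumption

```python
# Per-million-token prices for Kimi 2.6 as quoted in this article (USD)
KIMI_INPUT_PER_M = 0.50
KIMI_OUTPUT_PER_M = 2.00

# Cost multipliers relative to Kimi 2.6, as quoted in this article: (low, high)
MULTIPLIERS = {
    "kimi-2.6": (1, 1),
    "sonnet-class": (6, 6),
    "frontier": (20, 30),
}


def kimi_task_cost(input_tokens, output_tokens):
    """Dollar cost of one task on Kimi 2.6 at the quoted per-token prices."""
    return (
        input_tokens / 1_000_000 * KIMI_INPUT_PER_M
        + output_tokens / 1_000_000 * KIMI_OUTPUT_PER_M
    )


def cost_range(base_cost, tier):
    """(low, high) cost of the same task on another tier."""
    low, high = MULTIPLIERS[tier]
    return base_cost * low, base_cost * high


# Hypothetical task: 400k input tokens, 50k output tokens
base = kimi_task_cost(400_000, 50_000)
for tier in MULTIPLIERS:
    low, high = cost_range(base, tier)
    print(f"{tier}: ${low:.2f} to ${high:.2f}")
```

Swap in your own token counts from a real job. The absolute numbers will move, the ratios between tiers won't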

Run those numbers across hundreds of tasks a month, across 14 clients, and the difference between defaulting to a premium model and defaulting to Kimi 2.6 is the difference between an inference bill that eats your margin and one that's a rounding error

The mistake most people make is hiring one expensive "employee" (running everything on a premium model) for every single task, including the ones an intern could do. The smart move is the roster: the right model for each job, with Kimi 2.6 doing the bulk because that's where the cost-to-quality math wins for real work

[ Now the part that actually matters, the money ] ↓↓↓

The Unit Economics (Why You Keep 90% of Every Check)

This is the section that separates this from every "start an agency" guide you've ever read

Let's do the real math. $40k MRR. Here's where it goes every month

Revenue: $40,000

Costs:

Kimi 2.6 inference (the bulk of all delivery): ~$240

Premium model for the 10% of high-stakes work: ~$110

Cleanup/local tier: ~$0

Infrastructure (hosting, automation platform, vector DB, servers): ~$180

Tools/SaaS (CRM, scheduling, comms, misc): ~$220

Total monthly opex: roughly $750

Read that again. I deliver $40,000 of client value on about $750 of cost. The Kimi 2.6 inference that replaced a 5-person team is about $240 of that, and even with the premium tier added, all model spend comes to roughly $350

That's a margin north of 90%. Realistically, after I account for payment processing, the occasional contractor I bring in for a true edge case, and taxes, I'm still keeping a margin that a staffed agency literally cannot reach, because their single biggest line item, payroll, is one I don't have
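Here is the cost table as a quick calculation, in case you want to plug in your own numbers. The line items are the ones listed above; the margin it computes is before payment processing, contractors, and taxes, which this sketch does not model

```python
REVENUE = 40_000  # monthly recurring revenue, USD

# Monthly operating costs as listed above, USD
OPEX = {
    "kimi_inference": 240,
    "premium_inference": 110,
    "cleanup_local": 0,
    "infrastructure": 180,
    "tools_saas": 220,
}


def total_opex(costs):
    """Sum of all monthly operating costs."""
    return sum(costs.values())


def operating_margin(revenue, costs):
    """Share of revenue left after operating costs (pre-tax, pre-fees)."""
    return (revenue - total_opex(costs)) / revenue


model_spend = OPEX["kimi_inference"] + OPEX["premium_inference"] + OPEX["cleanup_local"]

print(f"Total opex: ${total_opex(OPEX):,}")
print(f"Model spend: ${model_spend:,}")
print(f"Margin before fees and taxes: {operating_margin(REVENUE, OPEX):.1%}")
```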

Now here's the contrast that makes the whole thing land. Watch what happens if I run that exact same delivery load on a frontier model instead of Kimi 2.6

The production work that costs me ~$240/month on Kimi 2.6 would run roughly 6x higher on a Sonnet-class default, and 20-30x higher on Opus or GPT-5 for the heavy agentic loops. Call it $1,500-$5,000+/month depending on the mix. Suddenly my delivery cost isn't a rounding error, it's a real expense that scales with every client I add. Add the rate-limit problem (where I physically can't run my whole load on a frontier model without throttling) and the model breaks entirely at this client count
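A back-of-the-envelope version of that comparison, assuming production cost scales linearly with client count (a simplification) and using the ~$240/month baseline and the 6x and 20-30x multipliers from this article

```python
CLIENTS = 14
KIMI_PRODUCTION_PER_MONTH = 240  # USD, at the current 14 clients


def production_cost(multiplier, clients=CLIENTS):
    """Monthly production inference cost, scaled by price multiplier and client count.

    Assumes cost grows linearly with the number of clients.
    """
    return KIMI_PRODUCTION_PER_MONTH * multiplier * clients / CLIENTS


SCENARIOS = [
    ("Kimi 2.6", 1),
    ("Sonnet-class default", 6),
    ("frontier, low end", 20),
    ("frontier, high end", 30),
]

for label, multiplier in SCENARIOS:
    now = production_cost(multiplier)
    at_20 = production_cost(multiplier, clients=20)  # hypothetical growth to 20 clients
    print(f"{label}: ${now:,.0f}/month now, ${at_20:,.0f}/month at 20 clients")
```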

The 90% margin doesn't exist because I'm clever about pricing. It exists because my delivery runs on Kimi 2.6. Swap the engine and the entire business model collapses back into "normal agency that has to hire to grow"

That's the unlock. The reason I can stay solo at $40k MRR is that the cost of delivering the work fell through the floor while the value clients pay for stayed exactly the same. I'm pocketing the spread

[ Now let's fill the pipeline ] ↓↓↓

Client Acquisition Without a Sales Team

A 90% margin means nothing if you can't get clients. And no, I don't have a sales team. Here's how the pipeline actually fills, as one person

Inbound from content (my biggest channel)

I post about the work. Case studies, before/afters, "here's an automation that saved a client 20 hours a week" breakdowns. When you show real outcomes publicly, the right clients self-select and come to you pre-sold. This is slow to start and compounds forever. It's the single best use of the time AI freed up. I'm not grinding delivery anymore, so I can spend that energy being visible

A niche offer that sells itself

Because my offer is productized and specific, I'm not explaining a vague "we do AI stuff." I'm saying "I build AI support agents for X type of business, here's exactly what it does, here's what it costs, here's the result." Specific offers close themselves. Generic ones require convincing

Referral loops baked into delivery

Every happy client knows other business owners with the same problem. I make referrals frictionless: a simple ask at the moment a client is most delighted (right after a win), and a small incentive. At a 90% margin I can afford to be generous with referral rewards, which most agencies can't

Lightweight outbound

A targeted list, a sharp personalized message, a specific offer. I keep this small and surgical. I don't need 1,000 leads, I need to add one or two retainers a month. I'll use AI to help draft and personalize outreach at volume, but the targeting and the relationship stay human, because that's what actually closes

The whole philosophy here: keep acquisition light enough that it doesn't eat the time the delivery engine gave back. The trap would be to free up all that capacity and then drown myself in a heavy sales operation. I don't need volume. I need a steady trickle of the right clients into a system that serves them at near-zero cost

[ Now let's scale without breaking ] ↓↓↓

Staying Solo at Scale (Systems, Not People)

Here's the real danger once this starts working: you accidentally rebuild a job for yourself

You sign client 8, then 10, then 14, and even with AI doing production, the intake and QA start to pile up until you're back to working 12-hour days. The goal was never "do everything myself faster." It's "build a system that needs less of me over time." Here's how I keep it solo at 14 clients without losing my mind

Graduated skills: solve once, reuse forever

Every workflow I solve, I save. The first time I build a client support agent, it's real work: spec it, build it, QA it, deploy it. But I capture that entire process as a reusable skill: the prompts, the configs, the structure, the edge cases. The next time a client needs something similar, the system loads the skill and skips the discovery phase entirely. My 5th support-agent build costs me a fraction of the time and tokens of my 1st, because I'm not re-figuring-out anything. The agency gets faster and cheaper with every job it does
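One simple way to picture a "skill" is a saved bundle of prompts, config, and known edge cases that you can load by name. The sketch below stores each one as a JSON file. The Skill fields, the SkillLibrary class, and the example values are all hypothetical, not a specific tool's format

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Skill:
    name: str
    prompts: list
    config: dict
    edge_cases: list = field(default_factory=list)


class SkillLibrary:
    """Stores each skill as one JSON file in a directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, name):
        return self.root / f"{name}.json"

    def save(self, skill):
        self._path(skill.name).write_text(json.dumps(asdict(skill), indent=2))

    def load(self, name):
        """Return the saved skill, or None if this is a first-time build."""
        path = self._path(name)
        if not path.exists():
            return None
        return Skill(**json.loads(path.read_text()))


# Usage with a throwaway directory; a real library would live somewhere permanent
with tempfile.TemporaryDirectory() as tmp:
    library = SkillLibrary(tmp)
    library.save(
        Skill(
            name="support-agent",
            prompts=["Resolve the ticket using the client's help docs."],
            config={"ticket_types": 20},
            edge_cases=["refund requests go to a human"],
        )
    )
    reloaded = library.load("support-agent")  # found: skip discovery
    missing = library.load("lead-routing")    # None: needs a first build

print(reloaded.name, reloaded.config, missing)
```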

Background agents running delivery 24/7

A lot of my client work isn't a one-time build, it's ongoing. Monitoring, content generation, data processing, routine maintenance flows. These run as background agents on Kimi 2.6, continuously, while I sleep. Running persistent 24/7 agents is only economically sane because the per-token cost is so low. The same agents on a frontier model would cost hundreds per month each and the whole thing wouldn't pencil out. On Kimi 2.6, I can have continuous delivery running across every client for the cost of a couple dinners

Swarms: when one agent isn't enough

This is the part that actually unlocks running 14 clients without breaking. Kimi 2.6 ships with something Moonshot calls Agent Swarm. Instead of one agent grinding through every step of a job in sequence, the main agent splits the work into smaller pieces and runs up to 300 sub-agents in parallel, coordinated across 4,000 steps

The thing I like about it: the main agent picks its own workers on the fly. You're not pre-defining roles like "this one's the coder, this one's the QA, this one's the writer." It looks at the spec, decides what sub-jobs it needs, and spawns them. Moonshot calls it an AI-designed org chart instead of a human-designed one. Less wiring from me, more flexibility on every job

What that looks like in practice for the agency:

A monthly content batch for a client stops being "wait while Kimi writes 30 posts one by one." The main agent fans out, 15-20 sub-agents draft in parallel, another batch QAs them against the brand voice, a final one packages the output. The whole batch finishes in the time a single agent used to spend on the first 3 posts

A complex integration build splits into "spec the auth layer," "wire the webhook," "write the tests," "draft the docs," all running at once. I review the merged output instead of waiting through a serial pipeline

For monitoring work, sub-agents can sit on different parts of a client's system at the same time. One watches the support queue, one watches the error logs, one watches the data pipeline. They report up to a coordinator agent that only pings me if something actually needs human eyes
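To show the fan-out idea in code, here is a generic sketch using plain asyncio. It is NOT Moonshot's Agent Swarm API: the draft_post and review_post functions are stand-ins for model calls, and the cap of 20 parallel workers is an arbitrary assumption. It only illustrates drafting a batch in parallel and then reviewing each draft

```python
import asyncio


async def draft_post(topic: str) -> str:
    """Stand-in for a model call that drafts one post."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"draft: {topic}"


async def review_post(draft: str) -> dict:
    """Stand-in for a second call that checks a draft against the brand voice."""
    await asyncio.sleep(0)
    return {"draft": draft, "approved": bool(draft.strip())}


async def run_batch(topics, max_parallel=20):
    """Draft and review every topic concurrently, capped at max_parallel workers."""
    limit = asyncio.Semaphore(max_parallel)

    async def one(topic):
        async with limit:
            draft = await draft_post(topic)
            return await review_post(draft)

    # gather returns results in the same order as the input topics
    return await asyncio.gather(*(one(topic) for topic in topics))


results = asyncio.run(run_batch([f"post {i}" for i in range(30)]))
print(len(results), results[0])
```

The semaphore is the piece to keep when you adapt this: it lets you match parallelism to whatever rate limit your provider actually gives you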

The reason this matters for solo math: Agent Swarm is what lets one person run the parallel load of an actual team without queuing. Moonshot has shown internal Swarm runs going for hours, and in one case 5 straight days, handling incident response autonomously. That's not "AI helps me," that's "AI runs the night shift"

And the cost story still holds. A 300-agent Swarm sounds expensive until you remember each sub-agent is running on Kimi 2.6 economics. A run that would cost triple-digit dollars orchestrated on a frontier model often comes in under $5 here. The cheap per-token cost is what makes the swarm economically possible in the first place

My role shrinks to judgment and relationships

As the systems mature, what's left for me is the stuff that should stay human: the intake judgment, the QA eye, and the client relationships. That's it. I'm not the laborer, I'm not even really the builder anymore, I'm the architect and the quality bar. That's a role one person can hold across a lot of clients

The principle: every time something becomes repetitive, I systemize it instead of doing it again. People scale by adding humans. I scale by adding skills and agents. One of those compounds your costs. The other compounds your leverage

[ Now the honest part ] ↓↓↓

When to Actually Spend More (The Honest Limits)

I'm not going to pretend this is magic with no edges. If I did, you'd hit the limits yourself and feel lied to. So here's where the solo-on-Kimi model genuinely strains, and what I do about it

Some work needs the premium tier, not Kimi 2.6

For the 10% of work that's high-stakes (complex architecture for an anchor client, security-sensitive logic touching production data, a genuinely novel problem) I route to a premium model and pay the premium happily. The rule I use: if the cost of a wrong answer is more than 100x the model cost difference, use the expensive model. A blown integration on a $5k/month anchor client costs me far more than the few dollars I'd save running it cheap. Price the model to the cost of failure, not the cost of the call
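The 100x rule is easy to write down as a check. In this sketch the failure costs and per-call costs in the examples are hypothetical numbers, and how you estimate "cost of a wrong answer" is up to you

```python
def use_premium(cost_of_failure, cheap_call_cost, premium_call_cost, threshold=100):
    """Return True when a wrong answer costs more than threshold x the price gap.

    All amounts are in the same currency. If the premium call is not actually
    more expensive, there is nothing to trade off, so premium wins.
    """
    cost_difference = premium_call_cost - cheap_call_cost
    if cost_difference <= 0:
        return True
    return cost_of_failure > threshold * cost_difference


# Hypothetical: a blown integration risks one month of a $5k anchor retainer
print(use_premium(cost_of_failure=5_000, cheap_call_cost=0.30, premium_call_cost=9.00))

# Hypothetical: a bad formatting pass costs about a dollar of rework
print(use_premium(cost_of_failure=1, cheap_call_cost=0.30, premium_call_cost=9.00))
```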

Some work needs a human, not a model

Deep client strategy, a delicate relationship moment, a creative direction call, a true one-off that doesn't fit any system: that's me, or occasionally a contractor I bring in for a specific gap. I don't force AI onto work that genuinely needs human judgment just to protect the "solo" label

There's a real ceiling on QA capacity

This is the honest constraint. Even with production automated, I can only personally QA so much volume before quality slips. Right now 14 clients is comfortable. Somewhere north of that, I'd either need to raise prices and cap client count (my likely move), or bring in one trusted person purely for QA. Notice that the FIRST hire I'd ever consider is a quality checker, not a producer, because production is the part that's solved

Naming these limits isn't a weakness in the model. It's what keeps it credible and keeps it working. The point isn't "never spend money." It's "spend it only where it actually buys you something the system can't"

[ Now let's get you started ] ↓↓↓

The 90-Day Plan to Your First $10k MRR

You don't build $40k MRR in a weekend. But you can build the engine and your first few retainers in 90 days if you move in the right order. Here's the rollout

Phase 1 (Days 1-30): Nail the offer and land client #1

Pick ONE productized service in a niche you understand. Resist the urge to offer everything

Define the exact deliverable, the scope boundaries, and the retainer price. Write it down like a product page

Build the offer's core delivery once, by hand if you have to, so you understand the work intimately

Get your first client. Discount the first one if you must, in exchange for a case study and a testimonial

Milestone: 1 client, ~$1,500-2,500 MRR, and deep understanding of the work

Phase 2 (Days 31-60): Build the delivery engine

This is the structural phase. Turn the manual delivery from phase 1 into a system

Set up your model stack with Kimi 2.6 as the default workhorse for all production work. This is the single highest-leverage move in the entire 90 days. It's what makes every future client profitable instead of just billable

Route the 10% of high-stakes work to a premium model, and trivial cleanup to a cheap/local tier

Capture your first workflow as a reusable skill so client #2 is faster than client #1

Add 2-3 more clients using the case study from phase 1

Milestone: 3-4 clients, ~$8-12k MRR, a real (if rough) delivery engine

Phase 3 (Days 61-90): Systemize and compound

Every new client, capture the work as a skill. Your library of repeatable solutions grows

Move ongoing client work to background agents running on Kimi 2.6 so delivery happens without your hands on it

Start the content/referral flywheel for inbound, now that you have wins to show

Tighten QA into a fast, repeatable checklist so your time-per-client keeps dropping

Milestone: 5-7 clients, $10k+ MRR, and a system that needs less of you each week

From there it's repetition. Every client gets cheaper to deliver, your skill library gets deeper, your content pulls more inbound, and the margin stays north of 90% the entire way up because the delivery engine never needs a payroll

[ Your first move ] ↓↓↓

Do This in the Next 30 Minutes

You don't need a whole agency to feel what I'm talking about. You need one real task running on Kimi 2.6 today

Here's the 30-minute version:

Grab a Kimi 2.6 API key from Moonshot

Point a tool you already use (n8n, Make, Cursor, Claude Code, your own scripts) at it as a custom model

Take the most repetitive, token-heavy task on your plate right now and run it through Kimi 2.6 instead of whatever you default to

Then check two things: did the output ship, and what did it cost

Here's the minimal routing setup I started with, long before I had 14 clients:
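A minimal sketch of what a three-route setup like this can look like. The route labels and the pick_route helper are illustrative, and the model values are placeholders you fill in with whatever identifiers your provider and tools actually use

```python
# Three routes: a default workhorse, a premium tier, and a free cleanup tier.
# The "model" values are placeholders, not real model identifiers.
ROUTES = {
    "default": {"tier": "workhorse", "model": "<your-kimi-2.6-model-id>"},
    "high_stakes": {"tier": "premium", "model": "<your-premium-model-id>"},
    "cleanup": {"tier": "cheap", "model": "<your-local-or-utility-model-id>"},
}


def pick_route(label="default"):
    """Return the route for a label; anything unrecognized falls back to default."""
    return ROUTES.get(label, ROUTES["default"])


print(pick_route()["tier"])
print(pick_route("high_stakes")["tier"])
print(pick_route("cleanup")["tier"])
print(pick_route("something-unlabeled")["tier"])
```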

That's the whole thing. Default to Kimi 2.6, send the rare high-stakes call to a premium model, dump cleanup on something free

I'm not asking you to take my word for any of this. Run one real task through Kimi 2.6 and look at the result the next morning. The output ships and the bill barely moved. Once you've seen that happen with your OWN work, paying 6x for the same outcome just starts to feel stupid

That's the whole reason I default to it. Not loyalty. Math

[ The bigger picture ] ↓↓↓

The Bigger Picture

For a hundred years, the way you scaled a service business was people. More revenue meant more headcount, and the founders who won were the ones who could recruit, manage, and retain the biggest, best teams

That era is ending faster than anyone is admitting

When the production layer of your business can run on Kimi 2.6 for a couple hundred dollars a month instead of a team that costs thirty thousand, headcount stops being leverage and becomes a liability. The agency with 20 employees isn't more powerful than the solo operator with a sharp system. It's slower, heavier, and running on a fraction of the margin

In 2027, the gap between the agency owner clearing $12k on $40k MRR and the one clearing $37k on the same revenue won't be talent. It won't be clients. It'll be whether they figured out that delivery got automated and built their operation around that fact, or whether they kept hiring like it was 2019

I run an agency that does $40k MRR with no employees because I stopped trying to build a team and started building a system. Kimi 2.6 is the engine that does the work. I'm the one who decides what work to do

You're not too late to do this. You're early. Most people still think you need a team

Prove them wrong ❤️
