How Themis went from writing code to running the company

Six weeks ago, four engineers were the only people using Themis. Today, 34 people across engineering, operations, and customer service use it daily. Over half our code commits now have an AI co-author.

This wasn't the plan. We built Themis as an AI engineer — one that turns chat messages into pull requests. But once the rest of the company saw engineers shipping code from a Teams message, they started asking: "Can it do that for my work too?"

This is what happened when we said yes.

The numbers: one month of Themis in production

Before the story, the data. February was Themis's first full month across the team:

Metric	Value
Agent runs	2,594
Completion rate	97.5%
Total cost	$1,577
Code generations	383
PR reviews	174
Activities tracked	10,082

The number that matters most: AI coding penetration crossed 50%. More than half our commits are now co-authored by Themis. Not autocompleted suggestions that a developer accepted — complete implementations generated from issue descriptions, reviewed by humans, and merged through our standard PR workflow.

At $1,577 for a month of AI agent operations across the entire company, the cost barely registers against a single engineer's salary. The bottleneck has shifted from "can we afford this" to "how fast can we expand what it covers."
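As a rough check on those economics, the February figures can be turned into per-run and per-person averages. This is only a sketch: the 34-person headcount comes from the introduction, and dividing the total evenly across people is a simplification, since usage is unlikely to be evenly spread.

```python
# Figures taken from the February table and the introduction.
total_cost_usd = 1577
agent_runs = 2594
daily_users = 34  # assumption: every daily user counted equally

cost_per_run = total_cost_usd / agent_runs
cost_per_user = total_cost_usd / daily_users

print(f"Average cost per agent run: ${cost_per_run:.2f}")
print(f"Average monthly cost per user: ${cost_per_user:.2f}")
```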

What we shipped: from chat to platform

When we introduced Themis in February, it could generate code, review PRs, and respond to @mentions. Since then, we've shipped three major capabilities that changed who uses it and how.

Two-tier intelligence: fast by default, deep when needed

The first problem we hit at scale was cost and latency. Not every message needs a $0.60 agent run. When someone says "good morning" or asks "what's our sprint goal?", spinning up a full AI agent with tool access is wasteful.

We split Themis into two tiers. Tier 1 handles ~80% of interactions — simple questions, status lookups, casual conversation — in under one second at minimal cost. Tier 2 activates only when the request involves real engineering work: reading code, querying data, generating implementations.

The handoff is invisible. Users experience it as Themis "thinking a moment longer" on harder questions. This single architectural decision made it economically viable to put Themis in every team channel, not just engineering ones.
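To make the split concrete, here is a minimal sketch of a two-tier router. It is not how Themis actually decides: the keyword list, the `Route` type, and `route_message` are all hypothetical names invented for illustration, and a production router would more plausibly use a small classifier model than substring matching.

```python
from dataclasses import dataclass

# Hypothetical keyword list: a stand-in for whatever signal the real
# router uses to detect engineering work.
ENGINEERING_HINTS = ("code", "bug", "query", "implement", "pull request", "deploy")


@dataclass
class Route:
    tier: int
    reason: str


def route_message(text: str) -> Route:
    """Send a message to Tier 2 only if it looks like engineering work."""
    lowered = text.lower()
    for hint in ENGINEERING_HINTS:
        if hint in lowered:
            return Route(tier=2, reason=f"matched hint '{hint}'")
    # Default to the fast, cheap tier when nothing suggests real work.
    return Route(tier=1, reason="no engineering hint found")


for message in ("good morning", "Can you implement the booking fix?"):
    print(message, "->", route_message(message))
```

The design point is the default: anything that is not clearly engineering work stays on the cheap tier, so the expensive path has to be earned by the request.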

Automations: event-driven workflows that run themselves

The second unlock was automations — scheduled and event-triggered AI workflows. A daily digest that summarizes overnight Sentry alerts. A weekly report that pulls support SLA metrics from Metabase and flags breaches. A workflow that watches for merged PRs and updates the relevant Linear issues.

102 automation executions ran in the first month, with a 98% completion rate. These aren't cron jobs running scripts — they're AI agents with full context, reasoning about the data before deciding what to surface.
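One way to picture the difference between scheduled and event-triggered automations is a small registry that dispatches on either kind of trigger. Everything here is an illustrative assumption, including the `Automation` and `AutomationRegistry` names, the trigger strings, and the handlers, which return plain text where a real automation would start an agent run.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Automation:
    name: str
    handler: Callable[[dict], str]   # placeholder for starting an agent run
    schedule: Optional[str] = None   # e.g. "daily", for time-based triggers
    event: Optional[str] = None      # e.g. "pr.merged", for event triggers


class AutomationRegistry:
    def __init__(self) -> None:
        self._automations: list[Automation] = []

    def register(self, automation: Automation) -> None:
        self._automations.append(automation)

    def dispatch(self, trigger: str, payload: dict) -> list[str]:
        """Run every automation whose schedule or event matches the trigger."""
        results = []
        for automation in self._automations:
            if trigger in (automation.schedule, automation.event):
                results.append(automation.handler(payload))
        return results


registry = AutomationRegistry()
registry.register(Automation(
    name="sentry-digest",
    schedule="daily",
    handler=lambda p: f"Digest of {len(p.get('alerts', []))} overnight alerts",
))
registry.register(Automation(
    name="linear-sync",
    event="pr.merged",
    handler=lambda p: f"Update Linear issue for PR #{p['pr']}",
))

print(registry.dispatch("pr.merged", {"pr": 42}))
```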

Skills architecture: teaching Themis new domains

The third — and most consequential — change was skills. We refactored Themis from a monolithic agent into a skills-first architecture: portable, domain-specific knowledge modules that any agent run can load on demand.

This means we can teach Themis a new domain — product planning, growth strategy, real-estate marketing, support SLA monitoring — without touching the core platform. Write a skill file, define the tools it needs, and Themis can reason about that domain in any conversation.

This is the architectural decision that made the next section possible.
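A skills-first design along these lines might look roughly like the sketch below. The `Skill` and `SkillLibrary` classes, the skill names, and the instruction text are hypothetical; the post does not describe the actual skill file format, so treat this as one possible shape rather than the real one.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    instructions: str                      # domain knowledge, as prompt text
    tools: list[str] = field(default_factory=list)


class SkillLibrary:
    def __init__(self) -> None:
        self._skills: dict[str, Skill] = {}

    def add(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def load(self, names: list[str]) -> tuple[str, list[str]]:
        """Combine the requested skills into one prompt and one tool list."""
        selected = [self._skills[n] for n in names]
        prompt = "\n\n".join(s.instructions for s in selected)
        tools = sorted({t for s in selected for t in s.tools})
        return prompt, tools


library = SkillLibrary()
library.add(Skill("support-sla", "Flag tickets that breach response targets.", ["metabase"]))
library.add(Skill("product-planning", "Break requests into sprint-sized issues.", ["linear"]))

# An agent run loads only the skills the conversation needs.
prompt, tools = library.load(["support-sla", "product-planning"])
print(tools)
```

Because a skill is just data plus a list of tool names, adding a domain means adding an entry to the library, with no change to the code that runs the agent.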

Breaking out of engineering

The most surprising thing about the past six weeks wasn't a feature we shipped. It was watching non-engineers adopt Themis faster than we expected — and for use cases we didn't design for.

Operations: from days of work to minutes of conversation

Our CS team in Penang manages service quality across Asia — thousands of listings, multiple OTA platforms, multiple countries with different standards. Tracking SLA compliance, planning shift coverage, and spotting service degradation used to involve pulling data from multiple tools, cross-referencing spreadsheets, and hours of manual review.

Now they ask Themis. Themis queries our operational reporting database directly — occupancy rates, cleaning schedules, guest review scores, response times — and returns answers in the same Teams channel where the team coordinates daily work. One question, one answer, move on.

The shift planning alone — which used to take a manager half a day of spreadsheet work — now happens through a conversation.

Cross-team knowledge: the meeting minutes that became a sprint ticket

Here's a scenario that happened organically. An operations team in Japan held their weekly meeting and documented feedback about a booking flow issue in their Google Drive meeting notes. No one on the engineering team was in that meeting.

Through the Google Drive integration, Themis picked up those meeting minutes and offered to create a Linear issue for the next sprint backlog. Context that would have been lost in a document folder became an actionable ticket in minutes.

This is the kind of cross-team visibility that usually requires a dedicated program manager. We got it as a side effect of connecting Themis to where teams already store their knowledge.

Customer service: bug reports from a Teams message

This interaction captures the shift better than any architecture diagram. A CS team member spots a bug in production, @mentions Themis in the same Teams channel where they're already discussing it, and Themis:

Creates a structured bug report in Linear (PIP-1259)
Automatically links related issues it finds (PIP-1200, PIP-1206)
Assigns it to the right engineer
Updates the ticket when more context arrives

No one taught Jocelle, the CS team member in this case, how to file an engineering ticket. She just described the problem in her own words, in the tool she already uses, and the right things happened. That's the standard we're designing for.
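The four steps in that interaction map onto a simple ticket structure. The sketch below is an assumption about shape only: the `BugReport` class and its fields are hypothetical, the issue IDs are the ones mentioned above, and the title, description, assignee, and follow-up note are placeholders rather than the real ticket contents.

```python
from dataclasses import dataclass, field


@dataclass
class BugReport:
    issue_id: str
    title: str
    description: str
    assignee: str
    related: list[str] = field(default_factory=list)
    updates: list[str] = field(default_factory=list)

    def add_context(self, note: str) -> None:
        """Append follow-up context from the channel to the ticket."""
        self.updates.append(note)


report = BugReport(
    issue_id="PIP-1259",
    title="<placeholder title>",
    description="<the reporter's own words from the Teams message>",
    assignee="<placeholder engineer>",
    related=["PIP-1200", "PIP-1206"],
)

# Later messages in the same thread become ticket updates.
report.add_context("<placeholder follow-up from the channel>")
print(report.issue_id, report.related, len(report.updates))
```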

Executives: budget and forecasting in real-time

Our management team now uses Themis for budget reviews and business forecasting. Through the Metabase integration, Themis pulls KPIs — revenue by region, occupancy trends, operational costs — and combines them with sales pipeline data to build a complete picture.

What used to require collecting reports from multiple departments, waiting for someone to build a deck, and scheduling a review meeting now happens as a conversation. Ask a question, get the number, ask a follow-up, drill deeper. The time from "I want to know X" to "now I know X" collapsed from days to minutes.

Three lessons from scaling AI across a company

1. Meet people where they already work. The single biggest factor in adoption wasn't AI capability — it was accessibility. Themis succeeds because it lives in Teams and Telegram, not in a new app people have to learn. Every new tool you ask someone to open is friction. Every integration into an existing tool is adoption.

2. Economics determine scope. The two-tier architecture wasn't an optimization — it was an unlock. At $0.60 per agent run, you'd only use Themis for high-value engineering tasks. At $0.02 for simple interactions, you put it in every channel and let people discover use cases you never planned for.

3. AI adoption is a trust curve, not a launch event. Nobody went from "what is this" to "I depend on it" overnight. Engineers trusted it first because they could read the code it wrote. Managers trusted it when the numbers it pulled matched their spreadsheets. CS trusted it when the tickets it created were actually well-structured. Each successful interaction builds the next one.
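The arithmetic behind the second lesson can be sketched in a few lines. The $0.60 and $0.02 figures come from that lesson and the 80% Tier 1 share from the two-tier section; treating the remaining 20% as full-price agent runs is our simplifying assumption, so the result is an estimate rather than a measured number.

```python
tier1_share = 0.80   # share of interactions handled by Tier 1
tier1_cost = 0.02    # dollars per simple interaction
tier2_cost = 0.60    # dollars per full agent run

# Average cost per interaction when only 20% need a full agent run.
blended = tier1_share * tier1_cost + (1 - tier1_share) * tier2_cost

# Saving compared with sending every interaction to a full agent run.
savings = 1 - blended / tier2_cost

print(f"Blended cost per interaction: ${blended:.3f}")
print(f"Saving versus all-Tier-2: {savings:.0%}")
```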

One month in, 34 people, $1,577. The numbers are small. The shift is not. When AI stops being a developer's tool and becomes a team's infrastructure, the bottleneck moves from "who can code this" to "who can describe what they need." That's a fundamentally different company.

We're just getting started.