Case Study: The PM Agent Team

This is the story of one Compound Sprint, start to finish, with every artifact shown. We replaced a project coordinator role with an agent team — five AI agents handling the coordination work that had consumed twenty hours a week of human labor. It took about six weeks from Signal to Deliver, with a platform pivot in the middle that nearly derailed the whole thing. Every stage of the Compound Sprint is here with its artifacts and design decisions. If you’ve read the framework chapters and wondered what a full sprint looks like in practice, this is it.

The role belonged to a subcontractor we’ll call Rachel. She was hired in August 2024 for graphic design and content management, then transitioned into project coordination by September. The transition happened fast — within weeks she’d moved from creating visual assets to owning the operational backbone of how work moved through our project management system. By the time we decided to replace the role with agents, her responsibilities had expanded to cover project tracking, scorecard updates, meeting facilitation, social media scheduling, task creation from transcripts, video editing coordination, and QA work alongside our VP of Operations, Sofia.

The decision to replace the role wasn’t just structural. The work was overwhelmingly rule-based — the kind of work agents handle well. And Rachel’s performance had become a problem. She was slow, non-responsive, and increasingly free with her time without explanation. Sofia flagged it directly in one of our design sessions: “A lot of things are not done properly” and “it’s becoming a habit for her to be very free with her time and not give any explanations.” The constraint was the combination: rule-based work that agents can handle, performance issues with the person doing it, and $24K/year in subcontractor cost while we were falling behind on projects.

Signal: Naming the Constraint

The constraint showed up in our delivery before it showed up in any conversation. We were getting behind on projects, and that was hurting our billing: we couldn’t bill because we weren’t finishing work on time, so cash flow was suffering.

But the constraint wasn’t just financial. Sofia had been watching Rachel’s output deteriorate. She put it plainly: “She has a high salary, and for somebody that is working 20 hours a week, that’s not fair.” She went further: “If I don’t or we don’t see any change in the next few weeks, I wouldn’t even keep her.”

There wasn’t a single “this is the constraint” moment. The signal accumulated. Revenue was falling short. Rachel’s performance was declining. And the more we looked at what the role actually entailed, the clearer it became that most of the work was information movement — taking data from one place, putting it in another, following rules that didn’t require judgment. Sofia and I had been having versions of this conversation for months before we formalized it.

Artifact: Constraint Statement

Field | Detail
Constraint | A $24K/year subcontractor role is consuming budget on work that is 70-80% rule-based data operations (CRM updates, scorecard maintenance, task creation, scheduling), while the person in the role is underperforming and the company is falling behind on project delivery, causing billing delays and cash flow pressure.
Cost | $24K/year subcontractor cost + opportunity cost of not reallocating that budget to revenue-generating work + cost of slow/missed coordination on active projects
Owner | Jesse Flores / Sofia Reyes (VP of Operations, Human Orchestrator)
Evidence | Multiple leadership conversations documenting staffing issues, cash flow impact from delayed project delivery, and Sofia flagging Rachel’s performance and time management

Source: Mapping the Knowledge

Sofia was the key to this stage. She had trained Rachel, created her 90-day plan, and was accountable for operations and project delivery. If anyone knew what Rachel actually did — not the job description version, the real version — it was Sofia.

The source mapping happened organically through our design sessions. Sofia and I sat down for a two-hour working session that became the origin story for the PM Agent Team. We pulled up the Hybrid Accountability Chart, started walking through what needed to happen, and the picture became clear fast. Sofia summarized the shape of the work: “A lot of the effort that we expend is literally moving information from one place or one person to another. And there’s not really a lot of — there’s no value add other than coordination.”

That was the Source insight. The knowledge domains weren’t exotic. They were data operations wrapped in a job title.

But when we started listing what Rachel “did,” we realized we were conflating different things. Her accountability chart had roles on it — project coordination, QA support, content management — and those roles contained tasks, and the tasks had different characteristics. We needed to pull those apart before we could design anything.

Artifact: Roles, Tasks, and Accountability Deconstruction

Role (from Accountability Chart) | Tasks within Role | Task Type | Judgment Required | Current Owner
Project Coordination | Update CRM project records | Data entry | None: rule-based | Rachel
Project Coordination | Create tasks from project charters | Pattern extraction | Low: follows templates | Rachel
Project Coordination | Assign tasks to team members | Decision-making | Medium: requires capacity awareness | Rachel / Sofia
Project Coordination | Flag overdue and blocked work | Monitoring | None: data-driven | Rachel
Project Coordination | Manage project handoff process | Process execution | Low: follows checklist | Rachel
Scorecard & Reporting | Pull weekly scorecard numbers | Data aggregation | None: rule-based | Rachel
Scorecard & Reporting | Enter numbers into scorecard template | Data entry | None: rule-based | Rachel
Scorecard & Reporting | Interpret variance and flag issues | Analysis | Medium: requires context | Sofia
Meeting Support | Schedule and send agendas | Scheduling | None: rule-based | Rachel
Meeting Support | Capture action items from transcripts | Extraction | Low: pattern matching | Rachel
Meeting Support | Create CRM tasks from action items | Data entry | None: rule-based | Rachel
Content & Asset Management | Create visual assets | Creative production | High: brand judgment | Rachel / Lucas
Content & Asset Management | Schedule social media posts | Scheduling | Low: follows calendar | Rachel
Content & Asset Management | Coordinate video editing workflow | Creative coordination | Medium: editorial judgment | Rachel / Lucas
Client Operations | Support client onboarding process | Relationship + process | High: relationship context | Rachel / Sofia
Client Operations | Manage client communication cadence | Relationship | High: trust and context | Sofia
Quality Assurance | Execute QA checklists | Process execution | Medium: judgment calls | Rachel / Sofia

The deconstruction made the design decision obvious. The roles that looked like one job were actually a collection of tasks with wildly different judgment requirements. The low-judgment, rule-based tasks clustered together — and they were the majority of Rachel’s hours.

Artifact: Knowledge Map

Source | Type | Owner | Status | Pipeline | Notes
CRM project records | Digital | Rachel (maintained) / Sofia (accountable) | Clean | Connected (API available) | Core system of record for all project tracking
Scorecard data | Digital | Rachel (updated weekly) | Clean | Manual (Rachel pulled and entered numbers) | Numbers came from multiple sources; Rachel was the aggregator
Meeting transcripts | Digital | Automated (transcription service) | Needs Work | Manual (Rachel read transcripts and created tasks by hand) | Unstructured; required human interpretation to extract action items
Visual asset files | Digital | Rachel | Clean | Google Drive / Canva | Original hire scope; became secondary after coordinator transition
Client onboarding process | Organic | Sofia / Rachel | Needs Work | Manual | Some documented, some in Rachel’s head
QA checklists and processes | Organic | Sofia / Rachel | Partially documented | Manual | Sofia and Rachel coordinated; Sofia retained the judgment calls
Project handoff procedures | Organic | Rachel | Needs Work | Manual | Rachel had a documented handoff process but unclear how complete
Client relationship context | Organic | Rachel | At Risk | None (lived in Rachel’s head) | Which clients needed extra lead time, unwritten rules per account
Video editing workflow | Organic | Rachel / Lucas | Coordination role | Manual | Rachel was coordinating editors, not editing herself

Artifact: Source Classification

Source | Structured / Unstructured | Durable / Ephemeral | AI Tier
CRM project records | Structured | Durable | Tier 1: AI can use directly via API
Scorecard data | Structured | Durable | Tier 1: once pipeline is automated
Meeting transcripts | Unstructured | Durable | Tier 2: AI can process with extraction pipeline
Visual asset files | Unstructured | Durable | Tier 3: creative judgment required for creation; storage/retrieval is Tier 1
Client onboarding process | Unstructured | Durable (if documented) | Tier 2: needs capture before AI can execute
QA checklists | Semi-structured | Durable | Tier 1 if documented; Tier 3 if judgment-dependent
Project handoff procedures | Semi-structured | Durable | Tier 2: needs formalization
Client relationship context | Unstructured | Ephemeral | Tier 3: requires human capture; some elements can become Tier 2
Video editing workflow | Unstructured | Durable | Tier 2: coordination is automatable; editorial judgment is not

Artifact: Source Completeness Test

Check | Result
Designer test. Could someone who wasn’t in the room look at this map and know what they’re designing against? | Yes: sources, types, owners, and pipeline status are all named. The Roles/Tasks deconstruction makes it clear which work is rule-based and which requires judgment.
Constraint scope test. Every row connects to the constraint. | Yes: every source feeds Rachel’s coordination work. No extraneous systems included.
Pipeline test. Every digital source has a pipeline status. | Yes: CRM is Connected, scorecard is Manual, transcripts are Manual, assets are Connected.
At-risk test. Single points of failure are named. | Yes: client relationship context is flagged as At Risk (lives in Rachel’s head, no capture pipeline).
Gap test. The Missing column isn’t empty. | Gap identified: no historical data on task estimation accuracy (would validate whether AI-generated estimates match human judgment). No documented exception-handling rules for client-specific processes.
One-page test. The map fits on one page. | Yes: scoped to the coordination constraint, not a general data audit.
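
Two of the checks above, the pipeline test and the at-risk test, are mechanical enough to sketch in code. The following is a minimal illustration in Python (chosen for brevity; the production agents are Node.js), using a few rows from the Knowledge Map. The `Source` class and both function names are hypothetical, not part of the actual system.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    kind: str      # "Digital" or "Organic"
    status: str    # e.g. "Clean", "Needs Work", "At Risk"
    pipeline: str  # e.g. "Connected", "Manual", "None"; "" means not recorded


def pipeline_test(sources):
    """Return the digital sources that have no pipeline status recorded."""
    return [s.name for s in sources if s.kind == "Digital" and not s.pipeline.strip()]


def at_risk_sources(sources):
    """Return the sources flagged as single points of failure."""
    return [s.name for s in sources if s.status == "At Risk"]


# A subset of the Knowledge Map rows, transcribed from the table.
KNOWLEDGE_MAP = [
    Source("CRM project records", "Digital", "Clean", "Connected"),
    Source("Scorecard data", "Digital", "Clean", "Manual"),
    Source("Meeting transcripts", "Digital", "Needs Work", "Manual"),
    Source("Client relationship context", "Organic", "At Risk", "None"),
]

missing = pipeline_test(KNOWLEDGE_MAP)    # empty list means the test passes
at_risk = at_risk_sources(KNOWLEDGE_MAP)  # named single points of failure
```

An empty `missing` list corresponds to a "Yes" on the pipeline test; a non-empty `at_risk` list is what the at-risk test wants to see named explicitly.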

Design: Allocating the Work

The design happened across two sessions — both with Sofia, totaling about three hours.

The first session was the big one. We pulled up the Hybrid Accountability Chart, started walking through every function, and Sofia started articulating the shape of the problem. The conversation wasn’t “who should we hire?” It was “what does this role actually do, and how much of it requires a human?”

I framed it as a manufacturing problem. If you think about digital work the way you think about physical production — input, transformation, output — then most of what Rachel was doing was assembly-line work:

“If you thought about digital work as being manufactured the same way you would physical work — if we thought about making a car — you’ve got to source raw materials, you’ve got to have some machining done, you’ve got to start assembling different pieces. The only reason robots work in a factory is because the processes are so well thought through that it’s really easy for a robot to say, once something comes here, I get this thing, I do this thing, and then I move it over here. Input, transformation, output.”

That framing — the information factory — became the mental model for how we designed the agent team. If you could map the inputs, transformations, and outputs for each piece of Rachel’s work, you could figure out which robots to build.

Sofia got it immediately: “I think if this goes well, it will be only a matter of figuring out — yeah, doing a couple of iterations. But I think this is where you can see, man, there’s so many interesting applications that could work for so many different kinds of companies.” She also flagged the prerequisite: “It kind of depends how well you manage your initial data. We know we have almost everything in the CRM. So that makes it easier. I think other companies will have to face that question first — where’s our information? How good is it?”

In that session, I designed four agents on a whiteboard. I didn’t just describe them; I defined them. The difference matters. Describing an agent is saying “we need something that creates tasks.” Defining an agent means specifying its inputs, outputs, triggers, and guardrails. Every agent went through both stages in that session.

Artifact: Agent Description to Definition

Agent | Described (Conceptual) | Defined (Spec’d)
Task Agent | “We need something that creates tasks from project charters” | Input: Project charter + milestone definitions from CRM API. Output: Complete task descriptions with SMART outcomes, point estimates, dependency ordering, written back to CRM via API. Trigger: Daily poll at 7:15am for projects needing tasks. Guardrails: Cannot modify existing tasks; creates only. All generated tasks posted to Slack for review.
Coordination Agent | “We need something that assigns work and flags problems” | Input: All open tasks + team capacity data + skills matrix from CRM. Output: Task assignments, conflict flags, daily summary posted to Slack, review task for Sofia. Trigger: Daily at 7:45am. Guardrails: Cannot override Sofia’s manual assignments. Excluded users list (Sofia, Jesse). Creates a daily review task so Sofia sees everything before the team does.
Reporting Agent | “We need something that generates scorecards automatically” | Input: CRM project data + time tracking data (Kimai). Output: Daily reports (completed tasks, overdue items, open cases) + weekly reports (cost variance, schedule variance, team productivity). Uses earned value management: EV = estimated points x completion %, AC = hours logged. Trigger: Daily at 7:30am, weekly summary on Fridays. Guardrails: Read-only access to source systems. Reports posted to Slack, not sent to clients.
Garbage Collector | “We need something that cleans up bad data” | Input: Full scan of CRM task and project descriptions. Output: Flagged items with specific quality issues (missing descriptions, incomplete project definitions, unestimated tasks). Trigger: Daily at 7:00am. Guardrails: Cannot modify data; flags only. Sofia reviews flagged items and decides what to fix.
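
The described-versus-defined distinction can be expressed as a data structure: an agent counts as defined only when all four fields are filled in. This is a minimal sketch in Python (for brevity; the production agents are Node.js). The `AgentSpec` class and `is_defined` method are hypothetical names; the Task Agent values are transcribed from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    inputs: tuple
    outputs: tuple
    trigger: str
    guardrails: tuple

    def is_defined(self) -> bool:
        """Defined means inputs, outputs, trigger, and guardrails are all specified."""
        return all([self.inputs, self.outputs, self.trigger, self.guardrails])


TASK_AGENT = AgentSpec(
    name="Task Agent",
    inputs=("Project charter", "Milestone definitions from CRM API"),
    outputs=("Task descriptions with SMART outcomes", "Point estimates", "Dependency ordering"),
    trigger="Daily poll at 7:15am for projects needing tasks",
    guardrails=("Cannot modify existing tasks; creates only", "Generated tasks posted to Slack for review"),
)

# The conceptual version: a name and an intent, nothing an engineer could build against.
DESCRIBED_ONLY = AgentSpec(
    name="Something that creates tasks",
    inputs=(),
    outputs=(),
    trigger="",
    guardrails=(),
)
```

`TASK_AGENT.is_defined()` is true and `DESCRIBED_ONLY.is_defined()` is false, which is the gap the whiteboard session closed for each of the four agents.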

The second session continued the design work. We reviewed the diagram, talked through capacity planning, and I pushed Sofia to think about the data pipelines the agents would need: where each agent gets its input and where it sends its output. Sofia raised a practical concern about how far ahead the Coordination Agent needed to plan: “I gave it a couple weeks, and for now, I think that’s enough.” I’d suggested 90 days; she was right to scope it down for the first version.

I named the whole thing on the whiteboard: “PM Agent Team.” Sofia would take the design and build from where I left off.

Artifact: Work Deconstruction

Task | Before (Owner) | After (Owner) | Source / Input | Output / Destination | Rationale
CRM record updates | Rachel (manual) | Agent: Task Agent + Coordination Agent | Project charters, milestone definitions via CRM API | Updated CRM records, Slack confirmation post | Rule-based data entry; structured source, API available
Scorecard maintenance | Rachel (manual pull + entry) | Agent: Reporting Agent | CRM project data + Kimai time tracking data | Scorecard reports posted to Slack + archived in CRM | Data aggregation from known sources; no judgment required
Task creation from transcripts | Rachel (manual) | Agent: Task Agent | Project charters in CRM | SMART task descriptions written to CRM via API + Slack summary | Pattern extraction from project charters; AI handles well
Social media scheduling | Rachel (manual) | Agent: Content Scheduling Agent | Content calendar + approved assets in Google Drive | Scheduled posts in social platform + confirmation to Slack | Calendar-based; content decisions stay human, scheduling doesn’t
Meeting coordination | Rachel | Agent: Coordination Agent | Calendar events + agenda templates | Agendas sent to attendees + action items extracted to CRM tasks | Scheduling and agenda prep are rule-based; facilitation judgment stays human
Client onboarding coordination | Rachel / Sofia | Human (Sofia) + Agent assist | Client intake form + onboarding checklist in CRM | CRM client record + welcome sequence triggered + onboarding tasks created | Relationship component stays human; checklist execution goes agent
QA coordination | Rachel / Sofia | Human (Sofia) | QA checklists + project deliverables | QA sign-off records in CRM | Judgment-driven; Sofia retained full ownership
Visual asset creation | Rachel | Agent with Lucas review | Brand guidelines + creative brief | Draft assets in Google Drive for Lucas review | AI generates, human reviews for brand consistency
Video editing coordination | Rachel | Human (Lucas) | Raw footage + editorial direction from Sofia | Edited video delivered to client | Creative coordination requires relationship and editorial judgment
Project handoff management | Rachel | Agent: Coordination Agent | Handoff checklist + project records in CRM | Handoff completion record in CRM + notification to receiving team | Process-driven; automated if procedures are documented
Client relationship management | Rachel | Human (Sofia / account manager) | Client communication history + CRM contact records | Client emails, meeting notes logged in CRM | Judgment, trust, context: stays human

Artifact: Hybrid Accountability Chart Entry

Function | Human Role | Who | Agent Role | Which Agent
Task creation and estimation | Reviews and approves generated tasks | Sofia | Generates tasks from charters, estimates points, orders dependencies | Task Agent
Daily work assignment | Makes strategic prioritization calls | Sofia | Assigns tasks based on skills/capacity, flags conflicts | Coordination Agent
Progress reporting | Interprets reports, makes decisions | Jesse / Sofia | Generates daily and weekly variance reports | Reporting Agent
Data quality | Reviews flagged items | Sofia | Scans for incomplete descriptions, bad data | Garbage Collector
Client communication | Owns all client relationships | Sofia | Drafts status updates for review | Coordination Agent
QA coordination | Full ownership | Sofia | None | N/A
Visual assets | Reviews for brand consistency | Lucas | Generates drafts | AI generation tools
Video editing | Full ownership of creative direction | Lucas | None | N/A
Strategic decisions | Capacity allocation, project prioritization | Jesse / Sofia | Provides data for decisions | Reporting Agent

Artifact: Information Flow — Client Communication

The Client Communication row in the Hybrid Accountability Chart involves the tightest collaboration between human and agent. Sofia owns all client relationships, but the Coordination Agent drafts status updates, pulls project data, and prepares the information Sofia needs to communicate. The swim lane below shows how information moves through that accountability.

Swim lane diagram: PM Agent Team, Client Communication accountability. Autonomy level: AI-Assisted. Human lane and agent lane with handoff points. Steps in order: receives client request, HANDOFF, monitors channels, drafts response, HANDOFF, reviews agent draft, sends final communication, HANDOFF, logs interaction.

The handoff points are where governance matters most. The Coordination Agent can pull data and draft — it cannot send anything to a client. Sofia reviews every outbound communication. That boundary is non-negotiable.
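
The boundary described above (agents draft, a human reviews and sends) can be enforced in code, not left to convention. Here is a minimal sketch in Python (for brevity; the production agents are Node.js). Every name in it, including `Draft`, `approve`, `send`, and `ApprovalRequired`, is hypothetical and stands in for whatever the real system uses.

```python
from dataclasses import dataclass
from typing import Optional


class ApprovalRequired(Exception):
    """Raised when an outbound client message violates the send boundary."""


@dataclass
class Draft:
    client: str
    body: str
    approved_by: Optional[str] = None  # set only by a human reviewer


def approve(draft: Draft, reviewer: str) -> Draft:
    """Record the human review that happens at the handoff point."""
    draft.approved_by = reviewer
    return draft


def send(draft: Draft, sender_is_agent: bool) -> str:
    """Send a client message; agents are never allowed through this path."""
    if sender_is_agent:
        raise ApprovalRequired("agents may draft but never send")
    if draft.approved_by is None:
        raise ApprovalRequired("draft has not been reviewed")
    return f"sent to {draft.client}"
```

The design choice worth noting is that the agent check comes first: even an approved draft cannot be sent by an agent, which matches the kill switch condition for this domain.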

Artifact: Governance and Guardrails

Domain | What Agents Can Do | What Agents Cannot Do | Escalation Trigger | Review Cadence | Kill Switch Condition
Task creation | Generate task descriptions with SMART outcomes, point estimates, and dependency ordering from project charters | Modify or delete existing tasks; override manually created tasks | Task Agent generates a task with zero confidence on scope or estimation | Sofia reviews generated tasks daily via Slack summary | Agent enters an infinite loop or generates more than 50 tasks in a single run
Work assignment | Assign tasks based on skills matrix and capacity data; flag conflicts | Override Sofia’s manual assignments; assign work to excluded users (Sofia, Jesse) | Coordination Agent detects a scheduling conflict it cannot resolve | Sofia reviews assignments daily before the team sees them | Agent assigns work to excluded users or assigns the same task to multiple people
Reporting | Generate daily and weekly variance reports using earned value management | Send reports directly to clients; modify source data in CRM or time tracking | Report shows variance exceeding 20% on cost or schedule | Jesse and Sofia review weekly reports every Friday | API token spend exceeds $50 in a single day or agent fails to post for two consecutive days
Data quality | Scan for incomplete descriptions, missing estimates, and data quality issues | Modify any data (flagging only) | Garbage Collector flags more than 30% of active tasks as deficient | Sofia reviews flagged items weekly | Agent attempts a write operation on any record
Client communication | Draft status updates; pull project data for Sofia’s review | Send any communication to a client; access client email directly | Draft contains language Sofia hasn’t approved or references confidential data | Sofia reviews every draft before sending | Agent sends any outbound communication without human approval
Reminders | Send personalized Slack DMs to team members listing overdue and blocked tasks | Contact anyone outside the internal team; escalate on its own | Team member reports receiving inaccurate or duplicate reminders | Sofia reviews reminder logs weekly | Agent sends more than 10 DMs to a single person in one day
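
Three of the kill switch conditions in the table are numeric thresholds, so they can be checked after every run. This is a minimal sketch in Python (for brevity; the production agents are Node.js). The thresholds come from the table; the function name, its parameters, and the shape of the run data are assumptions for illustration.

```python
from collections import Counter

# Thresholds taken from the Governance and Guardrails table.
MAX_TASKS_PER_RUN = 50
MAX_DAILY_SPEND_USD = 50.0
MAX_DMS_PER_PERSON_PER_DAY = 10


def kill_switch_reasons(tasks_created, daily_spend_usd, dm_recipients):
    """Return the kill switch conditions that were tripped (empty list if none).

    dm_recipients is a list with one entry per DM sent today.
    """
    reasons = []
    if tasks_created > MAX_TASKS_PER_RUN:
        reasons.append(f"generated {tasks_created} tasks in a single run")
    if daily_spend_usd > MAX_DAILY_SPEND_USD:
        reasons.append(f"API spend ${daily_spend_usd:.2f} in a single day")
    for person, count in sorted(Counter(dm_recipients).items()):
        if count > MAX_DMS_PER_PERSON_PER_DAY:
            reasons.append(f"sent {count} DMs to {person} in one day")
    return reasons
```

A non-empty result would stop the agent and hold it until Sofia or Jesse investigates, as the guardrails require. All three comparisons are strict, matching the table's "more than" and "exceeds" wording.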

Artifact: Design Brief — PM Agent Team

This is the design brief as it existed when we moved from Design into Build. It captures the full picture of what the agent team would do, who it affected, and what it needed to work.

Workflow Summary

The PM Agent Team replaces the coordination work previously done by a subcontractor. Five agents run on weekday mornings between 7:00am and 7:45am, each handling a specific coordination function: task creation from project charters, daily work assignment and conflict flagging, variance reporting with earned value management, personalized reminders for overdue and blocked work, and data quality scanning. All agent output posts to Slack for human review. Sofia Reyes supervises the team as Human Orchestrator, monitoring by exception rather than approving every action.

Stakeholders

Stakeholder | Relationship to Agent Team
Sofia Reyes (VP of Operations) | Human Orchestrator: supervises all agents, reviews output daily, intervenes on exceptions, owns client relationships
Jesse Flores | Strategic oversight: sets policy, reviews weekly reports, makes capacity allocation decisions based on agent-generated data
Lucas (Creative Lead) | Receives task assignments from Coordination Agent; reviews AI-generated visual assets for brand consistency
Delivery team members | Receive daily Slack reminders and task assignments; interact with agent output without needing to know it’s agent-generated
Clients | Indirect: receive faster status updates and more consistent project delivery; never interact with agents directly

Systems Involved

System | Role in Agent Team | Access Method
CRM (EspoCRM-based) | System of record for all projects, tasks, contacts, and cases | REST API with scoped API keys per agent
Kimai (time tracking) | Source of actual hours logged; feeds earned value calculations | API integration
Slack | Output channel for all agent summaries, reminders, and flags | Slack API with bot token
Claude API | AI reasoning for task generation, estimation, assignment, and reporting | Direct HTTPS calls; Sonnet for Task Agent, Opus for Coordination Agent
Google Drive / Canva | Storage for visual assets and brand materials | Connected via existing integrations
Node.js agent server | Runtime environment for all agents; handles scheduling, API calls, error recovery | Self-hosted; each agent runs as a standalone service

Data Requirements

Data | Source | Freshness Required | Access
Project charters and milestone definitions | CRM | Real-time (API poll) | Task Agent reads at 7:15am daily
Open tasks with status, assignee, and estimates | CRM | Real-time | Coordination Agent reads at 7:45am daily
Team skills matrix and capacity data | CRM | Updated weekly by Sofia | Coordination Agent reads for assignment logic
Hours logged per task | Kimai | Previous day’s entries | Reporting Agent reads at 7:30am daily
Task and project descriptions | CRM | Real-time | Garbage Collector reads at 7:00am daily
Overdue and blocked task flags | CRM | Real-time | Reminder Agent reads at 7:00am daily
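
The read times in this table amount to a fixed weekday schedule. The sketch below, in Python (for brevity; the production agents are Node.js), writes that schedule down and shows the equivalent standard five-field cron expression for each slot. The `SCHEDULE` mapping and both function names are hypothetical; the times are taken from the table.

```python
from datetime import datetime

# Agent -> (hour, minute), Monday to Friday only.
SCHEDULE = {
    "Garbage Collector": (7, 0),
    "Reminder Agent": (7, 0),
    "Task Agent": (7, 15),
    "Reporting Agent": (7, 30),
    "Coordination Agent": (7, 45),
}


def cron_expression(hour: int, minute: int) -> str:
    """Five-field cron: minute, hour, day of month, month, day of week (1-5 = Mon-Fri)."""
    return f"{minute} {hour} * * 1-5"


def agents_due(now: datetime):
    """Return the agents scheduled for this exact minute, in alphabetical order."""
    if now.weekday() >= 5:  # Saturday or Sunday
        return []
    return sorted(
        name for name, (hour, minute) in SCHEDULE.items()
        if (hour, minute) == (now.hour, now.minute)
    )
```

For example, `cron_expression(7, 15)` yields `15 7 * * 1-5`, the slot in which the Task Agent reads project charters. The staggering means the Coordination Agent, which runs last, sees whatever the Task Agent created the same morning.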

Success Criteria

Criterion | Measurement | Target
Cost reduction | Annual spend on coordination work | From $24K/year to under $8K/year
Report timeliness | Daily reports posted to Slack by 8:00am | 95% on-time delivery
Task creation accuracy | Percentage of agent-generated tasks Sofia approves without edits | Above 80% within first month
Assignment accuracy | Percentage of assignments Sofia does not override | Above 85% within first month
Data quality improvement | Percentage of active tasks with complete descriptions and estimates | Increase from baseline within 60 days
Team satisfaction | Qualitative feedback from delivery team on task clarity and communication speed | Positive or neutral: no degradation from previous state
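
Three of these criteria are simple ratios, so they can be tracked from counts Sofia already produces when she reviews output. This is a minimal sketch in Python (for brevity; the production agents are Node.js). The targets are taken from the table; the function names, the criterion keys, and the reading of "above" as a strict comparison and "95% on-time" as at-least are assumptions.

```python
# Targets from the Success Criteria table, as fractions.
TARGETS = {
    "task_creation_accuracy": 0.80,  # tasks approved without edits, "above 80%"
    "assignment_accuracy": 0.85,     # assignments not overridden, "above 85%"
    "report_timeliness": 0.95,       # reports posted by 8:00am, "95% on-time"
}


def rate(hits: int, total: int):
    """Share of items that met the criterion, or None if nothing was measured."""
    return None if total == 0 else hits / total


def meets_target(criterion: str, hits: int, total: int) -> bool:
    """Compare an observed rate with its target; no data counts as not met."""
    observed = rate(hits, total)
    if observed is None:
        return False
    if criterion == "report_timeliness":
        return observed >= TARGETS[criterion]  # at least 95%
    return observed > TARGETS[criterion]       # strictly above the target
```

As a hypothetical example, if Sofia approved 41 of 50 generated tasks without edits, `meets_target("task_creation_accuracy", 41, 50)` would be true at 82%.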

Constraints and Guardrails

Agents operate on fixed schedules only — no event-driven triggers in v1. All agent output posts to Slack before reaching the team. Agents cannot modify existing data unless explicitly designed to do so (Task Agent creates only; Garbage Collector flags only). No agent communicates with clients. Sofia reviews all output daily. API token spend is monitored; any single-day spend exceeding $50 triggers an alert. If an agent enters an infinite loop, exhausts memory, or generates runaway API calls, it is killed and does not restart until Sofia or Jesse investigates.

Autonomy level: Automated (human-on-the-loop). Sofia monitors by exception.

Three ways a human can relate to an agent team’s work:

  • Human-in-the-loop. The human approves every action before it executes. Nothing ships without a human sign-off. This is AI-assisted mode — the agent drafts, the human decides.
  • Human-on-the-loop. The agents run autonomously on schedule. The human reviews output and intervenes when something is wrong. This is monitoring by exception — you’re not approving every action, you’re catching the ones that go sideways.
  • Human-over-the-loop. The human sets policy and strategy. Agents execute within those boundaries. The human reviews aggregate results periodically — weekly reports, monthly trends — not individual outputs.

We started the PM Agent Team at human-on-the-loop. Sofia reviews agent output daily, intervenes when something looks off, and trusts the system to handle the routine correctly. She doesn’t approve every task assignment or every report — she reads the Slack summaries and acts on exceptions.
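
The three relationships differ mainly in when the human looks at the work. A minimal sketch in Python (for brevity; the production agents are Node.js) makes that explicit. The enum and function names are hypothetical; the review descriptions paraphrase the list above.

```python
from enum import Enum


class Autonomy(Enum):
    IN_THE_LOOP = "human-in-the-loop"
    ON_THE_LOOP = "human-on-the-loop"
    OVER_THE_LOOP = "human-over-the-loop"


# What the human reviews at each level.
REVIEW_MODE = {
    Autonomy.IN_THE_LOOP: "approves every action before it executes",
    Autonomy.ON_THE_LOOP: "reviews output after the run and intervenes on exceptions",
    Autonomy.OVER_THE_LOOP: "sets policy and reviews aggregate results periodically",
}


def needs_approval_before_execution(level: Autonomy) -> bool:
    """Only human-in-the-loop blocks an action until a human signs off."""
    return level is Autonomy.IN_THE_LOOP


# The level the PM Agent Team started at, per the design brief.
PM_AGENT_TEAM_LEVEL = Autonomy.ON_THE_LOOP
```

Under this framing, moving an agent between levels is a configuration change, not a redesign, which is what makes a gradual increase in autonomy practical.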

Human Orchestrator: Sofia Reyes. She built it, she runs it, she’s accountable for it.

Artifact: Design Gate Checklist

Gate Item | Status | Notes
Constraint validated with numbers | Yes | Cash flow impact from delayed project delivery + $24K/yr subcontractor cost documented
Knowledge Map complete | Yes | Sources mapped across digital and organic; Completeness Test passed
Every accountability assigned (human or agent) | Yes | Work Deconstruction table complete; all items assigned
Human Orchestrator named | Yes | Sofia Reyes
Autonomy level set | Yes | Automated (human-on-the-loop) for all agents
Guardrails defined | Yes | Agents run on schedule, post to Slack for review; Sofia checks daily
Escalation path documented | Yes | Agents flag conflicts and overdue items; Sofia triages
Data access scoped | Yes | CRM API, time tracking system, Slack, code repositories

Build: What Got Built

The build happened in two phases, and the first one failed.

Phase 1: n8n (September 2025 - February 2026). We started with n8n — a workflow automation tool — running scheduled workflows that fetched API data and fed it to AI for analysis. In the first design session, I was still designing around n8n. I drew the architecture on the whiteboard — a CRM workflow triggering an n8n webhook, an AI agent in n8n with tools for searching tasks and making API calls back to the CRM.

It was brittle. The webhook approach worked for simple triggers but fell apart when we needed agents to chain decisions, maintain context across multiple API calls, and handle the kind of error recovery that production systems demand. We couldn’t get it reliable enough to trust.

Phase 2: Agent Server (late February - March 2026). We abandoned n8n and built a dedicated agent server. Sofia drove the build.

Here’s something worth pausing on: Sofia isn’t a developer. She’s an operations leader who learned enough about Claude Code and agent teams to build this system. She had me to help, but she didn’t need much. Once she learned the tools and had some pre-configured skills, her domain expertise let her build the suite better than I could have. She knew the CRM inside and out, understood every edge case in the coordination workflow, and could spec the agents’ behavior from lived experience rather than documentation. Design work, and even the build itself, can be done by people who aren’t traditionally technical, provided they’re willing to learn the tools and have deep knowledge of the domain.

The main agent suite consisted of five agents, each running as a standalone Node.js service with built-in cron scheduling:

Agent | Schedule | What It Does
Task Agent | 7:15am Mon-Fri | Polls for projects needing tasks, gathers context from the CRM, calls Claude to generate task descriptions with SMART outcomes, point estimates, and dependency ordering. Writes everything back to the CRM via API.
Coordination Agent | 7:45am Mon-Fri | Gathers all open tasks, uses Claude to estimate unestimated tasks and assign unassigned ones based on team skills and capacity. Posts flags and summary to Slack. Creates a daily review task for Sofia.
Reporting Agent | 7:30am Mon-Fri | Generates daily reports (completed tasks, overdue items, open cases) and weekly reports (project cost variance, schedule variance, team productivity, case responsiveness). Uses earned value management: EV = estimated points x completion %, AC = hours logged from time tracking.
Reminder Agent | 7:00am Mon-Fri | Sends personalized Slack DMs to each team member listing overdue and blocked tasks. Flags completed tasks missing point estimates or evidence of completion. No AI needed; purely data-driven.
Garbage Collector | 7:00am (via n8n) | Scans for data quality issues: poor task descriptions, incomplete project definitions.
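
The Reporting Agent's earned value arithmetic is small enough to show in full. The sketch below, in Python (for brevity; the production agents are Node.js), uses the two definitions given in the table: EV = estimated points x completion % and AC = hours logged. Everything else is an assumption for illustration: the hours-per-point conversion, the use of the standard earned value definition of cost variance (CV = EV - AC), the choice of EV as the base for the percentage, and all function names. The 20% threshold is the escalation trigger from the governance table.

```python
def earned_value(estimated_points: float, completion_pct: float) -> float:
    """EV = estimated points x completion %, with completion_pct given as 0-100."""
    return estimated_points * completion_pct / 100.0


def cost_variance(ev_points: float, hours_logged: float, hours_per_point: float = 1.0) -> float:
    """CV = EV - AC, with EV converted to hours. Negative means over budget."""
    return ev_points * hours_per_point - hours_logged


def needs_escalation(ev_points: float, hours_logged: float,
                     hours_per_point: float = 1.0, threshold: float = 0.20) -> bool:
    """True when cost variance exceeds the threshold as a share of earned value."""
    earned_hours = ev_points * hours_per_point
    if earned_hours == 0:
        return hours_logged > 0  # hours spent with nothing earned
    variance = cost_variance(ev_points, hours_logged, hours_per_point)
    return abs(variance) / earned_hours > threshold


# Worked example: an 8-point task at 50% completion with 5 hours logged.
ev = earned_value(8, 50)       # 4.0 points earned
cv = cost_variance(ev, 5.0)    # 4.0 - 5.0 = -1.0, one hour over
```

In the worked example the variance is 25% of earned value, so `needs_escalation(ev, 5.0)` is true and the report would be flagged for Jesse and Sofia.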

The architecture is straightforward: native Node.js modules, no heavy dependencies. CRM API auth via API key header. Claude API via direct HTTPS. Each agent has three run modes: HTTP server with cron (default for production), CLI one-shot, and CLI with arguments for testing.

We also built a complementary scheduled agent that operates as a first-class user inside the CRM. It has its own user account, API key, and role. It handles task types like reviews, reports, digests, and emails — picking up assigned tasks, gathering context, calling Claude, and writing structured output back. It can also create follow-up tasks automatically, with traceability lines injected into each description.

By early March, the shift was visible. Sofia had been doing project charters manually — “I used to do the project charters. Before I started, there were no project charters in projects. It was like a brief description and you had to kind of guess what you had to do. So I started putting together project charters, but I was doing them manually. It took me a long time.” I created an agent for it. That was the first time I ever sat down and documented a step-by-step process for turning a human workflow into an agent workflow.

Sofia was cautious about autonomy, and that was the right call: “I’m taking it step by step. I won’t give it full autonomy right now. I will test it for a few months. We’ll see how it goes.”

By mid-March, the agents were connected to the CRM and handling real work. I described the state of play in a team conversation: “Sofia is building a team of agents to help us with project task and capacity management. A lot of the things that Sofia spends time on and that Rachel was spending time on are things we’ve realized — okay, if we chain together AI agents, we can get most of that work done by the AI agents. And so we can spend our time on things that are more valuable, like actually communicating with customers, new products, all that kind of stuff.”

I also laid out the governance model in that same conversation: “The human responsibility at the moment should be that the team lead sends this email, AI reviews, creates tasks, associates to the project. The next person gets it in their queue tomorrow and is able to complete it. In that case, the human in the loop wasn’t even you, it was the team member. Bypassing a lot of that delegation in the first place.” The long-term vision: “Once we feel like the agent system prompt is working the way it’s supposed to, the team is working, the agent team is working the way it’s supposed to, then you look at it less and less and less, and all of that time that we spend on this starts to go away.”

Artifact: Build Spec

| Section | Detail |
| --- | --- |
| Solution name | PM Agent Team |
| Constraint addressed | $24K/yr subcontractor cost on rule-based coordination work + performance issues + billing delays from slow project delivery |
| Systems connected | CRM (EspoCRM-based) via API, time tracking system, Slack (notifications + DMs), code repositories, Claude API (Sonnet for Task Agent, Opus for Coordination Agent) |
| Tools / platforms | Node.js agent server (five agents), scheduled CRM agent (operates as first-class CRM user), n8n (Garbage Collector only) |
| Agent capabilities | Task generation + estimation, daily work assignment + conflict flagging, daily/weekly variance reporting, personalized reminders, data quality scanning |
| Access / permissions | Each agent has its own CRM user with scoped API key and role-based permissions |
| Guardrails | Agents run on fixed schedules (7:00-7:45am Mon-Fri), post all output to Slack for review, create review tasks for Sofia, excluded users list prevents agents from touching Sofia’s or Jesse’s work |
| Human Orchestrator | Sofia Reyes |
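
The excluded users guardrail is simple enough to sketch. This is an illustration with assumed names, not the agents' real code: `OpenTask`, `partitionByExclusion`, and the user IDs are hypothetical, and we assume the check is keyed on the task's assignee.

```typescript
// Hypothetical task shape; the real CRM record has many more fields.
interface OpenTask {
  id: string;
  assignee: string | null; // null means unassigned
}

// Splits tasks into those an agent may act on and those it must leave alone
// because they belong to someone on the excluded users list.
function partitionByExclusion(
  tasks: OpenTask[],
  excludedUsers: ReadonlySet<string>,
): { actionable: OpenTask[]; skipped: OpenTask[] } {
  const actionable: OpenTask[] = [];
  const skipped: OpenTask[] = [];
  for (const task of tasks) {
    if (task.assignee !== null && excludedUsers.has(task.assignee)) {
      skipped.push(task);
    } else {
      actionable.push(task);
    }
  }
  return { actionable, skipped };
}

// Sample data; the user IDs are invented for the example.
const excluded = new Set(["sofia", "jesse"]);
const result = partitionByExclusion(
  [
    { id: "T-1", assignee: "sofia" },
    { id: "T-2", assignee: "dev-a" },
    { id: "T-3", assignee: null },
  ],
  excluded,
);
console.log(result.actionable.map((t) => t.id)); // T-2 and T-3
console.log(result.skipped.map((t) => t.id));    // T-1
```

Running the filter before any estimation, assignment, or reminder logic means the exclusion holds no matter what the downstream step, or the model behind it, decides to do.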

Deliver: Shipping It

Rachel left right before the agent team went live. There was no parallel transition period — no window where Rachel and the agents ran side by side. Rachel was gone, and the agents picked up where she left off.

Only Sofia was trained on supervising the agent team. That’s because she built it. She shared the system with me and documented everything in our knowledge base. There was no broader training needed — the rest of the team interacted with the agents’ output (Slack messages, CRM tasks, daily reports) without needing to know or care that an agent generated them.

The team didn’t notice the difference. What they noticed was that task management got better and faster. Delegation happened automatically. Status updates arrived on time. The work that used to sit waiting for Rachel to get to it just happened.

We tracked the same metrics we’d always tracked: cost variance, time variance, schedule variance. The Reporting Agent automated what Rachel had been doing manually with scorecards — and did it with actual earned value management math instead of self-reported numbers.

The biggest early adjustment was the n8n-to-agent-server migration itself. We tried n8n first. It was brittle. The webhook architecture worked for simple triggers but couldn’t handle the chaining, context management, and error recovery that production agent work demands. The decision to rebuild as a dedicated agent server — each agent as a standalone Node.js service with its own cron, its own system prompt, its own CRM user — was the build-phase pivot that made everything else possible.

The cost comparison tells the story. Agent tooling runs about $500/month — roughly $200 for server hosting and $300 for Claude API tokens. That’s about $6K/year. Rachel’s role cost $24K/year. A 75% cost reduction, and the agents don’t call in sick or deteriorate over time.

Artifact: Delivery Metrics

| Metric | Before (with Rachel) | After (PM Agent Team) | Delta |
| --- | --- | --- | --- |
| Coordinator cost | $24K/year (subcontractor) | ~$6K/year (~$500/mo for server + Claude API) | 75% cost reduction |
| Report generation | Manual, self-reported, weekly | Automated daily + weekly, earned value math | From lagging self-reports to daily automated variance tracking |
| Task creation | Manual: Rachel read charters and transcripts | Automated: Task Agent generates from charters with SMART outcomes | From hours of manual work to minutes of agent processing |
| Work assignment | Manual: Rachel/Sofia assigned tasks by hand | Automated: Coordination Agent assigns daily based on skills/capacity | From ad-hoc delegation to systematic daily assignment |
| Team notification | Manual: Rachel/Sofia messaged individuals | Automated: Reminder Agent sends personalized Slack DMs | From inconsistent follow-up to daily personalized reminders |
| Data quality | No systematic review | Automated: Garbage Collector flags poor descriptions | From no QA to continuous automated scanning |
| Communication speed | Delayed: waited for Rachel to relay priorities | Immediate: agents post directly to Slack each morning | Team knows what’s past due and what’s priority before standup |

Compound: What the Next Sprint Inherits

The biggest learning from this Compound Sprint was that we could move even more work to agentic teams than we initially expected. When we started, we assumed things like work assignment and capacity management would stay human for a long time — they felt too nuanced for agents. Sofia proved that wrong. She figured out how to make those automatable sooner than either of us predicted. She’s now focused on exceptions rather than coordination — which is exactly where a Human Orchestrator should be spending her time.

The scope expanded after initial deployment, and it continues to expand as agents become more powerful and Sofia gets better at identifying new signals worth solving. Every week she finds another coordination task that follows a pattern the agents can learn. The boundary between “requires human judgment” and “agents can handle this” keeps moving — not because the agents are getting smarter (though they are), but because Sofia is getting better at decomposing the work.

Work assignment and capacity management were the surprise. We’d marked those as “stays human” in the original Work Deconstruction. Sofia had them automated within weeks. The Coordination Agent turned out to be good enough at matching skills to tasks and flagging capacity conflicts that Sofia could shift to reviewing its decisions rather than making them herself.

The PM Agent Team sprint was one of several sprints that contributed to our headcount reduction from 13 to 8. We lost the coordinator role entirely. Sofia got significant time back: the hours each week that had been consumed by coordination and reporting are now handled by agents. Communication flowed faster to the team: what was past due, what was priority, what needed attention today. That information used to pass through a human bottleneck. Now it flows directly.

Email triage emerged as the next coordination challenge: how to categorize and route incoming email so agents could handle more of the coordination automatically. That became the next Compound Sprint.

Artifact: Sprint Retrospective

| Category | Detail |
| --- | --- |
| What worked | The information factory framing: mapping inputs, transformations, and outputs for each piece of work made it clear which pieces were automatable. Sofia building the system herself meant the Human Orchestrator understood it deeply from day one. The Roles/Tasks deconstruction in Source prevented us from designing against a job title instead of the actual work. |
| What didn’t work | n8n as the initial platform. Too brittle for production agent work. The pivot to a dedicated agent server cost time but was necessary. |
| What changes for next sprint | Agent teams can absorb more coordination work than we assumed. Start future sprints with higher ambition for what goes to agents, and let the Design Gate pull it back if the judgment requirements are real. |
| New constraints surfaced | Email triage emerged as the next coordination challenge: categorizing and routing incoming email so agents could handle more coordination automatically. |
| Knowledge captured | Sofia documented the agent system in the knowledge base. Rachel’s organic knowledge (client relationship context, unwritten rules per account) was partially lost; this is the cost of not running Source before the person leaves. |
| HAC update | PM Agent Team row confirmed in Hybrid Accountability Chart. Sofia as Human Orchestrator, five agents handling task creation, coordination, reporting, reminders, and data quality. |