What a 6-month AI roadmap looks like.
Six stages, one quarter of build, one quarter of compounding, real numbers on cost and savings at each step. This is the actual shape of an Orbit-built roadmap, not a generic 'crawl-walk-run' framework.
Why a roadmap, not a project
A single 'AI project' rarely survives contact with a real organization. The team scopes a thing, builds it, and discovers in the first month of usage that the actual win is somewhere upstream or downstream. A roadmap forces sequencing instead of one-shot betting. The first stage is cheap and built for fast learning. Each later stage compounds on what has already shipped.
The six stages
Here is the actual shape we ship for a mid-stage SaaS company. Months are typical, not contractual.
Cost shape over time
Cost goes up before it goes down. Here is what we typically see at month 6 across the three Anthropic tiers, for a workload running ~1,000 production requests per day, each with ~2,000 cached input tokens and ~500 output tokens. Caching is on.
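As a rough sketch, the monthly token bill for the workload above can be estimated from the request volume and per-token prices. The prices below are placeholders, not Anthropic's published rates, and the names `TierPrice`, `monthly_token_cost`, and the tier labels are hypothetical. The sketch assumes every input token is billed at a cache-read rate and ignores cache writes and uncached input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPrice:
    """USD per million tokens. Placeholder values, not published prices."""
    cached_input_per_mtok: float
    output_per_mtok: float


def monthly_token_cost(
    price: TierPrice,
    requests_per_day: int = 1000,
    cached_input_tokens: int = 2000,
    output_tokens: int = 500,
    days: int = 30,
) -> float:
    """Estimate monthly spend in USD for one tier.

    Assumes all input tokens are cache reads; cache writes and
    uncached input are deliberately left out of this sketch.
    """
    per_request_usd = (
        cached_input_tokens * price.cached_input_per_mtok
        + output_tokens * price.output_per_mtok
    ) / 1_000_000
    return per_request_usd * requests_per_day * days


# Hypothetical prices for illustration only; substitute current rates.
PLACEHOLDER_PRICES = {
    "small": TierPrice(cached_input_per_mtok=0.1, output_per_mtok=4.0),
    "mid": TierPrice(cached_input_per_mtok=0.5, output_per_mtok=10.0),
    "large": TierPrice(cached_input_per_mtok=2.0, output_per_mtok=50.0),
}

if __name__ == "__main__":
    for tier, price in PLACEHOLDER_PRICES.items():
        print(f"{tier}: ${monthly_token_cost(price):,.2f} per month")
```

With the real price sheet substituted in, the same function gives a like-for-like comparison across tiers at a fixed workload.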
We default to prompt caching by month 3 and to Sonnet 4.6 for high-volume worker tasks. Opus 4.7 stays in production for the reasoning-heavy paths where quality compounds. Haiku 4.5 takes the classification and routing layer.
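The tier split described above could be expressed as a small routing table. This is a minimal sketch, not the production router: the task-kind names, the `Tier` labels, and the choice of Sonnet as the fallback are assumptions made for illustration, and the labels are not API model identifiers.

```python
from enum import Enum


class Tier(Enum):
    # Labels only; map these to real model identifiers in your own config.
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Hypothetical task kinds, following the split described in the text.
ROUTES = {
    "classification": Tier.HAIKU,
    "routing": Tier.HAIKU,
    "worker": Tier.SONNET,
    "reasoning": Tier.OPUS,
}


def pick_tier(task_kind: str, default: Tier = Tier.SONNET) -> Tier:
    """Return the tier for a task kind, falling back to the default."""
    return ROUTES.get(task_kind, default)
```

Keeping the mapping in one place means a tier change is a one-line edit rather than a search across services.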
Savings shape over time
The cost above is what you spend. The savings below are what you stop spending elsewhere. Mid-roadmap, this is mostly engineer time reclaimed: less manual data plumbing, fewer late-night dashboard rebuilds, fewer 'can someone figure out why this report is wrong' tickets. Plug in your team's numbers:
For most mid-stage SaaS teams, by month 6 the engineer-hour savings exceed the monthly token spend by an order of magnitude. The token bill is real, but it is rarely the lever that matters most.
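The comparison between reclaimed engineer time and token spend can be sketched as two small functions. The function names and parameters are hypothetical, and the inputs are for your own team's figures; nothing here is measured data.

```python
def monthly_savings_usd(
    hours_reclaimed_per_engineer_per_week: float,
    engineers: int,
    loaded_hourly_rate_usd: float,
    weeks_per_month: float = 52 / 12,
) -> float:
    """Engineer time reclaimed per month, valued at the loaded hourly rate."""
    return (
        hours_reclaimed_per_engineer_per_week
        * engineers
        * loaded_hourly_rate_usd
        * weeks_per_month
    )


def savings_to_spend_ratio(savings_usd: float, token_spend_usd: float) -> float:
    """How many times the savings cover the monthly token bill."""
    if token_spend_usd <= 0:
        raise ValueError("token_spend_usd must be positive")
    return savings_usd / token_spend_usd
```

A ratio near 10 corresponds to the order-of-magnitude gap described above; a ratio near 1 suggests the workflow is not yet reclaiming much time.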
Where teams typically slip
- Scoping the second pilot before the first stabilizes. Skipping eval rigor on pilot one to ship pilot two faster is the single most common regret. Slow down month 2 to speed up month 3.
- Skipping the gateway. Without a centralized gateway by month 4, every team builds their own auth, redaction, and logging. Month 5 turns into platform debt cleanup, not new workflow shipping.
- Underestimating the handoff. The internal team needs to own operations by month 6, not just consume them. We pair-build the runbook and sit second-chair for two weeks, not zero.
- Forgetting the eval upgrade path. When Anthropic ships the next model, you want a green check or a red X within an hour, not 'let us spend two days re-prompting.' That requires the eval suite to be a living thing, not a one-time validation.
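The 'green check or red X' from the last point can be sketched as a gate that runs the eval suite against a candidate model and compares its pass rate with the current baseline. This is a minimal sketch under stated assumptions: `EvalCase`, `pass_rate`, and `upgrade_gate` are hypothetical names, and the model is represented as any callable from prompt to text rather than a real client.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    passes: Callable[[str], bool]  # check applied to the model's output


def pass_rate(model: Callable[[str], str], cases: Sequence[EvalCase]) -> float:
    """Fraction of eval cases whose check accepts the model's output."""
    if not cases:
        raise ValueError("eval suite is empty")
    passed = sum(1 for case in cases if case.passes(model(case.prompt)))
    return passed / len(cases)


def upgrade_gate(
    candidate: Callable[[str], str],
    baseline_rate: float,
    cases: Sequence[EvalCase],
    tolerance: float = 0.0,
) -> bool:
    """True (green) if the candidate is within tolerance of the baseline."""
    return pass_rate(candidate, cases) >= baseline_rate - tolerance
```

In practice `candidate` would wrap a call to the new model and `cases` would be the suite built during the pilots, so the gate stays useful only as long as that suite keeps growing with the workflows it covers.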
Want to see cost forecasting, embedding cost, and latency in one place? Open the Cost Lab and the Performance Lab.