How to Cut Your Claude Code Costs in Half

Claude Code is the most powerful AI coding assistant on the planet. It's also one of the most expensive if you don't know what you're doing. We've analyzed thousands of Claude Code sessions and found that the average developer wastes 40-60% of their token budget on patterns that are easy to fix.

Here's exactly how to cut your costs in half without losing any productivity.

1. Use /fast for Mechanical Tasks

Claude Code's /fast mode runs the same Opus model with faster output. Fast mode is billed at its own rate, so check current pricing before you count on the savings. The catch: most developers never use it.

Here's the rule of thumb. If the task requires judgment or cross-file reasoning, stay on default Opus. If it's mechanical, use /fast.

Use /fast for:

• File reads and grep searches

• Git operations (status, diff, log, commit)

• Simple edits like renaming variables or fixing typos

• Formatting and linting fixes

• Running builds and tests

Stay on Opus for:

• Architecture decisions

• Complex debugging across multiple files

• Security-sensitive changes

• Plan mode and code review

We've measured this across real sessions. Teams that adopt /fast for 40% of their tasks save an average of $128/month per developer.
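Here's a minimal sketch of that rule of thumb as a lookup table. The task labels and the suggest_mode helper are hypothetical names made up for illustration; they are not part of Claude Code, and the categories simply mirror the two lists above.

```python
# Hypothetical helper: encodes the rule of thumb above as a lookup.
# The task labels are illustrative, not Claude Code identifiers.
MODE_BY_TASK = {
    # Mechanical work
    "file_read": "/fast",
    "grep": "/fast",
    "git": "/fast",
    "rename": "/fast",
    "typo_fix": "/fast",
    "format": "/fast",
    "lint": "/fast",
    "build": "/fast",
    "test_run": "/fast",
    # Judgment or cross-file reasoning
    "architecture": "default",
    "multi_file_debugging": "default",
    "security": "default",
    "plan": "default",
    "code_review": "default",
}


def suggest_mode(task_kind: str) -> str:
    """Return "/fast" for mechanical tasks, "default" for everything else."""
    # Unknown task kinds fall back to the default model, the safer choice.
    return MODE_BY_TASK.get(task_kind, "default")


if __name__ == "__main__":
    for kind in ("git", "security", "something_new"):
        print(kind, "->", suggest_mode(kind))
```

Anything the table doesn't recognize stays on the default model, which matches the advice to reserve /fast for work you already know is mechanical.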

2. Restart Sessions at Message 20

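Every message in a Claude Code session resends the entire conversation history as input. In a simplified model where each message adds X tokens of history, message 1 costs X tokens, message 20 costs 20X tokens, and message 40 costs 40X tokens. Prompt caching discounts the repeated portion, but you still pay for it on every turn.

The growth is brutal. Per-message cost rises linearly, so the running total rises with roughly the square of the session length: 20 messages total 210X, while 40 messages total 820X. We found sessions that exceeded 40 messages where the last few messages cost more than the first 20 combined.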
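To see how fast this adds up, here's a minimal sketch of the simplified model above, where message n costs n times X tokens. It assumes every message adds the same amount of history and ignores caching discounts, so treat the numbers as an upper-bound illustration rather than a bill.

```python
def message_cost(n: int, x: int = 1) -> int:
    """Tokens for message n when each message adds x tokens of history."""
    return n * x


def session_cost(messages: int, x: int = 1) -> int:
    """Total tokens for a session: x + 2x + ... + messages * x."""
    return sum(message_cost(n, x) for n in range(1, messages + 1))


first_20 = session_cost(20)                     # 210 units of X
one_long = session_cost(40)                     # 820 units of X
two_short = 2 * session_cost(20)                # 420 units of X
last_six = session_cost(40) - session_cost(34)  # messages 35..40: 225 units of X

if __name__ == "__main__":
    print("first 20 messages:", first_20)
    print("one 40-message session:", one_long)
    print("two 20-message sessions:", two_short)
    print("last 6 messages of a 40-message session:", last_six)
```

In this model, two 20-message sessions cost about half of one 40-message session, and the last six messages of the long session cost more than its first 20.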

The fix: start a new session when you hit roughly 20 messages. Claude Code's context compression helps, but a fresh session is usually cheaper than carrying a long history forward. Run the /compact command to summarize progress, then start fresh with only what the next task needs.

You can track your session lengths with the PromptReports CLI:

npx @promptreports/cli --sessions

This shows your average session length, cost per session, and identifies sessions that ran too long.

3. Trim Your CLAUDE.md

Your CLAUDE.md file is loaded into context on every single message. If it's 4,000 words, you pay for roughly 4,000 words' worth of tokens on every turn of every session.

Most CLAUDE.md files we've audited contain instructions that should be in Skills files instead. Skills only load when invoked. Instructions that apply to every session belong in CLAUDE.md. Instructions for specific tasks belong in Skills.

Before (4,200 words in CLAUDE.md):

• Project overview (keep)

• Tech stack summary (keep)

• Coding conventions (keep)

• 15 detailed workflow guides (move to Skills)

• 8 debugging procedures (move to Skills)

• Template boilerplate (move to Skills)

After (1,800 words in CLAUDE.md):

• Project overview

• Tech stack summary

• Coding conventions

Same quality. Same conventions. But the CLAUDE.md portion of every message costs 57% fewer tokens (2,400 of 4,200 words removed).

Estimated savings: $42/month per developer.
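Here's a minimal sketch of the arithmetic behind the 57% figure, plus a rough word counter you could point at your own file. The function names are hypothetical, word count is only a proxy for tokens, and the 20-message session length is borrowed from the restart advice earlier in this article.

```python
def count_words(text: str) -> int:
    """Rough size check: whitespace-separated words, a proxy for tokens."""
    return len(text.split())


def reduction(before_words: int, after_words: int) -> float:
    """Fraction of words removed by trimming."""
    return (before_words - after_words) / before_words


def words_loaded(words_in_file: int, messages: int) -> int:
    """Words of CLAUDE.md sent over a session if the file loads every message."""
    return words_in_file * messages


saved_fraction = reduction(4200, 1800)                               # about 0.571
saved_per_session = words_loaded(4200, 20) - words_loaded(1800, 20)  # 48000 words

if __name__ == "__main__":
    print(f"reduction: {saved_fraction:.0%}")
    print("words saved over a 20-message session:", saved_per_session)
    # To measure your own file (the path is an assumption):
    # from pathlib import Path
    # print(count_words(Path("CLAUDE.md").read_text(encoding="utf-8")))
```

Run the counter before and after you move content into Skills to confirm the file actually shrank.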

4. Use Subagents for Research

When Claude Code needs to explore your codebase, it can spawn subagents that work in isolated context windows. This keeps your main conversation context clean and prevents the token snowball effect.

Instead of asking Claude Code to "find all the places where we handle authentication" in your main session (which adds hundreds of lines to your context), tell it to use a subagent:

"Use a subagent to find all authentication handling in this codebase and report back a summary."

The subagent does the heavy lifting in its own context window. Only the summary comes back to your main session.
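Here's a minimal sketch of why that matters, using the same simplified model as the session-length section: anything added to the main context gets resent on every later message. The token counts below are hypothetical placeholders, not measurements.

```python
def carried_cost(tokens_added: int, remaining_messages: int) -> int:
    """Tokens resent over the rest of the session for content added now."""
    return tokens_added * remaining_messages


# Hypothetical numbers: raw search output vs. a short summary,
# with 10 messages still to come in the main session.
RAW_EXPLORATION_TOKENS = 5_000
SUMMARY_TOKENS = 300
REMAINING_MESSAGES = 10

inline = carried_cost(RAW_EXPLORATION_TOKENS, REMAINING_MESSAGES)
via_subagent = carried_cost(SUMMARY_TOKENS, REMAINING_MESSAGES)

if __name__ == "__main__":
    print("carried in main session (inline search):", inline)
    print("carried in main session (subagent summary):", via_subagent)
    print("difference:", inline - via_subagent)
```

The subagent's own exploration is still billed once in its own context. The saving comes from not dragging the raw results through every remaining turn of the main session.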

5. Set Up the PromptReports CLI

The fastest way to see where your Claude Code budget goes:

npx @promptreports/cli

This scans your local Claude Code sessions and gives you a complete breakdown:

• Cost per session, per day, per week

• Token efficiency scores (how much of your spend was on cache hits vs. new tokens)

• Session length distribution

• Cost per git commit (which features were expensive to build)

• Specific quick wins with dollar amounts

No account needed for the basic scan. Create a free account at promptreports.ai to push your data to the dashboard and get AI-powered optimization recommendations.

What's Next

These five changes typically save $200-300/month per developer. That's before you start optimizing your other AI providers, infrastructure costs, and model selection.

The PromptReports Ops Intelligence Dashboard tracks all of this automatically. One terminal command. Every service. Every cost. Every optimization.

npx @promptreports/cli --all --push

Start free at promptreports.ai.