The fastest way to get mediocre output from an AI coding tool is to treat it like a chat box: open a blank prompt, describe a feature, and let the model re-read your whole codebase to figure out what you mean. I stopped doing that a while ago.
Now I run Claude Code as a configured system — a skill, a set of rules, and a subagent scoped to each feature, with two small index files tying it together. Here's the setup, and why I dropped the other tools around it.
The unit of work is the feature
Most people configure their AI assistant once, at the project level, and leave it there. I go one level down. Every feature gets its own skill (a SKILL.md), its own rules, and its own subagent.
The reason is boringly practical: tokens. When the work is scoped to a feature, Claude already knows what it's touching and why. It doesn't burn thousands of tokens re-exploring the codebase to rediscover something I could have just told it once. Less re-exploration, more building — and the behavior stays consistent every time I come back to that feature months later.
Skill, rules, subagent — who does what
I keep the three responsibilities separate on purpose:
- The skill carries the how — the conventions, patterns, and gotchas specific to that feature.
- The rules are the guardrails — what to never do, what to always check before finishing.
- The subagent does the work in its own context window, which keeps the main thread clean and focused.
Some subagents are feature-scoped. Others are role-based — a dedicated code-reviewer that only ever reviews, for example. Either way, the isolated context is the point: a subagent can dig deep without dragging all of that detail back into my main session.
CLAUDE.md and AGENTS.md are the index
None of that helps if the tool can't find it. So I keep two deliberately thin files at the root:
CLAUDE.mdholds a summary of the stack and points to the skills.AGENTS.mdpoints to the subagents.
They're a map, not a dumping ground. The real detail lives in the per-feature files; these two just tell the tool where to look. Keeping them small is part of the token discipline — a bloated context file is a tax you pay on every single request.
I prompt in XML — everywhere
Even with all that structure, the prompt still decides the outcome. I write prompts with XML-style tags: clear sections for context, the task, the constraints, and the output I expect. It removes any ambiguity about what's instruction versus content, and it's the single habit that's improved my results the most. I keep it even when I'm not in Claude Code.
Why I dropped Copilot
Inline autocomplete was genuinely useful when the AI's job was to finish my current line. But once the job became "own a feature end to end inside a configured system," Copilot was solving a smaller problem than the one I had. I turned it off and haven't missed it.
Codex is the overflow valve
Claude Code has usage limits; deadlines don't care. When I hit a wall, I move to Codex to keep moving — same discipline, same XML-structured prompts. It isn't a replacement, it's the spare tank: there for the moment I need it, not the center of the workflow.
The mental shift
The change that mattered wasn't a new tool — it was a new frame. Stop treating the AI like a chat box and start treating it like a system you configure. Scope the work to a feature, write down the rules, point the tool at them with a thin index, and prompt with structure. You get more consistent output, fewer wrong turns, and — if you care about cost — far less token waste.
It's the same discipline I bring to client work: backend systems built to last and automation that survives production, not demos that fall over the first time reality shows up.
Building something and want that kind of rigor on it? Start a project brief and tell me what you have in mind.