Claude Skills, Connectors, and Not Reaching for the Biggest Model Every Time

My actual Claude Code setup - a plugin that talks like a caveman, subagents on cheap models, and the day it read my design file

Every post about AI coding tools says the same three things: write better prompts, give it context, review the output. Sure. This isn't that post. I've been running Claude Code as my daily driver at work for a few months now - fintech backend, real deadlines, a token budget I actually feel - and the things that moved the needle for me are things I rarely see anyone write about. So here's my real setup, boring details included.

Skills: the instruction you never type again

Somewhere in week two I noticed I was typing the same commit-message instructions for the third time. Same with PR reviews - every session started with me re-explaining what our team actually checks for. A skill is just that instruction saved as a markdown file and invoked by name. Nothing clever about it, which took me embarrassingly long to accept. Here's one I use daily:

---
name: commit-message
description: Write a Conventional Commits message from the staged diff, no fluff
---

Look at `git diff --staged`. Summarize the change in one sentence,
leading with why, not what. Subject line under 50 chars. Only add a
body if the reasoning isn't obvious from the diff itself.

My rule now: the second time I catch myself typing the same instruction, it becomes a skill. Not the fifth time. The second.

The plugin that talks like a caveman

This one sounds like a joke and saved me the most money. I hit my usage limit two days into a week once, looked at where the tokens went, and a huge slice was the model being polite at me. "Sure! I'd be happy to help you with that. The issue you're experiencing is likely caused by..." - none of that is information. So now I run a compression plugin that forces a stripped-down output style. Same technical content, exact error messages, code untouched - just no filler.

text

Before: "Sure! I'd be happy to help. The issue you're
         experiencing is likely caused by the token expiry
         check in your auth middleware..."

After:  "Bug in auth middleware. Token expiry check uses <
         not <=. Fix:"

It reads ridiculous for the first hour. Then you stop noticing, because you were always scanning for the technical content anyway - the fluff was for nobody. The plugin ships a stats command that reads actual session logs, and on chatty debugging sessions the savings are large enough that I'd rather not admit how much I was paying for pleasantries before.

Small models for small jobs

The habit I had to unlearn: opening everything with the biggest model available, out of a vague feeling that more capability is always safer. It's not - it's slower and pricier for exactly the tasks that don't need it. A rename doesn't need Fable-tier reasoning. It needs to be done in four seconds.

The piece that made this practical for me is subagents. My setup has a read-only "investigator" agent that only answers "where is X defined, what calls Y" - it returns a file:line table and refuses to do anything else. It runs on the cheapest tier, burns its own context window instead of mine, and hands back a compressed summary. A "builder" agent does bounded one-or-two-file edits, same idea. The big model is reserved for the work that's actually ambiguous.

json

{
  "agents": {
    "investigator": { "model": "haiku",  "tools": ["Read", "Grep", "Glob"] },
    "builder":      { "model": "sonnet", "tools": ["Read", "Edit", "Write"] },
    "architect":    { "model": "opus" }
  }
}

When I can't tell which tier a task belongs to, I've started treating that as a smell that the task itself is underspecified - not as a reason to grab the big model and hope. If I want a second opinion, this is the prompt I run the task description through:

text

Classify this task into exactly one tier:

SMALL  - single-file mechanical edit: rename, typo, formatting
MEDIUM - normal feature work, a bug fix, a routine PR review
LARGE  - architecture decision, ambiguous requirements, or a task
         where being wrong is expensive to undo

Task: "{{task_description}}"

Respond with only the tier name.

Connectors: the day it read my design file

I redesigned this blog recently. Old me would have made mockups, screenshotted them, pasted the screenshots into a chat, and typed paragraphs describing spacing and hex codes. Instead: I did the design in Claude on the web, and Claude Code pulled the actual design files over an MCP connector, read the markup, and implemented the theme in my Next.js repo. Then - this is the part that got me - it opened the pages in my actual browser through the Chrome connector, took screenshots, compared them against the design, and fixed what was off. I reviewed diffs and screenshots. That's it.

The lesson generalizes past design files. Every connector removes one place where I was the manual bridge between where data lives and what I'm asking the model to do with it. Gmail, Drive, an MCP server for your internal tools - same shape. The model didn't get smarter. It stopped needing me as a human clipboard.

If you try one thing this week

Turn your most-retyped instruction into a skill. Takes ten minutes, pays back the same day.
Check where your tokens actually go before paying for a bigger plan. Mine went to pleasantries and verbose tool output, not reasoning.
Move one mechanical, bounded job to a small model and see if you can tell the difference. I couldn't.
Wire one connector to a place your data already lives, and stop screenshotting things at the model.

None of this needed the model to get better. It needed me to treat the tooling around the model as part of my job - the same way I'd treat a build pipeline or a linter. That shift, more than any release announcement, is what changed how much I get out of it.