# speed-run

Token-efficient code generation pipeline: parallel implementation with a hosted LLM (Cerebras) for ~60% token savings. Includes an MCP server.
Token-efficient code generation pipeline. Uses hosted LLM (Cerebras) for fast, cheap first-pass generation with Claude handling architecture and surgical fixes.
| Skill | Description | Best For |
|---|---|---|
| speed-run:turbo | Direct hosted codegen | Single task, algorithmic code, boilerplate |
| speed-run:showdown | Same design, parallel runners compete | Medium-high complexity, want best implementation |
| speed-run:any-percent | Different approaches explored in parallel | Unsure of architecture, want to compare designs |
/plugin install speed-run@2389-research
Speed-run requires a Cerebras API key for hosted code generation. Free tier includes ~1M tokens/day.
Add the key to ~/.claude/settings.json:

```json
{
  "env": {
    "CEREBRAS_API_KEY": "your-key-here"
  }
}
```
```
User: "speed-run" / "turbo build" / "fast build"
          ↓
Check: Cerebras API key
          ↓
┌─────────────────────────────────────────┐
│             Route Selection             │
│                                         │
│  1. Turbo    - Direct codegen           │
│  2. Showdown - Parallel competition     │
│  3. Any%     - Parallel exploration     │
└─────────────────────────────────────────┘
```
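The routing above can be sketched as a simple dispatch. This is a hypothetical illustration (the plugin's real routing is prompt-driven inside its skills, and the parameter names here are assumptions):

```python
def select_route(task: str, num_designs: int = 1, compete: bool = False) -> str:
    """Pick a speed-run mode. Hypothetical sketch; the actual plugin
    decides this from the user's request, not from function arguments."""
    if num_designs > 1:
        return "any-percent"  # explore different architectures in parallel
    if compete:
        return "showdown"     # same design, parallel runners compete
    return "turbo"            # single task: direct hosted codegen
```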
User: "Use speed-run to build a rate limiter"
Claude writes a contract prompt:
- DATA CONTRACT (exact models, types)
- API CONTRACT (exact routes, responses)
- ALGORITHM (step-by-step logic)
- RULES (framework, storage, error handling)
Cerebras generates code → written to disk (~0.5s)
Claude runs tests → surgical fixes if needed (1-4 lines)
The contract prompt pattern is like speccing a ticket for a junior dev — explicit inputs, outputs, types, and behavior. That specificity is what makes hosted LLMs reliable at 80-95% first-pass accuracy.
User: "Use showdown for the auth system"
Claude assesses complexity → spawns 3 runners
Each runner:
1. Reads the shared design doc
2. Creates its OWN implementation plan
3. Generates code via Cerebras
4. Runs tests, fixes failures
All runners dispatched in parallel.
Fresh-eyes review → judge scores all → winner selected.
Key insight: each runner creates its own plan from the design doc. No shared implementation plan means genuine variation emerges naturally.
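The showdown flow can be sketched with stand-in runners and a judge. In the real plugin the runners are parallel Claude subagents calling Cerebras; the names and signatures here are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def run_showdown(design_doc: str, runners, judge):
    """Dispatch all runners in parallel on the same design doc; each plans
    and implements independently, then a judge scores every implementation
    and the highest-scoring one wins."""
    with ThreadPoolExecutor(max_workers=len(runners)) as pool:
        impls = list(pool.map(lambda run: run(design_doc), runners))
    scored = [(judge(impl), impl) for impl in impls]
    return max(scored, key=lambda pair: pair[0])[1]  # winning implementation
```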
User: "Not sure whether to use SQLite or Postgres, try both"
Claude generates 2-3 architectural approaches
Each variant:
1. Gets its own worktree and branch
2. Creates implementation plan for its approach
3. Generates code via Cerebras
4. Runs tests
Same scenario tests run against all variants.
Fresh-eyes review → judge scores all → winner selected.
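Running the same scenario tests against every variant might look like the harness below. Variant names and checks are hypothetical; in the real flow each variant lives in its own git worktree:

```python
def score_variants(variants: dict, scenarios: list) -> dict:
    """Run every scenario test against every variant; a variant's score
    is the number of scenarios it passes."""
    scores = {}
    for name, impl in variants.items():
        scores[name] = sum(1 for scenario in scenarios if scenario(impl))
    return scores
```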
| Scenario | Speed-run? |
|---|---|
| Algorithmic code, data transforms | Yes, turbo |
| Boilerplate, scaffolding | Yes, turbo |
| Comparing multiple implementations | Yes, showdown |
| Exploring different architectures | Yes, any-percent |
| Complex business logic that needs reasoning | No, use Claude directly |
| One-liner fixes | No, overkill |
Speed-run mirrors test-kitchen's parallel patterns but shifts code generation to a hosted LLM:
| | Test Kitchen | Speed-Run |
|---|---|---|
| Code generation | Claude writes everything | Cerebras generates, Claude fixes |
| Token cost | Standard | ~60-70% savings |
| Generation speed | ~10s per file | ~0.5s per file |
| First-pass quality | ~100% | 80-95% |
| External dependency | None | Cerebras API key |
| Model | Speed | Notes |
|---|---|---|
| gpt-oss-120b | ~3000 t/s | Default: best value, clean output |
| llama-3.3-70b | ~2100 t/s | Reliable fallback |
| qwen-3-32b | ~2600 t/s | Emits verbose `<think>` tags |
| llama3.1-8b | ~2200 t/s | Cheapest, may need more fixes |
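Models that emit reasoning tags need a post-processing pass before their output is written to disk. A minimal sketch for stripping qwen-style `<think>` blocks (an assumption about how the plugin might clean this output, not its documented behavior):

```python
import re

# Match a <think>...</think> block plus any trailing whitespace.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_think(raw: str) -> str:
    """Remove <think>...</think> reasoning blocks from model output."""
    return THINK_RE.sub("", raw)
```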
Speed-run orchestrates these skills (uses fallbacks if not installed):
- superpowers:dispatching-parallel-agents
- superpowers:using-git-worktrees
- superpowers:writing-plans
- superpowers:executing-plans
- superpowers:test-driven-development
- superpowers:verification-before-completion
- fresh-eyes-review:skills
- scenario-testing:skills
- superpowers:finishing-a-development-branch

Speed-run was born from test-kitchen's token cost problem: running 3-5 parallel Claude agents generates a lot of expensive output tokens. By shifting first-pass code generation to Cerebras (~3000 tokens/second), we keep the same parallel exploration patterns at a fraction of the cost, while Claude focuses on what it's best at: architecture, orchestration, and surgical fixes.
Get started in seconds
/plugin marketplace add 2389-research/claude-plugins
/plugin install speed-run
Skills auto-trigger when relevant