Multi-model orchestration in practice: a full walkthrough

Every model has a ceiling. Gemini will map a problem space thoroughly but hedge when you ask it to decide. Claude Sonnet makes sharp decisions and writes clean plans but gives you a narrower picture when you ask it to research cold. Gemini Flash implements quickly and doesn’t push back on a spec. Haiku verifies accurately in a fresh context for a fraction of the cost.

The cast

Model	Role	Strength
Gemini 3	Research	Web search, broad option mapping, genuine tradeoffs
Claude Sonnet	Decisions & planning	Structure, requirements, architectural judgment
Gemini 2.5 Flash / GPT-OSS-120B / Haiku	Bulk implementation	Cheap generation from a spec — pick based on available credits and tooling
Claude Haiku	Verification	Cheap, bounded tasks, fresh context is an advantage

The workflow

Gemini 3 (research options)
  → Claude Sonnet (decide on approach + write requirements)
    → Gemini 3 (write implementation plan)
      → Claude Sonnet (validate + finalize plan)
        → cheap subagents (implement — Gemini Flash, Haiku, or Aider)
          → Claude Haiku (verify against plan)
            → Claude Sonnet (fix issues + sign off)
→ loop
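
One way to keep the hand-offs straight is to write the chain down as data. The sketch below is only an illustration of the diagram above: the `Stage` type and the `stages_from` helper are hypothetical names, not part of any tool mentioned in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    step: int   # step number used in this walkthrough
    model: str  # which model (or class of model) runs the stage
    task: str   # what the stage is asked to do


# The seven hand-offs from the diagram, in order.
WORKFLOW = [
    Stage(1, "Gemini 3", "research options"),
    Stage(2, "Claude Sonnet", "decide on approach + write requirements"),
    Stage(3, "Gemini 3", "write implementation plan"),
    Stage(4, "Claude Sonnet", "validate + finalize plan"),
    Stage(5, "cheap subagents", "implement"),
    Stage(6, "Claude Haiku", "verify against plan"),
    Stage(7, "Claude Sonnet", "fix issues + sign off"),
]


def stages_from(first_step: int) -> list[Stage]:
    """Return the stages to run, starting at the given step number."""
    return [s for s in WORKFLOW if s.step >= first_step]


if __name__ == "__main__":
    for stage in stages_from(1):
        print(f"Step {stage.step}: {stage.model} -> {stage.task}")
```

A later iteration that reuses the existing research and requirements would call `stages_from(3)` instead of starting at step 1.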

Step 1: Gemini researches the space

Start in Gemini. Give it the idea and ask it to map the options — not design the solution.

“I want to build [rough description]. Research available libraries/approaches for [specific technical decisions]. For each, document realistic options with tradeoffs. Don’t make recommendations yet, just map the space. Output all research as markdown files in a docs/ folder.”

The “don’t make recommendations” framing matters. The moment you ask Gemini to decide, you get hedged, noncommittal output. You want the map, not its conclusion.

Step 2: Claude Sonnet decides and writes a requirements document

Bring Gemini’s research into Claude Code. Ask Sonnet to make the technology decisions and produce a requirements document — not an implementation plan yet, just a clear definition of what the software needs to do.

“Review Gemini’s project research in docs/. Based on this, decide on the technology approach, then write a requirements document covering: what the software does, user requirements, and acceptance criteria for each feature. Focus on the what, not the how. Save it to docs/REQUIREMENTS.md.”

This is where Claude earns its place. Hand it a well-researched map and it navigates decisively. The requirements document becomes the contract that everything downstream gets measured against.

Step 3: Gemini writes the implementation plan

Give the requirements document back to Gemini and ask it to write an implementation plan.

“Review the requirements document in docs/REQUIREMENTS.md. Write a high-level implementation plan — phases, components, and what gets built in what order to satisfy these requirements. Save the plan to docs/PLAN.md.”

This is a good fit for Gemini — it’s a generation task with clear constraints. The technology decisions are already made, the requirements are defined, and Gemini’s job is to map out how to build it.

Step 4: Claude Sonnet validates and finalizes the plan

Send the plan back to Claude Sonnet to pressure-test it.

“Review the requirements document in docs/REQUIREMENTS.md and the implementation plan in docs/PLAN.md. Validate the plan against the requirements, fill in missing detail, and flag anything that would cause problems mid-implementation. Update docs/PLAN.md with the finalized version.”

It’s a short pass, but it catches problems that would be painful to discover mid-implementation. Sonnet works through the plan, tightens the detail, and updates docs/PLAN.md in place. This becomes the finalized spec: everything the implementation subagents write and everything Haiku verifies gets measured against it.

Step 5: Implement with cheap subagents

This is the most flexible step — the goal is to get the bulk of the code written as cheaply as possible, with Claude Sonnet orchestrating. There are a few ways to do it depending on what you have available:

Option A: Gemini Flash subagents — Ask Gemini Flash to implement the plan using its own subagents. Its large context window means you can include the full plan plus relevant existing code. Good if you have Gemini credits to spare.

“Review the finalized implementation plan in docs/PLAN.md and implement it. Use subagents where tasks can be parallelized.”

Option B: Haiku subagents via Claude Code. Ask Sonnet to dispatch parallel Haiku subagents directly inside Claude Code. This keeps everything in one tool, and Haiku at $1 input / $5 output per million tokens is significantly cheaper than Sonnet for generation tasks.

“Spawn parallel Haiku subagents to implement the plan in docs/PLAN.md. Assign each subagent a discrete task. Each should read the relevant files, implement its task, and report back.”

Option C: Aider via bash tool calls — Ask Sonnet to orchestrate Aider instances by invoking them through bash tool calls. Each call launches an Aider session pointed at a specific task from the plan. I use GPT-OSS-120B via OpenRouter here ($0.039/M input, $0.19/M output) — it’s the cheapest option for bulk generation, and it keeps implementation entirely outside my Claude session budget. I save my daily Gemini credits for the planning steps where web search actually matters.

“Use bash tool calls to launch parallel Aider instances to implement the plan in docs/PLAN.md. Assign each instance a discrete task. Use GPT-OSS-120B via OpenRouter as the model.”
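
As a rough sketch of what Sonnet's bash calls might look like, the snippet below builds one Aider command per task. It assumes Aider's `--model`, `--message`, `--read`, and `--yes-always` flags and an OpenRouter model id of `openrouter/openai/gpt-oss-120b`; check both against your installed Aider version and your OpenRouter account. The task descriptions and file names are hypothetical.

```python
import shlex

# Assumed model id; confirm the exact name in your OpenRouter account.
MODEL = "openrouter/openai/gpt-oss-120b"

# Hypothetical tasks: (instruction taken from the plan, files the task may edit).
TASKS = [
    ("Implement Phase 1 of docs/PLAN.md: config loading", ["src/config.py"]),
    ("Implement Phase 2 of docs/PLAN.md: CLI entry point", ["src/cli.py"]),
]


def aider_command(message: str, files: list[str]) -> list[str]:
    """Build an argv list for one non-interactive Aider run."""
    return [
        "aider",
        "--model", MODEL,
        "--read", "docs/PLAN.md",  # the plan is context only, not editable
        "--yes-always",            # do not stop for confirmations
        "--message", message,      # run one instruction, then exit
        *files,
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch is safe to try.
    for message, files in TASKS:
        print(shlex.join(aider_command(message, files)))
```

Giving each instance its own set of files is one way to reduce the chance that parallel runs edit the same file at once.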

All three options converge on the same thing: a cheap model following a detailed spec, with Sonnet not writing a line of implementation code.
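
To see how the per-token prices quoted above play out, here is a small worked calculation. The token counts are made up for illustration; only the prices come from this post.

```python
# Prices in USD per million tokens, as quoted in this post: (input, output).
PRICES = {
    "Claude Haiku": (1.00, 5.00),
    "GPT-OSS-120B via OpenRouter": (0.039, 0.19),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one job: tokens / 1M, times the per-million price."""
    price_in, price_out = PRICES[model]
    return input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out


if __name__ == "__main__":
    # Hypothetical implementation pass: 2M tokens read, 500k tokens written.
    for model in PRICES:
        print(f"{model}: ${cost_usd(model, 2_000_000, 500_000):.3f}")
```

With those made-up volumes, the pass comes to about $4.50 on Haiku and about $0.17 on GPT-OSS-120B.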

Step 6: Haiku verifies

Back in Claude Code, ask Sonnet to spawn parallel Haiku subagents to verify the implementation.

“Spawn parallel Haiku subagents as needed to verify the implementation against the finalized plan in docs/PLAN.md. Check that everything in the plan was implemented correctly and flag any gaps or deviations.”

Haiku reads the files cold, checks against the plan, and reports back. The benefit isn’t just cost: Haiku starts fresh, reading what’s actually in the code rather than what a long session might have led you to assume.
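
Verification parallelizes most easily when each subagent gets one slice of the plan. Here is a minimal sketch, assuming docs/PLAN.md uses `## ` headings for its phases (this post does not prescribe a plan format) and using hypothetical helper names:

```python
import re


def split_plan(plan_text: str) -> dict[str, str]:
    """Split a markdown plan into {heading: body}, one entry per '## ' section."""
    sections: dict[str, str] = {}
    # Split at each line that starts with '## ', keeping the heading text.
    parts = re.split(r"^## +(.+)$", plan_text, flags=re.MULTILINE)
    # parts = [preamble, heading1, body1, heading2, body2, ...]
    for heading, body in zip(parts[1::2], parts[2::2]):
        sections[heading.strip()] = body.strip()
    return sections


def verification_prompt(heading: str, body: str) -> str:
    """Build a bounded prompt for one verifier subagent."""
    return (
        f"Verify the implementation against this section of docs/PLAN.md: "
        f"'{heading}'.\n\n{body}\n\n"
        "Read the relevant files, then list any gaps or deviations."
    )


if __name__ == "__main__":
    sample = "# Plan\n\n## Phase 1\nLoad config.\n\n## Phase 2\nAdd CLI.\n"
    for heading, body in split_plan(sample).items():
        print(verification_prompt(heading, body))
        print("---")
```

Each prompt covers a single section, which keeps every verifier's task small and bounded.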

Step 7: Sonnet signs off

“Haiku found [issue] when verifying against docs/PLAN.md. Fix it, then do a final check — anything architecturally concerning before we call this done?”

Sonnet fixes what Haiku flagged, runs a final check, and signs off. Keep this step short — its job is confirmation, not discovery.

Iterating

The workflow above gets you from idea to working prototype. For subsequent iterations you don’t need to start from scratch — the research and requirements are already done.

Pick up from Step 3: ask Gemini to write a plan for the new feature or fix, then run the rest of the loop from there. Sonnet validates, the cheap subagents implement, Haiku verifies. The requirements document stays as the north star; you’re extending docs/PLAN.md, not rewriting it.

Each iteration is its own loop. The plan gets more detailed over time and the implementation gets more predictable.

Why it works

Each model is doing the work it’s actually good at, and none of them is doing work it’s bad at. Gemini never has to make a hard call. Claude never has to cold-research a library landscape. The implementation models never have to make architectural decisions. Haiku never has to hold a full session’s worth of context.

The two companion posts in this series go deeper on specific pieces: why Gemini and Claude think differently at the research stage, and how the token cost structure makes this workflow significantly cheaper than running everything through Sonnet.

I’ve implemented this workflow as a Claude Code skill — view it on GitHub.
