/cs:cross-eval — Multi-Model Consensus

Command: /cs:cross-eval <memo-or-brief>

Runs the same memo through multiple model providers and reconciles divergences. Use for high-stakes, irreversible decisions where single-model bias is too costly: M&A, major fundraises, layoffs, strategic pivots, regulatory commitments.

Adapted from gstack's /codex cross-review pattern, generalized to business memos instead of code PRs.

When to Run

Before signing a term sheet
Before announcing a layoff
Before committing to a regulated market
Before any decision where reversing costs > 6 months of company time
When the boardroom vote was split or had a CRITICAL dissent

Models Used (graceful degradation)

The command tries to invoke each available model in order:

Claude (primary, always available) — the boardroom's native voice
Codex / OpenAI (if OPENAI_API_KEY or codex CLI available)
Gemini (if GEMINI_API_KEY or gemini CLI available)

If only Claude is available, the command runs Claude-only with adversarial mode — same model, different prompt seeds — and clearly labels the output as single-model.

Workflow

Read the memo / brief
Probe environment for available model CLIs / API keys
For each available model:
- Send the memo with this prompt prefix:
  
  "You are an independent C-suite reviewer. The following is a board memo from another company's boardroom. Identify the top 3 concerns, the top 3 supports, and your vote (APPROVE / REJECT / DEFER). Do not deferentially agree — assume the memo's reasoning is flawed until proven otherwise."
Collect three independent reviews
Reconcile: where do they agree? Where do they diverge?
Surface the divergences as questions for the founder

Output Format

Saved to ~/.claude/cross-eval/YYYY-MM-DD-<slug>.md:

# Cross-Eval: <memo title>
**Date:** YYYY-MM-DD
**Memo reviewed:** <link>
**Models invoked:** Claude / Codex / Gemini (or noted fallbacks)

## Vote Tally
| Model | Vote | Confidence |
|---|---|---|
| Claude | APPROVE | High |
| Codex | DEFER | Med |
| Gemini | APPROVE | Low |

## Consensus Concerns (≥2 models flagged)
1. <concern> — flagged by Claude + Codex
2. <concern> — flagged by all 3

## Divergent Concerns (1 model flagged)
- <Codex only:> <concern> — worth a second look
- <Gemini only:> <concern> — likely noise, but check

## Consensus Supports (≥2 models endorsed)
1. <support>
2. <support>

## Recommendation
- 🟢 GO if 2+ models APPROVE and no CRITICAL concerns from any model
- 🟡 PAUSE if any model is DEFER or any concern is CRITICAL
- 🔴 STOP if 2+ models REJECT

## Open Questions for Founder
1. <question raised by divergence>
2. <question raised by divergence>

Why This Matters

Single-model recommendations have systematic biases. Claude trends helpful and may under-weight risk. Codex (OpenAI) trends more cautious on emerging-market and regulatory topics. Gemini trends more cautious on technical scale claims. Disagreement is signal, not noise.

This is the safety net before irreversibility — not a replacement for outside counsel or a real board.

Graceful Degradation

If only Claude is available:

**Models available:** Claude only
**Mode:** ADVERSARIAL — running 3 independent Claude passes with different system prompts:
  1. Standard reviewer
  2. Devil's advocate (must find 3 critical concerns)
  3. Steelman (must find 3 strongest reasons to approve)

This is weaker than true multi-model. Treat the result as suggestive, not conclusive.

Routing

/cs:decide — if consensus is GO
/cs:freeze — if consensus is PAUSE
/cs:boardroom (re-run) — if consensus is STOP

Skills: board-meeting, executive-mentor
Inspiration: gstack's /codex cross-review pattern (adapted to business memos)

Version: 1.0.0

Cross Eval

/cs:cross-eval — Multi-Model Consensus

When to Run

Models Used (graceful degradation)

Workflow

Output Format

Why This Matters

Graceful Degradation

Routing

Related

Bundled with this artifact

More on the bench

Bash Pro

Security Compliance Compliance Check

Bats Testing Patterns