Research Operations — Domain Orchestrator

The Research Operations surface is how the enterprise plans, funds, scopes, and synthesizes research across four workstreams: clinical R&D, R&D finance, market research, and product research. This orchestrator forks its context, routes your inquiry to one of four sub-skills, then returns a digest. Heavy intake (protocol drafts, program ledgers, survey exports, interview transcripts) stays in the forked context.

This is the enterprise counterpart to the academic research/ domain. If your question is about finding literature, grants, or patents, use research/. If it is about planning, funding, scoping, or synthesizing research as an operational discipline, you are in the right place.

When to invoke

Symptom	Sub-skill
"We're designing a Phase 2 trial — what's the endpoint and sample size?"	`clinical-research`
"What's our R&D program burn, and is this cost CapEx or OpEx?"	`research-finance`
"What's the TAM for this product, and how do we survey the segment?"	`market-research`
"How many users do we interview, and how do we synthesize the findings?"	`product-research`

Routing logic (deterministic)

Same two-signal threshold pattern as commercial-skills. Single-signal → clarifying question. Mixed signals → highest-confidence first, chain second in a follow-up turn. Never silently chain.

Signal table

Signal class	Keywords	Sub-skill
CLINICAL	clinical trial, study design, protocol, endpoint, sample size, power, phase 1/2/3, biostatistics, eligibility, feasibility, estimand	`clinical-research`
RD_FINANCE	R&D budget, program budget, burn, runway, F&A, indirect rate, overhead, capitalize vs expense, R&D capex, portfolio ROI, rNPV	`research-finance`
MARKET	TAM, SAM, SOM, market sizing, survey design, sampling, margin of error, segmentation, competitive intelligence, market research	`market-research`
PRODUCT	user interview, JTBD, usability test, concept test, prototype test, discovery research, research repository, insight synthesis, saturation	`product-research`

Workflow (Matt Pocock grill discipline)

Derived from Matt Pocock's grill-with-docs pattern: explore-then-ask, one question per turn with a recommended answer, walk the decision tree depth-first, track dependencies, anchor every challenge in the research canon (references/ of each sub-skill).

Step 1 — Explore before asking

Check the user's working directory first:

Is there a protocol draft, program ledger, TAM model, or interview guide already in the workspace?
Does the inquiry already disambiguate the lane (e.g., "what sample size for a two-arm trial" — that's clinical-research, no question needed)?
Is there an artifact filename that resolves the lane (protocol.json → clinical; program-budget.json → finance; tam-model.json → market; interview-guide.md → product)?

If the workspace resolves the lane, route silently.

Step 2 — If still ambiguous, ONE forcing question with a recommended answer

Matt's rule: never bundle. Always recommend.

Pattern:

Q1/1: [precise question naming the two candidate lanes]
Recommended: [Lane X, because <signal-table rationale>]

(Confirm, or override?)

Step 3 — Decision-tree walk for multi-lane inquiries

If the inquiry legitimately crosses two lanes (e.g., "design this trial AND budget it" = CLINICAL + RD_FINANCE), walk depth-first:

Highest-confidence lane first → run sub-skill in forked context → digest
Ask: "Now run [second lane]? Recommended: yes, because [dependency]."
Confirm before chaining.

Never silently chain.

Step 4 — Invoke sub-skill in forked context

Forward original prompt + structured inputs (protocol JSON, program ledger CSV, market model, observation export).

Step 5 — Return digest with cited canon challenge

≤ 200 words: analyzed, top 3 findings (anchored to a canon citation), top 3 next actions (named human owner where applicable), artifact path, and one grill challenge for the user. Examples:

"Your power calc assumes a 0.5 effect size with no published anchor. ICH E9 requires a justified, clinically meaningful difference. Where did 0.5 come from?"
"Your TAM is a single top-down number (1% of a $40B market). Bessemer market-sizing discipline requires a bottoms-up cross-check. What's units × price × adoption?"

Forcing-question library (grill-with-docs pattern)

Grill the user on lane-defining decisions before invoking the sub-skill. One per turn, recommended answer, canon citation:

CLINICAL lane: "Is your primary endpoint a clinical outcome or a surrogate — and if surrogate, is it validated for this indication? Recommended: clinical outcome unless the surrogate is on FDA's validated table. Canon: FDA Surrogate Endpoint Table; BEST glossary."
RD_FINANCE lane: "Is this spend in the research phase or the development phase, and can you evidence technical feasibility? Recommended: research = expense; development = capitalize-candidate only with feasibility evidence, routed to a named finance owner. Canon: IAS 38; ASC 730."
MARKET lane: "Is your TAM top-down or bottoms-up — and have you computed it both ways to triangulate? Recommended: both; reconcile the delta. Canon: Bessemer / a16z market-sizing; Fermi estimation."
PRODUCT lane: "Is this study generative (discover problems) or evaluative (test a solution)? Recommended: name it first; the method follows. Canon: Rohrer's landscape of UX research methods (NN/g)."

Never run a sub-skill until the lane-defining decision is locked.

Onboarding-first (per sub-skill)

Before invoking a sub-skill for the first time in a workspace, point the user at that skill's onboarding questionnaire so the tools run pre-configured to their context:

python3 skills/<sub-skill>/scripts/onboard.py          # interactive Q&A
python3 skills/<sub-skill>/scripts/onboard.py --show    # questions + current config

Each sub-skill has its own question set (clinical: area/alpha/power/dropout/owners · finance: area/F&A/runway/standard/owner · market: profile/confidence/MoE/method · product: profile/insight-threshold/method/stakes). Answers persist to ~/.config/research-ops/<sub-skill>.json (or ./.research-ops/<sub-skill>.json with --scope project) and are consumed automatically by every tool in that skill. Customization is mandatory discipline here, not decoration — surface the onboarding step when a user starts a fresh research workstream.

Autoresearch handoff (isolated, opt-in)

Each sub-skill ships its own skills/<sub-skill>/scripts/ar_evaluator.py — an isolated bridge to engineering/autoresearch-agent. Invoke autoresearch only when the user explicitly asks to "optimize", "improve", or "run a loop". The handoff is per-skill (no shared coupling): the loop edits the skill's input file and the evaluator scores it (clinical → feasibility_composite higher; finance → runway_months higher; market → tam_divergence lower; product → validated_insights higher). Never auto-start a loop; never let the loop edit the evaluator.

Assumptions

User has research authority OR is preparing analysis for someone who does.
User wants deterministic decision support, not the final answer — a clinician approves the protocol, a controller books the entry, the human picks the market number.
Inputs may be partial — every sub-skill ships a templated sample so the user can see the shape before filling in their own.

Non-goals

Not an EDC, clinical-trial-management system, accounting system, survey platform, or research repository.
Does not give clinical, accounting, or legal advice as fact. Every output is a recommendation + named human owner.
Does not store research history across sessions.

Distinct from

research/ (academic) — that domain finds literature, grants, and patents. This domain plans, funds, scopes, and synthesizes research.
ra-qm-team — that's regulatory/QM submission (ISO 13485/14971, MDR, FDA 510(k)/PMA/QSR). clinical-research designs the study; it routes submission out to ra-qm-team.
finance/financial-analysis — that's corporate close + valuation. research-finance manages R&D program/portfolio spend.
research/grants — that's funding discovery. research-finance manages money already won.
product-team — that's persona/journey artifacts, discovery sprints, and live A/B experiments. product-research is the method + repository discipline.
marketing-skill — that's campaign analytics and demand-gen. market-research is upstream methodology.

Output artifacts

Sub-skill	Artifact
clinical-research	`protocol_synopsis.md` + `sample_size.json`
research-finance	`rd_program_budget.md` + `capex_opex_routing.json`
market-research	`market_sizing.md` + `sample_plan.json`
product-research	`research_plan.md` + `insight_synthesis.json`

Anti-patterns (do not)

❌ Present a clinical power/endpoint output as fact — it is an estimate with a named clinical owner
❌ Auto-decide capitalize-vs-expense — route to a named finance owner
❌ Report a market size as a single unsourced number — show method + both-ways triangulation + assumptions
❌ Assert a product insight from a single participant — flag it as an anecdote
❌ Run all 4 sub-skills "to be thorough" — pick one, digest, chain if needed

References

Clinical canon: ICH E8(R1)/E9/E9(R1), CONSORT, SPIRIT, FDA Multiple Endpoints
R&D finance canon: IAS 38, ASC 730, 2 CFR 200, Cooper stage-gate
Market canon: Cochran, Dillman, Kotler, Bessemer market-sizing
Product canon: Nielsen, Guest et al., Christensen JTBD, ResearchOps/Polaris
Path-B build pattern: documentation/implementation/research-ops-expansion-plan.md

Research Ops Skills