Litreview

Academic literature orientation skill that searches papers via Consensus, builds a strategic search plan using PICO (default) or SPIDER / Decomposition / hybrid as fallbacks, and synthesizes findings into a formatted Word (.docx) research guide. Grill-me intake (research question specificity + framework hint + tentative depth) before the recon search; a second forcing checkpoint after Phase 2 confirms framework + sub-areas + depth before searches consume budget. Configurable depth (5/10/20 queries) controls coverage vs. speed. Output is a 'launching pad' — an orientation guide that lets a researcher dive in confidently, not a finished review. Use when the user starts literature-oriented research (e.g., 'litreview on [topic]', 'literature review on [topic]', 'I'm starting a literature review on X', 'I'm writing a paper on X', 'help me research X', 'I'm doing research on X', 'can you help me research X'). Do NOT use for single one-off paper searches wanting a quick list — that's a plain Consensus search.

Published by @Alireza Rezvani·from alirezarezvani/claude-skills·0 agent reads / 30d·0 saves·

Litreview — Academic Literature Orientation

Portability: Requires a Consensus MCP connection, Node.js with docx package for document generation, and (in CLI) bash_tool. Works in Claude Code CLI natively. In Claude.ai with Consensus MCP + Code Execution, the workflow is supported.

Produce a launching pad — not a finished literature review, but an orientation document that gives a researcher entering an unfamiliar field everything they need to start reading and searching with confidence. Think: what a generous colleague who knows the field would tell you over coffee.

Agent Integrity Rules (Research-Pack Convention)

Inherited from the research-pack convention; locked verbatim per PR #657's cross-skill consistency audit.

  • Source discipline. Only cite Consensus-returned papers from THIS session. Training knowledge labeled [Not from Consensus — model knowledge] and excluded from cited count. Sparse results stated explicitly, never silently filled.
  • Counting discipline. Three numbers tracked: searches executed / unique papers received (deduplicated) / papers cited. Every cited paper has a retrievable Consensus URL from this session. Use scripts/citation_tracker.py for deterministic counts.
  • Tool constraints. Consensus per-query cap depends on plan tier. Detect at first search, report at checkpoint. Rate limit is 1 query/sec — sequential execution mandatory.
  • Retry policy. On failure → wait 3s → retry once → log. After 3 consecutive failures: stop, alert user, share what was collected.
  • Plan-tier detection. Parse first-search response for "Showing top 10" / "upgrade" → free tier (10/search). 20 returned → Pro (20/search). Calculate theoretical ceiling and surface at checkpoint so user can recalibrate.

See references/search_budget_allocation.md for the sequential-execution rationale + plan-tier signals.

Error Handling

FailureBehavior
Consensus rate-limit hitWait 3s, retry once, log outcome
Search returns 0 resultsNote explicitly; "either niche terminology or genuine gap"; never silently fill
Plan-tier cap detectedLog tier; report at checkpoint; surface in audit
3 consecutive failuresStop searching, alert user, share what's collected, ask how to proceed
Sub-area returns thin results (<5 papers)Flag in audit; suggest manual PubMed/Scholar supplementation
User wants to adjust sub-areasUpdate table, re-confirm before searching
DOCX validation failsUnpack XML, fix, repack

Phase 0: Grill-Me Intake (3 forcing questions, one at a time)

Each question carries explicit "why I'm asking". Stop condition: max 3 before Phase 1.

Q1 (root) — Research question specificity

State the research question in 1–2 sentences. Specific is better — "How do LLMs perform on clinical reasoning tasks compared to physicians?" beats "AI in medicine". Vague questions produce vague reviews.

Why I'm asking: The reconnaissance search hinges on precise terminology. Vague questions produce thin recon results that don't yield a useful framework breakdown.

Refuse mush. Re-ask once with examples if user is too broad. If still vague, deliver with explicit "broad-scope orientation, not depth review" caveat.

Q2 (depends on Q1) — Framework hint

Framework — pick one or say "you pick":

  1. PICO (Population / Intervention / Comparison / Outcome — most clinical questions)
  2. SPIDER (Sample / Phenomenon / Design / Evaluation / Research-type — social/qualitative)
  3. Decomposition (Problem / Solution / Evaluation / Limitations — technology-focused)
  4. Hybrid (you pick which components from which framework)
  5. You pick — analyze Q1 and recommend

Why I'm asking: PICO is the default for ~70% of clinical questions but maps poorly to qualitative work or technology evaluation. Picking upfront saves the recon search from suggesting a misaligned framework.

Forcing choice with default ("you pick"). The skill surfaces its own framework recommendation after the recon search so user can override. Use scripts/framework_recommender.py for the heuristic.

See references/framework_selection.md for PICO / SPIDER / Decomposition canon.

Q3 (depends on Q1) — Tentative depth

Tentative depth — pick one. Final confirmation comes after the framework breakdown:

  1. Quick scan (5 searches)
  2. Standard review (10 searches)
  3. Deep dive (20 searches)

Why I'm asking: I ask this twice — once now to calibrate the recon search emphasis, once after the framework breakdown to confirm. Tentative answer affects which sub-areas to surface first; final answer drives search budget allocation.

Forcing choice. Re-asked at the post-Phase-2 checkpoint after the user has seen the framework breakdown.

Stop condition: 3 questions max before Phase 1. The post-Phase-2 checkpoint is its own grill-me moment (framework table + sub-area-adjustment + depth-reconfirmation).

Phase 1: Initial Reconnaissance

One broad Consensus search to map themes, terminology, methodological distinctions.

  • Query: broad version of Q1 (terminology variants are okay; first search casts wide)
  • Record: citation_tracker.py --action record_search --session NAME --query "..."
  • Record received count: citation_tracker.py --action record_papers_received --session NAME --count N
  • Detect plan tier from response: "Showing top 10" / "upgrade" → free; 20 returned → Pro

Synthesize for the checkpoint:

  • Themes that surfaced
  • Terminology variations (e.g., "LLM" vs "large language model" vs "GPT-style model")
  • Methodological distinctions (clinical trials vs benchmark eval vs case study)
  • Coverage gaps (sub-questions absent from recon results)

Phase 2: Framework Selection + Sub-area Generation

Choose framework (from Q2 OR override based on recon):

  • PICO — most clinical questions (~70% default)
  • SPIDER — social / qualitative
  • Decomposition — technology focus (Problem / Solution / Evaluation / Limitations)
  • Hybrid — explicit cross-framework mapping

Generate 4-5 sub-area questions mapped to framework components. Each becomes a targeted Phase 3 search.

Checkpoint (grill-me forcing-options moment)

After Phase 2, halt and present:

3-4 sentence recon summary

  • What themes surfaced
  • Terminology landscape
  • Evidence landscape characterization

Framework breakdown table

Framework ComponentHow It Maps to This TopicProposed Sub-area to Explore
(Component 1)...Sub-area 1
(Component 2)...Sub-area 2
(Component 3)...Sub-area 3
(Component 4)...Sub-area 4
Cross-cutting theme...Sub-area 5

Depth re-confirmation (forcing choice)

Surface the practical constraint: detected plan tier + theoretical ceiling.

  • Quick scan (5 searches × ~10 results each = ~50 papers max)
  • Standard review (10 searches × ~10 = ~100 papers)
  • Deep dive (20 searches × ~10 = ~200 papers)

Sub-area forcing options

  • "Looks good — proceed with these sub-areas"
  • "Adjust: add sub-area on [X]"
  • "Adjust: remove and replace [Y] with [Z]"
  • "Restart with different framework"

Why I'm asking (the rationale)

A wrong framework or sub-area set wastes the search budget. This is the last cheap moment to correct course.

Wait for user response before Phase 3. Refuse to start Phase 3 without explicit user choice.

Phase 3: Targeted Searches

Sequential (1 query/sec), budget per depth tier. See references/search_budget_allocation.md for full canon.

Quick scan (5 searches)

  • 5 sub-area searches (one per sub-area)
  • Skip era-gated + review-specific

Standard review (10 searches)

  • 5 sub-area searches
  • 2 review article searches (top 2 sub-areas): "systematic review [topic]" / "meta-analysis [topic]"
  • 2 era-gated searches (most important sub-area): year_max: 2015 + year_min: 2021
  • 1 follow-up on highest-cited paper using its key terms + year_min after publication

Deep dive (20 searches)

  • 5 sub-area searches
  • 5 review article searches (one per sub-area)
  • 4 era-gated searches (top 2 sub-areas, old + new each)
  • 3 follow-ups on top 3 highest-cited papers
  • 3 spare for emerging threads (surprising findings to chase)

Throughout: 1 q/sec rate limit. Sequential. Confirm response before next call. Record each via citation_tracker.py.

Cross-Search Intelligence

Three trackers across ALL search results — run scripts/cross_search_aggregator.py --session NAME after Phase 3 completes:

  1. Repeat-hit papers — same paper appearing in 3+ sub-area searches = likely foundational
  2. Recurring authors — same author in multiple searches = dominant research group; top 3-5 most frequent matter
  3. Citation-per-year heuristic — a 2023 paper with 150 citations >> 2008 paper with 150 citations. Use for seminal-work identification.

These feed the "Start Here" + "Key Research Groups" + "Bibliography" DOCX sections.

Phase 4: DOCX Research Guide

Generate via Node.js + docx library. 8 sections (see references/docx_8_sections.md for full spec):

  1. Topic Overview — single tight paragraph (4-6 sentences)
  2. Start Here — Priority Reading Order — 5-7 papers ordered: best recent review → foundational → 2-3 frontier → gap/controversy. Each: hyperlinked title + authors/year + 1-sentence contribution + 1-sentence "what to look for"
  3. How the Field Got Here — chronological narrative (1-2 paragraphs) + timeline table (5-8 milestones: Year / Milestone / Significance) + terminology evolution note
  4. Sub-area Guides (one per sub-area, 4 parts each)
    • 4a. What the Research Shows (2-3 sentence synthesis with inline citations)
    • 4b. Key Papers (3-5 hyperlinked papers with citation count, year, 1-sentence importance)
    • 4c. Key Search Terms (6-10 keywords, synonyms, MeSH, historical terms)
    • 4d. Boolean Search Strings (2-3 ready-to-paste strings)
  5. Key Research Groups — top 3-5 authors/groups with affiliations, sub-area coverage, representative paper link (from cross-search aggregator)
  6. Open Questions & Gaps — three categories: methodological / population-context / conceptual-theoretical. Each gap explains why it matters.
  7. Bibliography — alphabetical by first author. Every entry has clickable "View on Consensus" link. Every inline citation matches a bibliography entry.
  8. Audit Log — search summary table (#, query, filters, papers returned, status), counts block, coverage notes including detected tier and theoretical ceiling

DOCX Technical Requirements

Document the key docx library patterns:

  • Page: US Letter, 1-inch margins
  • Lists: LevelFormat.BULLET (never unicode bullets)
  • Hyperlinks: ExternalHyperlink with style: "Hyperlink", full URL (never truncated)
  • Tables: dual widths (columnWidths + cell width), ShadingType.CLEAR
  • Validation step after save (zip-integrity check: python3 -c "import zipfile,sys; zipfile.ZipFile(sys.argv[1]).testzip()" output.docx — no output = intact — then confirm the required sections are present)

Reference the docx skill for setup patterns and best practices.

Output

research_guide_<topic-slug>_<YYYY-MM-DD>.docx

Plus:

  • Chat summary block: "Saved: . Audit: N searches × M unique papers / K cited. Plan tier: ."
  • Audit log printed inline if user asks for it

Tooling

ScriptRole
scripts/citation_tracker.pyJSON-backed three-count audit at ~/.litreview_sessions/<session>.json
scripts/framework_recommender.pyHeuristic PICO/SPIDER/Decomposition suggestion from research question
scripts/cross_search_aggregator.pyRepeat-hits + recurring-authors + citation-per-year ranking after Phase 3

References

  • references/framework_selection.md — PICO / SPIDER / Decomposition canon (7+ sources)
  • references/search_budget_allocation.md — depth tiers + cross-search intelligence + sequential execution rationale (7+ sources)
  • references/docx_8_sections.md — research guide DOCX spec + technical requirements (7+ sources)

Anti-Patterns To Reject

  • Parallelizing Consensus calls
  • Skipping the interactive checkpoint (running all searches without user confirmation)
  • Padding thin results with training knowledge
  • Defaulting to non-PICO framework without justification
  • Citing papers in chat that didn't come from Consensus this session
  • Hardcoding plan tier instead of detecting from first response
  • Skipping era-gated searches in standard/deep budgets
  • Skipping cross-search intelligence (repeat-hits, recurring authors)
  • Truncating Consensus URLs in hyperlinks

Version: 1.0.0 Source spec: megaprompts/09-litreview-megaprompt.md Build pattern: Path B (direct conversion). Sibling of pulse (research-pack shape).

Bundled with this artifact

8 files

Reference files that ship alongside this artifact. Agents pull these in only when the task needs them.

More on the bench

SKILL0

Torch Geometric

PyTorch Geometric (PyG) for graph neural networks — node/link/graph classification, message passing (GCN, GAT, GraphSAGE, GIN), heterogeneous graphs, neighbor sampling, and custom datasets. Use when working with torch_geometric, not for general NetworkX analytics or non-graph PyTorch models.

data-science-ml+2
0
SKILL0

Scvelo

RNA velocity analysis with scVelo. Estimate cell state transitions from unspliced/spliced mRNA dynamics, infer trajectory directions, compute latent time, and identify driver genes in single-cell RNA-seq data. Complements Scanpy/scVI-tools for trajectory inference.

data-science-ml+2
0
SKILL0

Scikit Bio

Biological data toolkit. Sequence analysis, alignments, phylogenetic trees, diversity metrics (alpha/beta, UniFrac), ordination (PCoA), PERMANOVA, FASTA/Newick I/O, for microbiome analysis.

data-science-ml+2
0