Setup

Set up a new autoresearch experiment interactively. Collects domain, target file, eval command, metric, direction, and evaluator. Use when the user runs /ar:setup or asks to start optimizing a file with the autoresearch loop.

Published by @Alireza Rezvani

/ar:setup — Create New Experiment

Set up a new autoresearch experiment with all required configuration.

Usage

/ar:setup                                    # Interactive mode
# Positional order: domain name target eval_cmd metric direction
/ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
/ar:setup --list                             # Show existing experiments
/ar:setup --list-evaluators                  # Show available evaluators

What It Does

If arguments are provided

Pass them directly to the setup script:

python {skill_path}/scripts/setup_experiment.py \
  --domain {domain} --name {name} \
  --target {target} --eval "{eval_cmd}" \
  --metric {metric} --direction {direction} \
  [--evaluator {evaluator}] [--scope {scope}]
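As a rough sketch of how the positional arguments from the Usage section might be mapped onto these flags, the Python below builds the argument list for the setup script. The helper name build_setup_command and the /path/to/skill value are hypothetical. Only the flag names come from the command above, and the check that direction is "lower" or "higher" is an assumption based on the interactive prompt.

```python
import sys
from typing import List, Optional


def build_setup_command(
    skill_path: str,
    domain: str,
    name: str,
    target: str,
    eval_cmd: str,
    metric: str,
    direction: str,
    evaluator: Optional[str] = None,
    scope: Optional[str] = None,
) -> List[str]:
    """Build the argv list for setup_experiment.py (hypothetical helper)."""
    # Assumption: the script accepts only these two direction values.
    if direction not in ("lower", "higher"):
        raise ValueError(f"direction must be 'lower' or 'higher', got {direction!r}")
    cmd = [
        sys.executable,
        f"{skill_path}/scripts/setup_experiment.py",
        "--domain", domain,
        "--name", name,
        "--target", target,
        "--eval", eval_cmd,  # one argv element, so spaces need no shell quoting
        "--metric", metric,
        "--direction", direction,
    ]
    # Optional flags are appended only when a value was supplied.
    if evaluator:
        cmd += ["--evaluator", evaluator]
    if scope:
        cmd += ["--scope", scope]
    return cmd


cmd = build_setup_command(
    "/path/to/skill",  # placeholder for {skill_path}
    "engineering", "api-speed", "src/api.py",
    "pytest bench.py", "p50_ms", "lower",
)
print(cmd[2:])
```

Passing the list to subprocess.run(cmd, check=True) would then launch the script without going through a shell, which keeps an eval command containing spaces intact.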

If no arguments are provided (interactive mode)

Collect each parameter one at a time:

  1. Domain — Ask: "What domain? (engineering, marketing, content, prompts, custom)"
  2. Name — Ask: "Experiment name? (e.g., api-speed, blog-titles)"
  3. Target file — Ask: "Which file to optimize?" Verify it exists.
  4. Eval command — Ask: "How to measure it? (e.g., pytest bench.py, python evaluate.py)"
  5. Metric — Ask: "What metric does the eval output? (e.g., p50_ms, ctr_score)"
  6. Direction — Ask: "Is lower or higher better?"
  7. Evaluator (optional) — Show built-in evaluators. Ask: "Use a built-in evaluator, or your own?"
  8. Scope — Ask: "Store in project (.autoresearch/) or user (~/.autoresearch/)?"

Then run setup_experiment.py with the collected parameters.
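The question-by-question flow above could be sketched as follows. This is an illustration only: collect_parameters and ask_until are hypothetical names, the accepted scope answers ("project" and "user") and direction answers ("lower" and "higher") are assumptions read off the prompts, and the re-ask-on-invalid behavior is one possible design rather than what the skill necessarily does.

```python
import os
import tempfile
from typing import Callable, Dict

DOMAINS = ("engineering", "marketing", "content", "prompts", "custom")


def collect_parameters(ask: Callable[[str], str] = input) -> Dict[str, str]:
    """Ask for each parameter in order, repeating a question until it is valid."""

    def ask_until(prompt: str, is_valid: Callable[[str], bool]) -> str:
        while True:
            answer = ask(prompt + " ").strip()
            if is_valid(answer):
                return answer

    params: Dict[str, str] = {}
    params["domain"] = ask_until(
        "What domain? (engineering, marketing, content, prompts, custom)",
        lambda a: a in DOMAINS,
    )
    params["name"] = ask_until("Experiment name? (e.g., api-speed, blog-titles)", bool)
    # Step 3 says to verify that the target file exists.
    params["target"] = ask_until("Which file to optimize?", os.path.isfile)
    params["eval"] = ask_until("How to measure it? (e.g., pytest bench.py)", bool)
    params["metric"] = ask_until("What metric does the eval output? (e.g., p50_ms)", bool)
    params["direction"] = ask_until(
        "Is lower or higher better?", lambda a: a in ("lower", "higher")
    )
    # The evaluator is optional, so a blank answer skips it.
    evaluator = ask("Use a built-in evaluator, or your own? (blank to skip) ").strip()
    if evaluator:
        params["evaluator"] = evaluator
    params["scope"] = ask_until(
        "Store in project (.autoresearch/) or user (~/.autoresearch/)?",
        lambda a: a in ("project", "user"),
    )
    return params


# Dry run with scripted answers instead of a live prompt.
# The first answer is an invalid domain, so that question is asked twice.
with tempfile.NamedTemporaryFile(suffix=".py") as tmp:
    scripted = iter([
        "sales", "engineering", "api-speed", tmp.name,
        "pytest bench.py", "p50_ms", "lower", "", "project",
    ])
    params = collect_parameters(lambda prompt: next(scripted))
print(params["domain"], params["direction"], params["scope"])
```

Injecting the ask function keeps the flow testable. With the default, collect_parameters() would prompt on the terminal through input.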

Listing

# Show existing experiments
python {skill_path}/scripts/setup_experiment.py --list

# Show available evaluators
python {skill_path}/scripts/setup_experiment.py --list-evaluators

Built-in Evaluators

Name               Metric                     Use Case
benchmark_speed    p50_ms (lower)             Function/API execution time
benchmark_size     size_bytes (lower)         File, bundle, Docker image size
test_pass_rate     pass_rate (higher)         Test suite pass percentage
build_speed        build_seconds (lower)      Build/compile/Docker build time
memory_usage       peak_mb (lower)            Peak memory during execution
llm_judge_content  ctr_score (higher)         Headlines, titles, descriptions
llm_judge_prompt   quality_score (higher)     System prompts, agent instructions
llm_judge_copy     engagement_score (higher)  Social posts, ad copy, emails
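To illustrate how each evaluator's metric and direction fit together, here is a minimal sketch that encodes the table as a lookup and uses the direction to decide whether a new measurement beats a baseline. The names EvaluatorInfo, BUILT_IN_EVALUATORS, and is_improvement are hypothetical, and the comparison rule is an assumption. The source does not say how the autoresearch loop actually compares results.

```python
from typing import Dict, NamedTuple


class EvaluatorInfo(NamedTuple):
    metric: str
    direction: str  # "lower" or "higher", as listed in the table
    use_case: str


# Transcribed from the built-in evaluators table.
BUILT_IN_EVALUATORS: Dict[str, EvaluatorInfo] = {
    "benchmark_speed": EvaluatorInfo("p50_ms", "lower", "Function/API execution time"),
    "benchmark_size": EvaluatorInfo("size_bytes", "lower", "File, bundle, Docker image size"),
    "test_pass_rate": EvaluatorInfo("pass_rate", "higher", "Test suite pass percentage"),
    "build_speed": EvaluatorInfo("build_seconds", "lower", "Build/compile/Docker build time"),
    "memory_usage": EvaluatorInfo("peak_mb", "lower", "Peak memory during execution"),
    "llm_judge_content": EvaluatorInfo("ctr_score", "higher", "Headlines, titles, descriptions"),
    "llm_judge_prompt": EvaluatorInfo("quality_score", "higher", "System prompts, agent instructions"),
    "llm_judge_copy": EvaluatorInfo("engagement_score", "higher", "Social posts, ad copy, emails"),
}


def is_improvement(evaluator: str, baseline: float, candidate: float) -> bool:
    """Return True when candidate beats baseline in the evaluator's direction."""
    info = BUILT_IN_EVALUATORS[evaluator]
    if info.direction == "lower":
        return candidate < baseline
    return candidate > baseline


# A drop in p50 latency counts as an improvement; a drop in pass rate does not.
print(is_improvement("benchmark_speed", baseline=120.0, candidate=95.0))
print(is_improvement("test_pass_rate", baseline=0.90, candidate=0.80))
```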

After Setup

Report to the user:

  • Experiment path and branch name
  • Whether the eval command ran successfully and, if so, the baseline metric value it produced
  • Suggest: "Run /ar:run {domain}/{name} to start iterating, or /ar:loop {domain}/{name} for autonomous mode."
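The report described above might be assembled along these lines. This is only a sketch: format_setup_report is a hypothetical helper, and the experiment path and branch name in the example are placeholders, since the source does not specify how either is formed.

```python
from typing import Optional


def format_setup_report(
    domain: str,
    name: str,
    experiment_path: str,
    branch: str,
    metric: str,
    baseline: Optional[float],
) -> str:
    """Summarize a finished setup; a baseline of None means the eval command failed."""
    lines = [
        f"Experiment path: {experiment_path}",
        f"Branch: {branch}",
    ]
    if baseline is None:
        lines.append("Eval command: failed, no baseline recorded")
    else:
        lines.append(f"Eval command: OK, baseline {metric} = {baseline}")
    lines.append(
        f"Run /ar:run {domain}/{name} to start iterating, "
        f"or /ar:loop {domain}/{name} for autonomous mode."
    )
    return "\n".join(lines)


report = format_setup_report(
    domain="engineering",
    name="api-speed",
    experiment_path="<experiment path>",  # placeholder, real layout not given in the source
    branch="<branch name>",               # placeholder, real naming not given in the source
    metric="p50_ms",
    baseline=120.0,
)
print(report)
```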
