AI Skills

name: research description: Use when user explicitly requests deep research or comprehensive analysis requiring 20+ authoritative sources. Creates an agent team for parallel research with source gate enforcement, confidence tracking, and structured synthesis. NOT for simple questions answerable with a single search.

<objective> Comprehensive research using an agent team, web search, and web scraping. Iteratively decomposes topics, gathers evidence from quality sources via parallel researcher teammates, and synthesizes findings into structured reports.

Core principle: Decompose questions, research in parallel with an agent team, evaluate confidence, iterate until sufficient, synthesize with source attribution. </objective>

<quick_start>

Run /research [topic] to start
Research continues automatically until targetSources is met
A Stop hook enforces the source gate—you cannot exit early
On completion, a resource usage report is displayed </quick_start>

<success_criteria> Task is complete when ALL of these are true:

state.json exists with valid JSON
sourcesGathered >= targetSources (primary gate - enforced by Stop hook)
All questions marked "done" with confidence ratings
report.md synthesizes findings with source attribution
phase is "DONE" in state.json
Conflicting information documented with source quality assessment
Gaps and limitations explicitly noted in report </success_criteria>

<when_to_use>

digraph when_research {
    "User request?" [shape=diamond];
    "Needs multiple sources?" [shape=diamond];
    "Quick answer sufficient?" [shape=box];
    "Use research skill" [shape=box];

    "User request?" -> "Needs multiple sources?" [label="deep research\ncomprehensive analysis\nthorough investigation"];
    "User request?" -> "Quick answer sufficient?" [label="simple question"];
    "Needs multiple sources?" -> "Use research skill" [label="yes"];
    "Needs multiple sources?" -> "Quick answer sufficient?" [label="no"];
}

Use when:

User explicitly asks for "deep research" or "comprehensive analysis"
Topic requires multiple authoritative sources
Need to track confidence and identify gaps
Want structured output with source attribution

Don't use when:

Simple factual question (single search sufficient)
User wants quick answer, not exhaustive report
Topic is too narrow for 8-question decomposition </when_to_use>

<required_tools>

Tool / Feature	Purpose	Required
`WebSearch`	Search queries (built-in)	Yes
Agent teams	Spawn parallel researcher teammates	Yes
`firecrawl-mcp:firecrawl_scrape`	Scrape full page content (preferred)	No
`WebFetch`	Fetch page content (built-in fallback)	Fallback

Prerequisite: Agent teams must be enabled (CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in settings or environment).

Tool Selection: In INIT phase, check if firecrawl-mcp:firecrawl_scrape is available. If not, use WebFetch (built-in). Record choice in state.json as "scraper": "firecrawl" or "scraper": "webfetch".

Tradeoffs:

firecrawl-mcp:firecrawl_scrape: Better content extraction, handles JS-rendered pages
WebFetch: Always available, sufficient for static pages </required_tools>

<state_machine>

INIT → DECOMPOSE → RESEARCH → EVALUATE → [RESEARCH or SYNTHESIZE] → DONE

State File: research/{slug}/state.json

{
  "topic": "string",
  "phase": "INIT|DECOMPOSE|RESEARCH|EVALUATE|SYNTHESIZE|DONE",
  "iteration": 0,
  "targetSources": 30,
  "sourcesGathered": 0,
  "totalSearches": 0,
  "teammateCompletions": 0,
  "codexCompletions": 0,
  "findingsCount": 0,
  "startTime": "ISO-8601 timestamp",
  "scraper": "firecrawl|webfetch",
  "questions": [{"id": 1, "text": "...", "status": "pending|done", "confidence": null}]
}

Rule: Read state.json before acting. Write state.json after acting. </state_machine>

<source_gate_enforcement> A Stop hook prevents the session from ending until sourcesGathered >= targetSources. This is a hard gate—you cannot bypass it by rationalizing.

How it works:

When you try to complete, the hook checks state.json
If sources are insufficient, exit is blocked and you're prompted to continue
Once target is met AND phase is DONE, exit is allowed with a resource report

Default target: 30 sources. Adjust in INIT phase based on topic complexity. </source_gate_enforcement>

<state_recovery> On skill invocation, first check for existing state:

If research/{slug}/state.json exists:
- Parse JSON; if invalid, offer to restart
- Resume from current phase
- Notify user: "Resuming research from {phase} phase"
Verify state consistency before resuming:
- RESEARCH: Ensure pending questions exist
- EVALUATE: Ensure findings.json has data
- SYNTHESIZE: Ensure all questions marked "done"
If inconsistent, offer user choice:
- Delete state and restart
- Attempt repair (mark incomplete questions as pending) </state_recovery>

<steps> <phase name="INIT"> 1. Generate slug from topic: - Lowercase the topic - Replace spaces with hyphens - Remove special characters (keep only `a-z`, `0-9`, `-`) - Truncate to 50 characters - Example: "AI in Healthcare 2024!" → `ai-in-healthcare-2024`

Detect available scraper:
- Check if firecrawl-mcp:firecrawl_scrape tool exists
- If firecrawl available → "scraper": "firecrawl"
- If not available → "scraper": "webfetch" (uses built-in WebFetch)
Create working directory:
```
mkdir -p research/{slug}
```
Determine target sources based on topic complexity:
- Narrow topic (specific question): 20 sources
- Standard topic (most research): 30 sources (default)
- Broad topic (comprehensive review): 40 sources

Initialize state files:

state.json:

{
  "topic": "...",
  "phase": "DECOMPOSE",
  "iteration": 0,
  "targetSources": 30,
  "sourcesGathered": 0,
  "totalSearches": 0,
  "teammateCompletions": 0,
  "codexCompletions": 0,
  "findingsCount": 0,
  "startTime": "2024-01-15T10:30:00Z",
  "scraper": "firecrawl|webfetch",
  "questions": []
}

findings.json:

[]

</phase> <phase name="DECOMPOSE"> Generate exactly 8 questions covering these angles:

#	Angle	Example
1	Definition/background	What is X? History and context?
2	Current state	What's happening now? Recent developments (last 1-2 years)?
3	Key entities	Who are the main people, companies, organizations?
4	Core mechanisms	How does it work? What are the processes?
5	Evidence and data	What studies, statistics, data exist?
6	Criticisms and limitations	What are the problems, risks, downsides?
7	Comparisons	How does it compare to alternatives?
8	Future developments	What's coming next? Predictions?

Add questions to state.json with status="pending". Set phase="RESEARCH". </phase>

<phase name="RESEARCH"> **Pre-flight:** Check if `codex` CLI is available (`command -v codex`). Output a status line: - If available: `"Codex detected — launching parallel background validation."` - If not: `"Codex not available — Claude-only research."`

Create an agent team to research pending questions in parallel. For each pending question, spawn both a Claude teammate AND a Codex background job.

Claude teammates: One per pending question (up to 8 at a time). Each works independently with its own context window.

Codex background jobs: For each pending question, write a prompt file and launch via run-engine.sh:

# For each question:
# 1. Write prompt to research/{slug}/codex-q{id}-prompt.txt
# 2. Launch: ./scripts/run-engine.sh codex research/{slug}/codex-q{id}-prompt.txt research/{slug}/codex-q{id}.json --timeout 120 &

The Codex prompt should instruct Codex to research the question using web search and return the same JSON format as the teammate return format below. Codex has MCP tool access including web search — its answers provide independent redundancy for cross-validation.

The prompt file should contain:

Research the following question using web search. Run at least 3 different search queries, scrape the top results, and extract specific facts with source URLs.

QUESTION: {QUESTION}

Return ONLY this JSON:
{"questionId": {ID}, "questionText": "{QUESTION}", "searchQueries": [...], "searchesRun": N, "urlsScraped": N, "scrapeFailures": [], "findings": [{"fact": "...", "sourceUrl": "...", "tier": 1}], "gaps": ["..."], "contradictions": ["..."], "confidence": "high|medium|low", "confidenceReason": "..."}

Read scraper from state.json and use the appropriate instructions when spawning each Claude teammate:

<teammate_instructions scraper="firecrawl"> You are a researcher teammate with access to WebSearch and firecrawl-mcp:firecrawl_scrape.

TASK: {QUESTION}

PROCESS:

Run exactly 4 searches:
- Core query
- Add "research" or "study"
- Add current year or "recent"
- Rephrase with synonyms
Rank URLs by quality:
- Tier 1: .gov, .edu, journals, official docs
- Tier 2: Reuters, AP, BBC, industry publications
- Tier 3: Company blogs, Wikipedia
- Skip: Forums, social media, SEO spam
Select top 4 URLs (prefer Tier 1-2)
Use firecrawl-mcp:firecrawl_scrape on each. Continue if one fails.
Extract specific facts with sources. </teammate_instructions>

<teammate_instructions scraper="webfetch"> You are a researcher teammate with access to WebSearch and WebFetch.

TASK: {QUESTION}

PROCESS:

Run exactly 4 searches:
- Core query
- Add "research" or "study"
- Add current year or "recent"
- Rephrase with synonyms
Rank URLs by quality:
- Tier 1: .gov, .edu, journals, official docs
- Tier 2: Reuters, AP, BBC, industry publications
- Tier 3: Company blogs, Wikipedia
- Skip: Forums, social media, SEO spam
Select top 4 URLs (prefer Tier 1-2)
Use WebFetch on each with a prompt like "Extract the main content and key facts from this page". Continue if one fails.
Extract specific facts with sources. </teammate_instructions>

<teammate_return_format> RETURN ONLY THIS JSON:

{
  "questionId": {ID},
  "questionText": "{QUESTION}",
  "searchQueries": ["query1", "query2", "query3", "query4"],
  "searchesRun": 4,
  "urlsScraped": 4,
  "scrapeFailures": [],
  "findings": [{"fact": "...", "sourceUrl": "...", "tier": 1}],
  "gaps": ["what you couldn't find"],
  "contradictions": ["X says A, Y says B"],
  "confidence": "high|medium|low",
  "confidenceReason": "..."
}

</teammate_return_format>

After each Claude teammate completes:

Validate JSON. Retry once if malformed.
Append to findings.json
Update state.json:
- Mark question done
- Increment totalSearches by searchesRun from response
- Increment teammateCompletions by 1
- Increment sourcesGathered by urlsScraped from response
- Increment findingsCount by length of findings array from response
Log progress: "Sources: {sourcesGathered}/{targetSources}"

After all Claude teammates AND Codex background jobs complete:

wait for all background Codex processes
Read each research/{slug}/codex-q{id}.json — skip any with status starting with "skipped"
Increment codexCompletions in state.json for each non-skipped result
Report a single Codex results status line: "Codex results: q1 ✓ q2 ✓ q3 ⏭ (timed out)" Use ✓ for completed, ⏭ for skipped/timed out.
Spawn the core:synthesizer agent in research mode. Provide:
- All Claude teammate findings from findings.json
- All Codex results from research/{slug}/codex-q{id}.json
- Instructions to operate in research mode
The synthesizer will cross-validate facts across both engines and return merged findings
Write synthesizer output to research/{slug}/synthesis.json (keep original findings.json intact for report generation)
Set phase="EVALUATE" </phase>

<phase name="EVALUATE"> Calculate metrics:

Metric	Calculation
`sourcesGathered`	from state.json (primary gate)
`targetSources`	from state.json
`avgConfidence`	high=3, medium=2, low=1, average all
`significantGaps`	unique gaps across findings

Decision table (two-stage):

Stage 1: Source Gate (MANDATORY)

sourcesGathered >= targetSources	→ Action
No	RESEARCH (forced, cannot proceed)
Yes	Continue to Stage 2

You MUST gather enough sources before considering other criteria.

Stage 2: Quality Gate (only if Stage 1 passes)

avgConfidence >= 2.5 AND gaps <= 2	→ Decision
Yes	SYNTHESIZE
No	RESEARCH (generate follow-ups)

Note: The Stop hook enforces the source gate—you cannot exit until target is met.

If continuing to RESEARCH:

Generate max 4 follow-up questions from gaps/contradictions
Add to questions with status="pending"
Increment iteration
Set phase="RESEARCH"
Log: "Continuing research: {sourcesGathered}/{targetSources} sources, need more to meet target" </phase>

<phase name="SYNTHESIZE"> Write `report.md`:

# {Topic}

## Executive Summary
[300-400 words. Most important finding first. State confidence. Note caveats.]

## Background
[200 words. Key terms. Context.]

## Key Findings

### [Theme 1]
[Grouped findings. Inline citations. Note source strength.]

### [Theme 2]
[3-5 themes total]

## Conflicting Information
[Both sides. Which has better sourcing.]

## Gaps & Limitations
[What's unknown. What needs more research.]

## Source Assessment
- **High confidence:** [claims with 3+ quality sources]
- **Medium confidence:** [claims with 1-2 sources]
- **Low confidence:** [single source or Tier 3 only]

## Sources

### Primary
[Tier 1 sources with URLs]

### Secondary
[Tier 2-3 sources with URLs]

---
*Sources: {sourcesGathered} | Searches: {totalSearches} | Teammates: {teammateCompletions} | Iterations: {iteration} | Duration: {duration} | Date: {date}*

Set phase="DONE".

On completion: The Stop hook will display a resource usage summary when you exit. </phase>

</steps>

<error_handling>

Error	Action
Malformed JSON	Retry once, then mark low confidence
Scrape fails	Continue with other URLs
Rate limit	Wait 60s, reduce batch to 2
No results	Mark low confidence, rephrase as follow-up
Tool not found	Fall back to WebFetch, update state.json
</error_handling>

<limits> | Resource | Default | Notes | |----------|---------|-------| | Target sources | 30 | Adjustable in INIT (20-40 based on complexity) | | Teammates per batch | 8 | Parallel research questions (one teammate per question) | | URLs per teammate | 4 | Sources scraped per question | | Follow-ups per iteration | 4 | New questions from gaps |

No hard iteration or search limits. The source gate is the primary constraint. Research continues until sourcesGathered >= targetSources. </limits>

<red_flags> STOP if you catch yourself thinking any of these:

Thought	Reality
"I have high confidence, I can skip the source target"	The Stop hook will block you. Gather the sources—it's non-negotiable.
"This topic is too broad for 8 questions"	Narrow the scope first. Don't start research on vague topics.
"I'll just synthesize what I have"	Check `sourcesGathered >= targetSources`. If not met, you cannot proceed.
"I don't need to update state.json"	You will lose track. Always read/write state.json.
"All sources are equal"	Weight Tier 1 sources higher in synthesis.
"I'm stuck, I'll just finish"	Narrow the scope or generate better follow-up questions. The source gate is non-negotiable.
</red_flags>

Name	research
Description	Use when user explicitly requests deep research or comprehensive analysis requiring 20+ authoritative sources. Creates an agent team for parallel research with source gate enforcement, confidence tracking, and structured synthesis. NOT for simple questions answerable with a single search.

research

SKILL.md