Agent Skill
2/7/2026

research

Use when user explicitly requests deep research or comprehensive analysis requiring 20+ authoritative sources. Creates an agent team for parallel research with source gate enforcement, confidence tracking, and structured synthesis. NOT for simple questions answerable with a single search.

guyathomas
npx skills add guyathomas/workflows

SKILL.md


name: research
description: Use when user explicitly requests deep research or comprehensive analysis requiring 20+ authoritative sources. Creates an agent team for parallel research with source gate enforcement, confidence tracking, and structured synthesis. NOT for simple questions answerable with a single search.

<objective> Comprehensive research using an agent team, web search, and web scraping. Iteratively decomposes topics, gathers evidence from quality sources via parallel researcher teammates, and synthesizes findings into structured reports.

Core principle: Decompose questions, research in parallel with an agent team, evaluate confidence, iterate until sufficient, synthesize with source attribution. </objective>

<quick_start>

  1. Run /research [topic] to start
  2. Research continues automatically until targetSources is met
  3. A Stop hook enforces the source gate—you cannot exit early
  4. On completion, a resource usage report is displayed </quick_start>

<success_criteria> Task is complete when ALL of these are true:

  • state.json exists with valid JSON
  • sourcesGathered >= targetSources (primary gate - enforced by Stop hook)
  • All questions marked "done" with confidence ratings
  • report.md synthesizes findings with source attribution
  • phase is "DONE" in state.json
  • Conflicting information documented with source quality assessment
  • Gaps and limitations explicitly noted in report </success_criteria>

<when_to_use>

digraph when_research {
    "User request?" [shape=diamond];
    "Needs multiple sources?" [shape=diamond];
    "Quick answer sufficient?" [shape=box];
    "Use research skill" [shape=box];

    "User request?" -> "Needs multiple sources?" [label="deep research\ncomprehensive analysis\nthorough investigation"];
    "User request?" -> "Quick answer sufficient?" [label="simple question"];
    "Needs multiple sources?" -> "Use research skill" [label="yes"];
    "Needs multiple sources?" -> "Quick answer sufficient?" [label="no"];
}

Use when:

  • User explicitly asks for "deep research" or "comprehensive analysis"
  • Topic requires multiple authoritative sources
  • Need to track confidence and identify gaps
  • Want structured output with source attribution

Don't use when:

  • Simple factual question (single search sufficient)
  • User wants quick answer, not exhaustive report
  • Topic is too narrow for 8-question decomposition </when_to_use>

<required_tools>

| Tool / Feature | Purpose | Required |
|----------------|---------|----------|
| WebSearch | Search queries (built-in) | Yes |
| Agent teams | Spawn parallel researcher teammates | Yes |
| firecrawl-mcp:firecrawl_scrape | Scrape full page content (preferred) | No |
| WebFetch | Fetch page content (built-in fallback) | Fallback |

Prerequisite: Agent teams must be enabled (CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in settings or environment).

Tool Selection: In INIT phase, check if firecrawl-mcp:firecrawl_scrape is available. If not, use WebFetch (built-in). Record choice in state.json as "scraper": "firecrawl" or "scraper": "webfetch".

Tradeoffs:

  • firecrawl-mcp:firecrawl_scrape: Better content extraction, handles JS-rendered pages
  • WebFetch: Always available, sufficient for static pages </required_tools>

<state_machine>

INIT → DECOMPOSE → RESEARCH → EVALUATE → [RESEARCH or SYNTHESIZE] → DONE

State File: research/{slug}/state.json

{
  "topic": "string",
  "phase": "INIT|DECOMPOSE|RESEARCH|EVALUATE|SYNTHESIZE|DONE",
  "iteration": 0,
  "targetSources": 30,
  "sourcesGathered": 0,
  "totalSearches": 0,
  "teammateCompletions": 0,
  "codexCompletions": 0,
  "findingsCount": 0,
  "startTime": "ISO-8601 timestamp",
  "scraper": "firecrawl|webfetch",
  "questions": [{"id": 1, "text": "...", "status": "pending|done", "confidence": null}]
}

Rule: Read state.json before acting. Write state.json after acting. </state_machine>
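
The read-before, write-after rule can be sketched as a pair of helpers. This is a minimal illustration, not part of the skill: the function names and the temp-file-then-rename write are assumptions; only the path layout `research/{slug}/state.json` comes from the text above.

```python
import json
import os
import tempfile
from pathlib import Path


def state_path(slug: str, root: str = "research") -> Path:
    """Location of the state file: research/{slug}/state.json."""
    return Path(root) / slug / "state.json"


def read_state(path: Path) -> dict:
    """Load state.json; raises json.JSONDecodeError if the file is not valid JSON."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def write_state(path: Path, state: dict) -> None:
    """Write to a temp file in the same directory, then rename over state.json.

    The rename step is an assumption of this sketch; it keeps a crash from
    leaving a half-written state file behind.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh, indent=2)
    os.replace(tmp, path)
```

Every phase would then start with `read_state(...)` and end with `write_state(...)`.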

<source_gate_enforcement> A Stop hook prevents the session from ending until sourcesGathered >= targetSources. This is a hard gate—you cannot bypass it by rationalizing.

How it works:

  1. When you try to complete, the hook checks state.json
  2. If sources are insufficient, exit is blocked and you're prompted to continue
  3. Once target is met AND phase is DONE, exit is allowed with a resource report

Default target: 30 sources. Adjust in INIT phase based on topic complexity. </source_gate_enforcement>
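
The gate logic itself is small. The sketch below is only an illustration of the check described above; the real Stop hook is not shown in this document, so the function name, return shape, and messages are hypothetical.

```python
def source_gate(state: dict) -> tuple:
    """Return (exit_allowed, message) for a parsed state.json.

    Mirrors the rule above: exit requires sourcesGathered >= targetSources
    AND phase == "DONE". Defaults (0 and 30) are assumptions for missing keys.
    """
    gathered = state.get("sourcesGathered", 0)
    target = state.get("targetSources", 30)
    if gathered < target:
        return False, f"Source gate not met: {gathered}/{target} sources. Continue research."
    if state.get("phase") != "DONE":
        return False, f"Sources met ({gathered}/{target}) but phase is not DONE."
    return True, f"Source gate met: {gathered}/{target} sources."
```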

<state_recovery> On skill invocation, first check for existing state:

  1. If research/{slug}/state.json exists:

    • Parse JSON; if invalid, offer to restart
    • Resume from current phase
    • Notify user: "Resuming research from {phase} phase"
  2. Verify state consistency before resuming:

    • RESEARCH: Ensure pending questions exist
    • EVALUATE: Ensure findings.json has data
    • SYNTHESIZE: Ensure all questions marked "done"
  3. If inconsistent, offer user choice:

    • Delete state and restart
    • Attempt repair (mark incomplete questions as pending) </state_recovery>
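
The consistency checks in step 2 could look roughly like this. It is a sketch under assumptions: the function names and problem messages are hypothetical, and `findings` stands for the parsed contents of findings.json.

```python
def consistency_problems(state: dict, findings: list) -> list:
    """Return a list of human-readable problems; an empty list means consistent."""
    phase = state.get("phase")
    statuses = [q.get("status") for q in state.get("questions", [])]
    problems = []
    if phase == "RESEARCH" and "pending" not in statuses:
        problems.append("RESEARCH phase but no pending questions")
    if phase == "EVALUATE" and not findings:
        problems.append("EVALUATE phase but findings.json is empty")
    if phase == "SYNTHESIZE" and any(s != "done" for s in statuses):
        problems.append("SYNTHESIZE phase but some questions are not done")
    return problems


def repair(state: dict) -> dict:
    """Repair option from step 3: mark every question that is not done as pending."""
    for question in state.get("questions", []):
        if question.get("status") != "done":
            question["status"] = "pending"
    return state
```

Which phase to resume from after a repair is not specified above, so the sketch leaves `phase` untouched.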
<steps> <phase name="INIT">

  1. Generate slug from topic:

    • Lowercase the topic
    • Replace spaces with hyphens
    • Remove special characters (keep only `a-z`, `0-9`, `-`)
    • Truncate to 50 characters
    • Example: "AI in Healthcare 2024!" → `ai-in-healthcare-2024`
  2. Detect available scraper:

    • Check if firecrawl-mcp:firecrawl_scrape tool exists
    • If firecrawl available → "scraper": "firecrawl"
    • If not available → "scraper": "webfetch" (uses built-in WebFetch)
  3. Create working directory:

    mkdir -p research/{slug}
    
  4. Determine target sources based on topic complexity:

    • Narrow topic (specific question): 20 sources
    • Standard topic (most research): 30 sources (default)
    • Broad topic (comprehensive review): 40 sources
  5. Initialize state files:

    state.json:

    {
      "topic": "...",
      "phase": "DECOMPOSE",
      "iteration": 0,
      "targetSources": 30,
      "sourcesGathered": 0,
      "totalSearches": 0,
      "teammateCompletions": 0,
      "codexCompletions": 0,
      "findingsCount": 0,
      "startTime": "2024-01-15T10:30:00Z",
      "scraper": "firecrawl|webfetch",
      "questions": []
    }
    

    findings.json:

    []
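
The INIT steps above can be sketched in a few lines. The slug rules, target values, and state fields are taken from the text; the function names and the `complexity` labels ("narrow", "standard", "broad") are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Target source counts by topic complexity, as listed in the INIT steps.
TARGETS = {"narrow": 20, "standard": 30, "broad": 40}


def slugify(topic: str, max_len: int = 50) -> str:
    """Lowercase, spaces to hyphens, drop anything outside a-z 0-9 -, truncate."""
    slug = topic.lower().replace(" ", "-")
    slug = re.sub(r"[^a-z0-9-]", "", slug)
    return slug[:max_len]


def initial_state(topic: str, scraper: str, complexity: str = "standard") -> dict:
    """Build the state.json written at the end of INIT (phase already DECOMPOSE)."""
    return {
        "topic": topic,
        "phase": "DECOMPOSE",
        "iteration": 0,
        "targetSources": TARGETS[complexity],
        "sourcesGathered": 0,
        "totalSearches": 0,
        "teammateCompletions": 0,
        "codexCompletions": 0,
        "findingsCount": 0,
        "startTime": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "scraper": scraper,
        "questions": [],
    }
```

For the example topic, `slugify("AI in Healthcare 2024!")` returns `ai-in-healthcare-2024`.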
    
</phase> <phase name="DECOMPOSE"> Generate exactly 8 questions covering these angles:

| # | Angle | Example |
|---|-------|---------|
| 1 | Definition/background | What is X? History and context? |
| 2 | Current state | What's happening now? Recent developments (last 1-2 years)? |
| 3 | Key entities | Who are the main people, companies, organizations? |
| 4 | Core mechanisms | How does it work? What are the processes? |
| 5 | Evidence and data | What studies, statistics, data exist? |
| 6 | Criticisms and limitations | What are the problems, risks, downsides? |
| 7 | Comparisons | How does it compare to alternatives? |
| 8 | Future developments | What's coming next? Predictions? |

Add questions to state.json with status="pending". Set phase="RESEARCH". </phase>
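
As a rough sketch of this step, the questions can be attached to the state in the shape shown in the state file schema. The helper name is hypothetical, and the question texts themselves still have to be written for the topic at hand.

```python
# The eight angles from the DECOMPOSE table, in order.
ANGLES = [
    "Definition/background",
    "Current state",
    "Key entities",
    "Core mechanisms",
    "Evidence and data",
    "Criticisms and limitations",
    "Comparisons",
    "Future developments",
]


def add_questions(state: dict, texts: list) -> dict:
    """Store one pending question per angle and move the phase to RESEARCH."""
    if len(texts) != len(ANGLES):
        raise ValueError(f"expected exactly {len(ANGLES)} questions, got {len(texts)}")
    state["questions"] = [
        {"id": i, "text": text, "status": "pending", "confidence": None}
        for i, text in enumerate(texts, start=1)
    ]
    state["phase"] = "RESEARCH"
    return state
```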

<phase name="RESEARCH"> **Pre-flight:** Check if `codex` CLI is available (`command -v codex`). Output a status line:

  • If available: `"Codex detected — launching parallel background validation."`
  • If not: `"Codex not available — Claude-only research."`

Create an agent team to research pending questions in parallel. For each pending question, spawn a Claude teammate and, if Codex is available, a Codex background job as well.

Claude teammates: One per pending question (up to 8 at a time). Each works independently with its own context window.

Codex background jobs: For each pending question, write a prompt file and launch via run-engine.sh:

# For each question:
# 1. Write prompt to research/{slug}/codex-q{id}-prompt.txt
# 2. Launch: ./scripts/run-engine.sh codex research/{slug}/codex-q{id}-prompt.txt research/{slug}/codex-q{id}.json --timeout 120 &

The Codex prompt should instruct Codex to research the question using web search and return the same JSON format as the teammate return format below. Codex has MCP tool access including web search — its answers provide independent redundancy for cross-validation.

The prompt file should contain:

Research the following question using web search. Run at least 3 different search queries, scrape the top results, and extract specific facts with source URLs.

QUESTION: {QUESTION}

Return ONLY this JSON:
{"questionId": {ID}, "questionText": "{QUESTION}", "searchQueries": [...], "searchesRun": N, "urlsScraped": N, "scrapeFailures": [], "findings": [{"fact": "...", "sourceUrl": "...", "tier": 1}], "gaps": ["..."], "contradictions": ["..."], "confidence": "high|medium|low", "confidenceReason": "..."}
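
A sketch of how the prompt file and launch command could be assembled. The file names, the `run-engine.sh` arguments, and the prompt wording are copied from the text above; the Python function names are hypothetical, and the script itself is not shown in this document.

```python
from pathlib import Path

RETURN_SHAPE = (
    '{"questionId": {ID}, "questionText": "{QUESTION}", "searchQueries": [...], '
    '"searchesRun": N, "urlsScraped": N, "scrapeFailures": [], '
    '"findings": [{"fact": "...", "sourceUrl": "...", "tier": 1}], '
    '"gaps": ["..."], "contradictions": ["..."], '
    '"confidence": "high|medium|low", "confidenceReason": "..."}'
)
PROMPT = (
    "Research the following question using web search. Run at least 3 different "
    "search queries, scrape the top results, and extract specific facts with source URLs.\n\n"
    "QUESTION: {QUESTION}\n\n"
    "Return ONLY this JSON:\n" + RETURN_SHAPE + "\n"
)


def render_prompt(qid: int, question: str) -> str:
    """Fill the {ID} and {QUESTION} placeholders by plain string replacement."""
    return PROMPT.replace("{ID}", str(qid)).replace("{QUESTION}", question)


def write_prompt(slug: str, qid: int, question: str, root: str = "research") -> Path:
    """Write research/{slug}/codex-q{id}-prompt.txt and return its path."""
    path = Path(root) / slug / f"codex-q{qid}-prompt.txt"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_prompt(qid, question), encoding="utf-8")
    return path


def codex_command(slug: str, qid: int, root: str = "research") -> list:
    """Argument list matching the launch line shown in the shell comments."""
    base = Path(root) / slug
    return [
        "./scripts/run-engine.sh", "codex",
        (base / f"codex-q{qid}-prompt.txt").as_posix(),
        (base / f"codex-q{qid}.json").as_posix(),
        "--timeout", "120",
    ]
```

Passing the list to `subprocess.Popen` without waiting would play the role of the trailing `&` in the shell form.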

Read scraper from state.json and use the appropriate instructions when spawning each Claude teammate:

<teammate_instructions scraper="firecrawl"> You are a researcher teammate with access to WebSearch and firecrawl-mcp:firecrawl_scrape.

TASK: {QUESTION}

PROCESS:

  1. Run exactly 4 searches:

    • Core query
    • Add "research" or "study"
    • Add current year or "recent"
    • Rephrase with synonyms
  2. Rank URLs by quality:

    • Tier 1: .gov, .edu, journals, official docs
    • Tier 2: Reuters, AP, BBC, industry publications
    • Tier 3: Company blogs, Wikipedia
    • Skip: Forums, social media, SEO spam
  3. Select top 4 URLs (prefer Tier 1-2)

  4. Use firecrawl-mcp:firecrawl_scrape on each. Continue if one fails.

  5. Extract specific facts with sources. </teammate_instructions>

<teammate_instructions scraper="webfetch"> You are a researcher teammate with access to WebSearch and WebFetch.

TASK: {QUESTION}

PROCESS:

  1. Run exactly 4 searches:

    • Core query
    • Add "research" or "study"
    • Add current year or "recent"
    • Rephrase with synonyms
  2. Rank URLs by quality:

    • Tier 1: .gov, .edu, journals, official docs
    • Tier 2: Reuters, AP, BBC, industry publications
    • Tier 3: Company blogs, Wikipedia
    • Skip: Forums, social media, SEO spam
  3. Select top 4 URLs (prefer Tier 1-2)

  4. Use WebFetch on each with a prompt like "Extract the main content and key facts from this page". Continue if one fails.

  5. Extract specific facts with sources. </teammate_instructions>
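
The ranking step in both instruction sets can be approximated mechanically. The sketch below is a heuristic only: the domain lists are assumptions chosen for illustration, journals, official docs, and industry publications cannot be recognized from the hostname alone, and unknown hosts default to Tier 3.

```python
from typing import Optional
from urllib.parse import urlparse

# Illustrative domain lists; assumptions, not part of the skill.
TIER2_DOMAINS = {"reuters.com", "apnews.com", "bbc.com", "bbc.co.uk"}
SKIP_DOMAINS = {"reddit.com", "twitter.com", "x.com", "facebook.com", "quora.com"}


def _matches(host: str, domains: set) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def tier(url: str) -> Optional[int]:
    """Return 1, 2, or 3 for the source tier, or None if the URL should be skipped."""
    host = (urlparse(url).hostname or "").lower()
    if _matches(host, SKIP_DOMAINS):
        return None
    if host.endswith(".gov") or host.endswith(".edu"):
        return 1
    if _matches(host, TIER2_DOMAINS):
        return 2
    return 3  # everything else, including company blogs and Wikipedia


def select_top(urls: list, limit: int = 4) -> list:
    """Drop skipped URLs, sort by tier (stable, so search order breaks ties), keep the top few."""
    kept = [(t, u) for t, u in ((tier(u), u) for u in urls) if t is not None]
    kept.sort(key=lambda pair: pair[0])
    return [u for _, u in kept[:limit]]
```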

<teammate_return_format> RETURN ONLY THIS JSON:

{
  "questionId": {ID},
  "questionText": "{QUESTION}",
  "searchQueries": ["query1", "query2", "query3", "query4"],
  "searchesRun": 4,
  "urlsScraped": 4,
  "scrapeFailures": [],
  "findings": [{"fact": "...", "sourceUrl": "...", "tier": 1}],
  "gaps": ["what you couldn't find"],
  "contradictions": ["X says A, Y says B"],
  "confidence": "high|medium|low",
  "confidenceReason": "..."
}

</teammate_return_format>
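
A minimal validator for this return format might look like the following. The field names and the three confidence values come from the format above; the function name and error messages are hypothetical.

```python
import json

# Expected top-level fields and their JSON types, per the return format above.
REQUIRED = {
    "questionId": int, "questionText": str, "searchQueries": list,
    "searchesRun": int, "urlsScraped": int, "scrapeFailures": list,
    "findings": list, "gaps": list, "contradictions": list,
    "confidence": str, "confidenceReason": str,
}


def parse_result(raw: str) -> dict:
    """Parse a teammate response; raise ValueError if it does not fit the format."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    for key, expected in REQUIRED.items():
        if not isinstance(data.get(key), expected):
            raise ValueError(f"missing or mistyped field: {key}")
    if data["confidence"] not in ("high", "medium", "low"):
        raise ValueError("confidence must be high, medium, or low")
    for finding in data["findings"]:
        if not isinstance(finding, dict) or not {"fact", "sourceUrl", "tier"} <= set(finding):
            raise ValueError("each finding needs fact, sourceUrl, and tier")
    return data
```

A caller would catch `ValueError`, ask the teammate once more, and fall back to the error-handling rules if the retry also fails.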

After each Claude teammate completes:

  1. Validate JSON. Retry once if malformed.
  2. Append to findings.json
  3. Update state.json:
    • Mark question done
    • Increment totalSearches by searchesRun from response
    • Increment teammateCompletions by 1
    • Increment sourcesGathered by urlsScraped from response
    • Increment findingsCount by length of findings array from response
  4. Log progress: "Sources: {sourcesGathered}/{targetSources}"
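
The bookkeeping in step 3 amounts to a handful of counter updates. In this sketch the function name is hypothetical, and copying the teammate's confidence onto the question is an assumption, made because the success criteria call for confidence ratings on every question.

```python
def apply_result(state: dict, result: dict) -> str:
    """Fold one validated teammate result into state and return the progress line."""
    for question in state["questions"]:
        if question["id"] == result["questionId"]:
            question["status"] = "done"
            question["confidence"] = result["confidence"]  # assumption, see lead-in
    state["totalSearches"] += result["searchesRun"]
    state["teammateCompletions"] += 1
    state["sourcesGathered"] += result["urlsScraped"]
    state["findingsCount"] += len(result["findings"])
    return f"Sources: {state['sourcesGathered']}/{state['targetSources']}"
```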

After all Claude teammates AND Codex background jobs complete:

  1. Wait for all background Codex processes to finish
  2. Read each research/{slug}/codex-q{id}.json — skip any with status starting with "skipped"
  3. Increment codexCompletions in state.json for each non-skipped result
  4. Report a single Codex results status line: "Codex results: q1 ✓ q2 ✓ q3 ⏭ (timed out)" Use ✓ for completed, ⏭ for skipped/timed out.
  5. Spawn the core:synthesizer agent in research mode. Provide:
    • All Claude teammate findings from findings.json
    • All Codex results from research/{slug}/codex-q{id}.json
    • Instructions to operate in research mode
  6. The synthesizer will cross-validate facts across both engines and return merged findings
  7. Write synthesizer output to research/{slug}/synthesis.json (keep original findings.json intact for report generation)
  8. Set phase="EVALUATE" </phase>
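
Steps 2 to 4 of the Codex wrap-up can be sketched as below. The only detail taken from the text is that a skipped result has a `status` starting with "skipped"; treating every other result as completed, and omitting the parenthesized reason from the status line, are assumptions of this sketch.

```python
def codex_status_line(results: dict) -> tuple:
    """Return (status_line, completed_count) for a mapping of question id to parsed Codex JSON."""
    parts = []
    completed = 0
    for qid in sorted(results):
        status = str(results[qid].get("status", ""))
        if status.startswith("skipped"):
            parts.append(f"q{qid} ⏭")
        else:
            parts.append(f"q{qid} ✓")
            completed += 1
    return "Codex results: " + " ".join(parts), completed
```

The returned count is the amount to add to `codexCompletions` in state.json.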
<phase name="EVALUATE"> Calculate metrics:

| Metric | Calculation |
|--------|-------------|
| sourcesGathered | from state.json (primary gate) |
| targetSources | from state.json |
| avgConfidence | high=3, medium=2, low=1, average all |
| significantGaps | unique gaps across findings |

Decision table (two-stage):

Stage 1: Source Gate (MANDATORY)

| sourcesGathered >= targetSources | Action |
|----------------------------------|--------|
| No | RESEARCH (forced, cannot proceed) |
| Yes | Continue to Stage 2 |

You MUST gather enough sources before considering other criteria.

Stage 2: Quality Gate (only if Stage 1 passes)

| avgConfidence >= 2.5 AND significantGaps <= 2 | Decision |
|-----------------------------------------------|----------|
| Yes | SYNTHESIZE |
| No | RESEARCH (generate follow-ups) |

Note: The Stop hook enforces the source gate—you cannot exit until target is met.

If continuing to RESEARCH:

  1. Generate max 4 follow-up questions from gaps/contradictions
  2. Add to questions with status="pending"
  3. Increment iteration
  4. Set phase="RESEARCH"
  5. Log: "Continuing research: {sourcesGathered}/{targetSources} sources, need more to meet target" </phase>
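
The two-stage decision can be written down directly. This sketch assumes `findings` is the list of teammate results from findings.json and that "unique gaps" means de-duplication by exact string; the function name and the returned dictionary shape are hypothetical.

```python
SCORES = {"high": 3, "medium": 2, "low": 1}


def evaluate(state: dict, findings: list) -> dict:
    """Compute the EVALUATE metrics and pick the next phase."""
    scores = [SCORES[f["confidence"]] for f in findings]
    avg_confidence = sum(scores) / len(scores) if scores else 0.0
    gaps = {gap for f in findings for gap in f.get("gaps", [])}

    # Stage 1: the source gate is mandatory and checked first.
    if state["sourcesGathered"] < state["targetSources"]:
        next_phase = "RESEARCH"
    # Stage 2: the quality gate only matters once Stage 1 passes.
    elif avg_confidence >= 2.5 and len(gaps) <= 2:
        next_phase = "SYNTHESIZE"
    else:
        next_phase = "RESEARCH"
    return {"avgConfidence": avg_confidence, "significantGaps": len(gaps), "nextPhase": next_phase}
```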
<phase name="SYNTHESIZE"> Write `report.md`:
# {Topic}

## Executive Summary
[300-400 words. Most important finding first. State confidence. Note caveats.]

## Background
[200 words. Key terms. Context.]

## Key Findings

### [Theme 1]
[Grouped findings. Inline citations. Note source strength.]

### [Theme 2]
[3-5 themes total]

## Conflicting Information
[Both sides. Which has better sourcing.]

## Gaps & Limitations
[What's unknown. What needs more research.]

## Source Assessment
- **High confidence:** [claims with 3+ quality sources]
- **Medium confidence:** [claims with 1-2 sources]
- **Low confidence:** [single source or Tier 3 only]

## Sources

### Primary
[Tier 1 sources with URLs]

### Secondary
[Tier 2-3 sources with URLs]

---
*Sources: {sourcesGathered} | Searches: {totalSearches} | Teammates: {teammateCompletions} | Iterations: {iteration} | Duration: {duration} | Date: {date}*

Set phase="DONE".

On completion: The Stop hook will display a resource usage summary when you exit. </phase>
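
The footer line of the report template can be filled from state.json. The template does not say how `{duration}` or `{date}` should be formatted, so whole minutes and `YYYY-MM-DD` are assumptions here, as is the function name.

```python
from datetime import datetime, timezone


def report_footer(state: dict, now=None) -> str:
    """Render the italic stats line that closes report.md."""
    now = now or datetime.now(timezone.utc)
    # startTime is stored as ISO-8601 with a trailing Z; convert it to an aware datetime.
    start = datetime.fromisoformat(state["startTime"].replace("Z", "+00:00"))
    minutes = int((now - start).total_seconds() // 60)
    return (
        f"*Sources: {state['sourcesGathered']} | Searches: {state['totalSearches']} | "
        f"Teammates: {state['teammateCompletions']} | Iterations: {state['iteration']} | "
        f"Duration: {minutes} min | Date: {now:%Y-%m-%d}*"
    )
```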

</steps>

<error_handling>

| Error | Action |
|-------|--------|
| Malformed JSON | Retry once, then mark low confidence |
| Scrape fails | Continue with other URLs |
| Rate limit | Wait 60s, reduce batch to 2 |
| No results | Mark low confidence, rephrase as follow-up |
| Tool not found | Fall back to WebFetch, update state.json |
</error_handling>
<limits>

| Resource | Default | Notes |
|----------|---------|-------|
| Target sources | 30 | Adjustable in INIT (20-40 based on complexity) |
| Teammates per batch | 8 | Parallel research questions (one teammate per question) |
| URLs per teammate | 4 | Sources scraped per question |
| Follow-ups per iteration | 4 | New questions from gaps |

No hard iteration or search limits. The source gate is the primary constraint. Research continues until sourcesGathered >= targetSources. </limits>
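
The batch limit, together with the rate-limit fallback from the error-handling table, can be sketched as a simple chunking helper. The function name and the `rate_limited` flag are hypothetical; the sizes 8 and 2 come from the tables.

```python
def plan_batches(pending: list, rate_limited: bool = False) -> list:
    """Split pending questions into batches of 8, or 2 after a rate limit."""
    size = 2 if rate_limited else 8
    return [pending[i:i + size] for i in range(0, len(pending), size)]
```

With ten pending questions this gives one batch of eight and one of two; after a rate limit, the same ten become five batches of two.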

<red_flags> STOP if you catch yourself thinking any of these:

| Thought | Reality |
|---------|---------|
| "I have high confidence, I can skip the source target" | The Stop hook will block you. Gather the sources; it's non-negotiable. |
| "This topic is too broad for 8 questions" | Narrow the scope first. Don't start research on vague topics. |
| "I'll just synthesize what I have" | Check sourcesGathered >= targetSources. If not met, you cannot proceed. |
| "I don't need to update state.json" | You will lose track. Always read/write state.json. |
| "All sources are equal" | Weight Tier 1 sources higher in synthesis. |
| "I'm stuck, I'll just finish" | Narrow the scope or generate better follow-up questions. The source gate is non-negotiable. |
</red_flags>