search-agent-conversation
Search past agent conversations, chats, prompt history, and agent data (Codex/Claude). Use when user asks to "search agent chat", "search agent conversation", "find conversation", "agent prompt history", "what did agent X do", "query agent data", or needs to look up past agent activity. PREREQUISITE - Always read dump-agent-data-for-rg skill first.
Search Agent Conversation
[Created by Claude: 7f04f921-3ad9-48b4-a951-f8227c466a3e]
PREREQUISITE
MUST read first: dump-agent-data-for-rg skill for technical details about the dump tool, command flags, and output format.
This skill focuses on search strategies after you've dumped the data.
For Claude Code Agents (Delegation)
When the user asks about past agent activity, ALWAYS delegate to a subagent:
Task(
subagent_type="general-purpose",
model="opus", # Use opus for quality
prompt="""Read the dump-agent-data-for-rg skill, then:
1. Dump agent conversations for the appropriate time range
2. Search for: [extracted keywords from user query]
3. Report findings with session IDs, timestamps, and relevant snippets
User's question: [paste user's original question here]
"""
)
Why delegate: Searching conversations can be intensive. Let a dedicated subagent handle dump+search while you focus on answering the user.
For Codex Agents (Direct Execution)
Codex agents: Follow the workflow below directly (no delegation needed).
🚀 FAST QUERY: --prompt-only Mode
When you only need user prompts (no assistant responses, no tool calls), use --prompt-only:
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
--start-hour 72 --end-hour 0 --prompt-only
This outputs JSON to stdout with all sessions and their user prompts:
[
{
"agent": "claude",
"sid": "03d9e752-394b-477b-84d5-b7770322c09f",
"pid": 0,
"prompt_count": 3,
"user_prompts": [
{"t": "2026-01-23 12:00:01", "prompt": "Upgrade the repo..."},
{"t": "2026-01-23 12:15:30", "prompt": "Add tests..."},
...
]
}
]
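Because the output is plain JSON, a short Python filter is usually enough to find sessions by prompt keyword. The sketch below assumes the schema shown above (`agent`, `sid`, and `user_prompts` entries with `t` and `prompt`). The function name `sessions_mentioning` and the sample data are illustrative and not part of the dump tool.

```python
import json


def sessions_mentioning(sessions, keyword):
    """Return (agent, sid, timestamp, prompt) tuples for prompts containing keyword.

    Matching is a case-insensitive substring test, not a regex.
    """
    needle = keyword.lower()
    hits = []
    for session in sessions:
        for entry in session.get("user_prompts", []):
            if needle in entry.get("prompt", "").lower():
                hits.append((session["agent"], session["sid"], entry["t"], entry["prompt"]))
    return hits


# Illustrative sample mirroring the schema above. In practice, parse the
# dump tool's stdout instead, e.g. json.load(sys.stdin).
sample = json.loads("""
[
  {"agent": "claude", "sid": "03d9e752-394b-477b-84d5-b7770322c09f", "pid": 0,
   "prompt_count": 2,
   "user_prompts": [
     {"t": "2026-01-23 12:00:01", "prompt": "Upgrade the repo..."},
     {"t": "2026-01-23 12:15:30", "prompt": "Add tests..."}
   ]}
]
""")

for agent, sid, t, prompt in sessions_mentioning(sample, "tests"):
    print(f"{agent} {sid} {t}: {prompt}")
```

Piping the `--prompt-only` output into a script like this avoids writing a full dump to disk when only user prompts matter.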
When to Use --prompt-only vs Full Dump
| Task | Approach |
|---|---|
| "Find agents that worked on X" | --prompt-only + jq/Python filter |
| "Get timeline of user requests" | --prompt-only |
| "List sessions with specific prompt keywords" | --prompt-only |
| "Search assistant responses" | Full dump + rg |
| "Find specific tool calls" | Full dump + rg |
| "Search reasoning/thinking" | Full dump + rg + --include-reasoning |
⚠️ NEVER query SQLite directly when --prompt-only can do the job!
⚠️ When to Regenerate Data
If the user's request obviously requires the most recent info, you MUST delete the old dump and regenerate it.
Examples requiring fresh data:
- "Check agent X to see if there's any update" (user just asked agent to check something)
- "What did the codex agent do in the last 10 minutes?"
- "Has claude made progress on the task?"
"Latest data" is non-negotiable when inherent in the task. Use your judgment about "how recent" - but err on the side of regenerating. The script is very fast.
Rule of thumb: If in doubt, delete and regenerate. It only takes a few seconds.
Workflow
Step 1: Determine Time Range
From the user's query, infer:
| User Query Pattern | Time Range |
|---|---|
| "recent", "recently", "latest", "just now" | Last 1-2 hours (--start-hour 2 --end-hour 0) |
| "today", "this morning" | Last 24 hours (--start-hour 24 --end-hour 0) |
| "this week", "last few days" | Last 72 hours (--start-hour 72 --end-hour 0) |
| "last N hours/days" | Use specified range |
| No time specified | Default to 24 hours |
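The mapping in this table can be sketched as a small helper. This is only an illustration of the heuristic: the function name `infer_start_hour` is hypothetical, phrase matching is a plain substring test, and minute-based requests are rounded up to a whole hour on the assumption that `--start-hour` takes integer hours.

```python
import math
import re


def infer_start_hour(query, default=24):
    """Map time phrases in a user query to a --start-hour value (hours back from now)."""
    q = query.lower()
    # An explicit "last N minutes/hours/days" wins over vaguer phrases.
    m = re.search(r"last\s+(\d+)\s+(minute|hour|day)s?", q)
    if m:
        n, unit = int(m.group(1)), m.group(2)
        if unit == "minute":
            return max(1, math.ceil(n / 60))  # round up to at least one hour
        return n if unit == "hour" else n * 24
    if any(p in q for p in ("recent", "latest", "just now")):
        return 2
    if any(p in q for p in ("today", "this morning")):
        return 24
    if any(p in q for p in ("this week", "last few days")):
        return 72
    return default  # no time specified


print(infer_start_hour("What did the codex agent do in the last 10 minutes?"))
print(infer_start_hour("agents who edited extension.js this week"))
```

Treat the result as a starting point. If the first search comes back empty, widening the window is cheap.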
Step 2: Delete Old Dump (for "latest" requests)
rm -rf /tmp/agent-dump-conversations
Step 3: Run Dump
See dump-agent-data-for-rg skill for command reference. Quick example:
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
--start-hour 24 --end-hour 0 --agent both
Note the output path from the script output.
About --include-reasoning flag: Generally omit reasoning (disabled by default) because:
- Reasoning is transient - agents exploring possibilities, not final decisions
- Often discarded or revised in final output
- Adds noise when searching for what agents actually did
Only use --include-reasoning when debugging agent decision-making process.
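When the dump is launched from a script, it can help to assemble the command in one place. The sketch below uses only flags that appear in this document (`--start-hour`, `--end-hour`, `--agent`, `--sid`, `--prompt-only`, `--include-reasoning`). The helper name `build_dump_command` is hypothetical, and whether every flag combination is accepted is an assumption, so check the dump-agent-data-for-rg skill before relying on it.

```python
import os
import shlex

DUMP_SCRIPT = os.path.expanduser(
    "~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py"
)


def build_dump_command(start_hour=24, end_hour=0, agent=None, sid_suffix=None,
                       prompt_only=False, include_reasoning=False):
    """Build the argv for the dump script; pass it to subprocess.run to execute."""
    argv = ["python", DUMP_SCRIPT,
            "--start-hour", str(start_hour), "--end-hour", str(end_hour)]
    if agent is not None:
        if agent not in ("codex", "claude", "both"):
            raise ValueError(f"unknown agent: {agent!r}")
        argv += ["--agent", agent]
    if sid_suffix:
        argv += ["--sid", sid_suffix]  # suffix of the SID, not the prefix
    if prompt_only:
        argv.append("--prompt-only")
    if include_reasoning:
        argv.append("--include-reasoning")
    return argv


print(shlex.join(build_dump_command(start_hour=24, agent="both")))
```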
Step 4: Generate Search Keywords
From the user's query, extract key terms and generate variations:
Examples
| User Query | Base Keyword | Variations to Search |
|---|---|---|
| "agents who edited extension.js" | extension.js | "extension\.js", "extension.js", "extensions\.js" |
| "agents who wrote rust code" | rust | "rust", "cargo", "\.rs", "rustc", "cargo build" |
| "agents working on auth" | authentication | "auth", "authentication", "login", "session" |
| "agents who compiled" | compile | "compile", "build", "cargo build", "npm run build" |
Strategy
- Start broad - Use multiple related terms
- Include regex escapes - "extension\.js" (literal dot)
- Try plurals/variants - "agent" → "agents", "compile" → "compiled"
- Domain knowledge - "rust" implies "cargo", ".rs files"
- If too many results - Narrow down with more specific patterns
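The escaping and OR-ing described above can be automated. The following is a minimal sketch, with hypothetical helper names `escape_literal` and `build_rg_command`: it backslash-escapes common regex metacharacters so keywords such as `extension.js` match literally, then emits one `-e` flag per keyword. It searches the dump root directory rather than a glob, because globs are not expanded when the command runs without a shell.

```python
import re
import shlex

# Characters treated as regex metacharacters; everything else is left as-is.
_META = re.compile(r"([.^$*+?{}\[\]\\|()])")


def escape_literal(keyword):
    """Backslash-escape regex metacharacters so the keyword matches literally."""
    return _META.sub(r"\\\1", keyword)


def build_rg_command(keywords, search_path, context=3, ignore_case=True):
    """Build an rg argv that ORs the keywords together via repeated -e flags."""
    argv = ["rg"]
    if ignore_case:
        argv.append("-i")
    for kw in keywords:
        argv += ["-e", escape_literal(kw)]
    argv += ["-C", str(context), search_path]
    return argv


cmd = build_rg_command(["extension.js", "cargo build"], "/tmp/agent-dump-conversations/")
print(shlex.join(cmd))
```

Keywords that are meant to be regexes (for example `edit.*filename`) should bypass `escape_literal`.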
Step 5: Search with rg
# Multiple patterns (OR logic)
rg -e "pattern1" -e "pattern2" -e "pattern3" \
-C 3 \
/tmp/agent-dump-conversations/codex-and-claude-*/
# Case-insensitive when appropriate
rg -i -e "rust" -e "cargo" \
-C 3 \
/tmp/agent-dump-conversations/codex-and-claude-*/
# Count matching files first (to gauge result size before reading content)
rg -l "keyword" /tmp/agent-dump-conversations/codex-and-claude-*/ | wc -l
Flags to use:
- -C 3 or -C 5: Include context lines (essential for understanding what agents were doing)
- -i: Case-insensitive when searching general terms
- -l: List files only (useful to count sessions before reading content)
- -e: Multiple patterns with OR logic
Step 6: Collect and Format Results
For each matching session:
- Extract session info from path: {sid}-{pid} and agent type (codex/claude)
- Read relevant snippets (from rg -C output)
- Note timestamps from conversation.txt headers
- Summarize findings in structured format
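Extracting the session info from a matched path can be sketched as follows. The code assumes the layout implied in this document, an agent directory (`codex` or `claude`) followed by a `{sid}-{pid}` directory. The function name `parse_session_path` and the `codex-and-claude-EXAMPLE` component in the sample path are hypothetical.

```python
from pathlib import PurePosixPath


def parse_session_path(path):
    """Extract (agent, sid, pid) from a path containing <agent>/<sid>-<pid>/."""
    parts = PurePosixPath(path).parts
    # Find the agent directory; the session directory is the next component.
    for i, part in enumerate(parts[:-1]):
        if part in ("codex", "claude"):
            session_dir = parts[i + 1]
            # The SID may itself contain hyphens, so split on the last one only.
            sid, _, pid = session_dir.rpartition("-")
            return part, sid, pid
    raise ValueError(f"no codex/claude session directory in {path!r}")


# Hypothetical path for illustration; real paths come from rg -l output.
example = ("/tmp/agent-dump-conversations/codex-and-claude-EXAMPLE/"
           "codex/019bfbb7-16c8-79e3-82e7-f6818860ca12-0/conversation.txt")
print(parse_session_path(example))
```

Grouping the parsed tuples by agent gives the per-agent session counts for the report summary.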
Suggested report structure:
# Search Results
Query: [user's question]
Time range: [start → end]
Keywords: [what you searched for]
## Summary
- Total sessions: N
- Codex: X sessions
- Claude: Y sessions
## Matches
### Session: codex {sid-suffix} (timestamp)
Path: /tmp/agent-dump-conversations/.../codex/{sid}-{pid}/
Relevant snippet:
[context lines from rg output]
[Repeat for each session, limit to 5-10 most relevant]
Performance Tips
Time Range Selection
- "Latest" queries: 1-6 hours (fast, focused)
- General search: 24 hours (good default)
- Deep search: 72 hours (if user mentions "this week")
- Max practical: 96 hours (beyond this, consider narrowing with --sid)
Agent Filtering
Use --agent codex or --agent claude when user specifies:
- "What did the codex agent do..."
- "Has claude worked on..."
SID Filtering
Use --sid <suffix> when user mentions specific session:
- "Check session abc123..."
- "In that earlier conversation..." (extract sid from context)
Search Pattern Recipes
File Editing Activity
rg -e "edit.*filename" -e "modify.*filename" -e "update.*filename" -C 3
Code Writing by Language
| Language | Patterns |
|---|---|
| Rust | rg -i -e "rust" -e "cargo" -e "\.rs" -e "cargo build" |
| Python | rg -i -e "python" -e "\.py" -e "pip install" -e "pytest" |
| JavaScript | rg -i -e "javascript" -e "\.js" -e "npm" -e "node" |
| TypeScript | rg -i -e "typescript" -e "\.ts" -e "tsc" |
Feature/Task Activity
# Authentication work
rg -i -e "auth" -e "login" -e "session" -e "jwt" -e "oauth"
# Testing activity
rg -i -e "test" -e "pytest" -e "jest" -e "spec" -e "assert"
# Bug fixes
rg -i -e "fix" -e "bug" -e "error" -e "crash" -e "issue"
Build/Compile Activity
rg -i -e "build" -e "compile" -e "cargo build" -e "npm run build" -e "make"
⚠️ IMPORTANT: Direct SQLite Querying Protocol
Agents are strongly discouraged from querying SQLite databases directly, especially for metadata-only queries. Use the dump tool instead.
If You Must Query SQLite Directly
When querying ~/centralized-logs/sqlite-dbs/codex-rounds.sqlite or ~/centralized-logs/sqlite-dbs/claude-rounds.sqlite:
Timeout Rules:
- Maximum timeout: 16 seconds
- On each timeout: Halve the timeout (16s → 8s → 4s → 2s → 1s)
- After multiple timeouts: Stop and report the issue
Background Execution (Recommended):
- Run queries in a background terminal if possible
- Background terminals must also comply with the timeout rule
- Use timeout 16s sqlite3 ... to enforce limits
Example:
# With timeout protection
timeout 16s sqlite3 ~/centralized-logs/sqlite-dbs/codex-rounds.sqlite "SELECT COUNT(*) FROM sessions"
# If timeout occurs, retry with 8s
timeout 8s sqlite3 ~/centralized-logs/sqlite-dbs/codex-rounds.sqlite "SELECT COUNT(*) FROM sessions"
Why this matters: SQLite can lock under concurrent writes. The dump tool is optimized for safe, read-only access to conversation data.
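The halving rule can also be enforced from Python rather than with the `timeout` command. This is a sketch under one reading of the rules above: it walks the full 16s → 1s schedule and then stops with an error to report. The function names are hypothetical, and `subprocess.run` kills the child process when its `timeout` expires.

```python
import subprocess


def timeout_schedule(max_seconds=16):
    """Yield the timeout for each attempt, halving every time (16, 8, 4, 2, 1 by default)."""
    t = max_seconds
    while t >= 1:
        yield t
        t //= 2


def run_with_halving_timeout(argv, max_seconds=16):
    """Run argv, retrying with a halved timeout after each timeout; raise once exhausted."""
    for t in timeout_schedule(max_seconds):
        try:
            return subprocess.run(argv, capture_output=True, text=True,
                                  timeout=t, check=True)
        except subprocess.TimeoutExpired:
            continue  # the child has been killed; retry with a shorter limit
    raise TimeoutError(f"timed out at every limit down to 1s: {argv}")


# Example call (not executed here):
# db = os.path.expanduser("~/centralized-logs/sqlite-dbs/codex-rounds.sqlite")
# run_with_halving_timeout(["sqlite3", db, "SELECT COUNT(*) FROM sessions"])
print(list(timeout_schedule()))
```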
Important Notes
- Always dump first - Don't skip the dump step, even if you think data might exist
- Use -C context flag - Essential for understanding what agents were doing
- Limit results - Report top 5-10 sessions, not all matches
- Include full paths - User may want to read full conversation.txt files
- For "latest" requests - Always regenerate (delete old dump first)
⚠️ CRITICAL: --sid Flag Uses SUFFIX, Not PREFIX
DO NOT make this mistake:
# WRONG - using the PREFIX of the SID
--sid 019bfbb7 # This will return 0 sessions!
# CORRECT - use the SUFFIX (last 5 characters)
--sid 0ca12 # This works!
Example:
- Full SID: 019bfbb7-16c8-79e3-82e7-f6818860ca12
- Correct --sid value: 0ca12 (last 5 chars)
Why this matters: If you use the prefix, the dump tool returns 0 sessions with no error, wasting time debugging a non-existent problem.
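Deriving the suffix programmatically removes the chance of passing the prefix by accident. A minimal sketch follows. The helper name `sid_suffix` is hypothetical, and the default length of 5 follows the example above.

```python
def sid_suffix(full_sid, length=5):
    """Return the trailing characters of a session ID for use with --sid."""
    if len(full_sid) < length:
        raise ValueError(f"SID {full_sid!r} is shorter than {length} characters")
    return full_sid[-length:]


print(sid_suffix("019bfbb7-16c8-79e3-82e7-f6818860ca12"))  # 0ca12
```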
⚠️ CRITICAL: Use --sid Instead of Dumping Everything
When you have a specific SID, use --sid with a long time window:
# GOOD - targeted, fast, comprehensive (all history for that session)
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
--start-hour 720 --end-hour 0 --sid 0ca12
# BAD - dumps ALL sessions then you have to find yours manually
python ~/swe/telemetry_projects/agent_dump/launchers/dump_conversations.py \
--start-hour 24 --end-hour 0 --agent codex
Benefits of --sid approach:
- Low cost: Only dumps data for the specific session
- Comprehensive: Use a very long time window (720h = 30 days) without performance penalty
- Precise: No need to grep through hundreds of sessions