name: deepwiki-skill description: > Generate comprehensive project analysis documentation for code repositories. Use when Claude needs to analyze a codebase and create detailed documentation covering project architecture, component relationships, functionality, code patterns, design decisions, and setup instructions. Uses DeepResearch multi-turn methodology for thorough analysis. Works with any local or remote repository.

DeepWiki Project Analysis

Generate comprehensive, professional documentation for any code repository using systematic analysis methodology inspired by DeepWiki-Open.

When to Use

User asks to "analyze this repository/project/codebase"
User requests "generate documentation for this project"
User wants to "understand how this codebase works"
User needs "project overview and architecture"
User asks "explain the structure of this project"

Output: Multiple Document Collection

This skill generates a COMPREHENSIVE DOCUMENT COLLECTION, not a single report. Each aspect gets its own dedicated document for better organization and maintainability.

Core Analysis Methodology

DeepResearch Multi-Turn Approach

For comprehensive analysis, use a structured multi-turn research process:

Iteration 1: Research Plan

Start with "## Research Plan"
Outline the approach to investigating the topic
Identify key aspects to research
Provide initial findings
End with "## Next Steps"

Iterations 2-4: Progressive Deepening

Start with "## Research Update {N}"
Review conversation history to avoid repetition
Focus on specific aspects needing deeper investigation
Build upon previous findings
Provide new insights not covered before

Iteration 5: Final Conclusion

Start with "## Final Conclusion"
Synthesize ALL findings from previous iterations
Provide comprehensive, definitive answer
Include specific code references and implementation details

Context Retrieval Strategy

When analyzing code:

Start with file path grouping - Group context by file paths for better organization
Prioritize implementation files - Focus on non-test files first
Use targeted queries - Formulate RAG queries to focus on specific aspects
Manage token limits - If context exceeds ~8000 tokens, be selective

Context formatting:

## File Path: path/to/file.py

<code content>

---

## File Path: path/to/another.py

<code content>

Token Management

Guidelines:

Safe threshold: ~7500 tokens for input
Warning: If input exceeds 8000 tokens, reduce context
Fallback: If too large, answer without retrieval augmentation
Prioritize: Core implementation files over tests and docs

Applying DeepResearch by Project Size

For Small Projects (<10k LOC):

Iterations: 2-3 iterations sufficient
Iteration 1: Research Plan - understand overall structure
Iteration 2: Final Conclusion - generate all documentation
Focus: Quick overview, essential documentation only

For Medium Projects (10k-50k LOC):

Iterations: 3-4 iterations recommended
Iteration 1: Research Plan - identify key components
Iteration 2: Deep dive into core modules
Iteration 3: Explore component relationships
Iteration 4: Final Conclusion - comprehensive documentation
Focus: Balanced detail, all core documents

For Large Projects (50k-100k LOC):

Iterations: 4-5 iterations recommended
Iteration 1: Research Plan - architecture overview
Iteration 2: Deep dive into architecture and design patterns
Iteration 3: Explore specific subsystems (e.g., API, UI, data layer)
Iteration 4: Investigate advanced features (testing, deployment, configuration)
Iteration 5: Final Conclusion - full documentation collection
Focus: Detailed analysis with specialized documents

For Very Large Projects (>100k LOC):

Iterations: 5+ iterations with focused deep-dives
Iteration 1: Research Plan - high-level architecture
Iteration 2: Architecture and component relationships
Iteration 3: Core functionality deep-dive (Task, Controller, API Handler)
Iteration 4: Specialized systems (MCP, Tools, Context Management)
Iteration 5: Advanced topics (Testing, Deployment, Configuration, Troubleshooting)
Iteration 6: Final Conclusion - complete 10-document collection
Focus: Maximum detail, extensive cross-references, multiple diagrams
Note: May require multiple conversation sessions due to context limits

When to Use Explore Agent

Use Task tool with subagent_type=Explore when:

Need to understand overall codebase structure
Searching for patterns across many files
Investigating how components interact
Need comprehensive analysis of large code sections

Use direct Grep/Glob/Read when:

Looking for specific file patterns (e.g., all test files)
Searching for specific keywords (e.g., "interface Task")
Reading known specific files
Quick verification tasks

Analysis Framework

Project Size Assessment

Before generating documentation, assess the project size:

Small Project (< 10k LOC):
- Generate: README.md, ARCHITECTURE.md, DEVELOPMENT.md
- Optional: API.md (if applicable)
- Focus: Essential documentation only

Medium Project (10k-50k LOC):
- Generate: All 5 core documents
- Optional: COMPONENTS.md, DATA-MODELS.md
- Focus: Balanced detail level

Large Project (50k-100k LOC):
- Generate: All 5 core documents + at least 2 specialized documents
- Recommended: COMPONENTS.md, DATA-MODELS.md, CONFIGURATION.md
- Focus: Comprehensive coverage

Very Large Project (>100k LOC):
- Generate: All 10 documents (5 core + 5 specialized)
- Required: COMPONENTS.md, DATA-MODELS.md, CONFIGURATION.md, TESTING.md
- Recommended: DEPLOYMENT.md, TROUBLESHOOTING.md
- Focus: Maximum detail with multiple specialized documents

Project Size Indicators:

Lines of code (LOC) - Use find and wc -l to count
Number of files - More files = more complex
Directory depth - Deeper hierarchy = more complex
Number of dependencies - More dependencies = more complex

How to Count LOC

For TypeScript/JavaScript projects:

# Count all TS/JS files (excluding tests)
find src -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" | xargs wc -l | tail -1

# Or use a more comprehensive count
find src -name "*.ts" -o -name "*.tsx" | xargs wc -l

For Python projects:

# Count all Python files in src/
find src -name "*.py" | xargs wc -l | tail -1

For general projects (all source files):

# Count all code files (excluding tests, node_modules, build, etc.)
find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.py" -o -name "*.go" -o -name "*.rs" \) ! -path "*/node_modules/*" ! -path "*/dist/*" ! -path "*/build/*" ! -path "*/test/*" ! -path "*/tests/*" ! -path "*/.next/*" | xargs wc -l | tail -1

Quick Project Size Assessment

Step 1: Count files and estimate LOC

# Count source files
find src -type f -name "*.ts" -o -name "*.tsx" -o -name "*.py" -o -name "*.js" | wc -l

# List directory structure
tree -L 2 -d

Step 2: Check package.json/requirements.txt for dependencies

# Count npm dependencies
cat package.json | jq '.dependencies | length'

# Count Python dependencies
cat requirements.txt | wc -l

Step 3: Make decision based on findings

< 50 files, < 10k LOC → Small project
50-200 files, 10k-50k LOC → Medium project
200-500 files, 50k-100k LOC → Large project
500+ files, >100k LOC → Very large project

Phase 1: Project Discovery (5-15 minutes)

1. Understand Project Scope

- Identify project type (web app, library, API, tool, etc.)
- Determine primary programming languages
- Estimate project size and complexity
- Identify key frameworks and technologies

2. Locate Entry Points

Priority files to examine:
- README.md, CONTRIBUTING.md
- package.json, requirements.txt, pom.xml, go.mod
- main.py, index.js, app.py, main.go
- Dockerfile, docker-compose.yml

3. Initial Structure Assessment

- List top-level directories
- Identify configuration files
- Note build/deployment files
- Check for documentation folder

Phase 2: Architecture Analysis (15-30 minutes)

1. Directory Structure Analysis

For each directory:
- Understand its purpose
- Identify key files
- Note dependencies on other directories
- Document architecture pattern (MVC, layered, microservices, etc.)

2. Component Identification

- List major components/modules
- Map component relationships
- Identify data flow between components
- Note external service integrations

3. Design Pattern Recognition

Common patterns to identify:
- MVC/MVVM
- Repository pattern
- Factory/Builder pattern
- Observer/Pub-Sub
- Dependency Injection
- Middleware pipeline
- Service layer

Phase 3: Deep Code Analysis (30-60 minutes)

1. Core Functionality Analysis

For each major component:
- Understand primary purpose
- Identify key classes/functions
- Trace execution flow
- Note important algorithms
- Document edge cases handled

2. Data Model Analysis

- Identify data structures
- Map entity relationships
- Note data validation rules
- Understand data flow and transformations

3. Dependencies Analysis

External dependencies:
- List major libraries/packages
- Understand their purposes
- Note version constraints

Internal dependencies:
- Map module imports
- Identify circular dependencies
- Note coupling levels

Phase 4: Documentation Generation (30-120 minutes)

Generate a DOCUMENT COLLECTION based on project size:

For All Projects (Core Documents - Always Generate)

README.md - Project overview and quick start
ARCHITECTURE.md - System architecture and design
DEVELOPMENT.md - Development setup and guidelines

Conditional Documents (Generate based on project type)

API.md - API reference (if project has API endpoints or programming interface)
DEPLOYMENT.md - Deployment and operations (if project has deployment process)

For Medium+ Projects (Recommended for 10k+ LOC)

COMPONENTS.md - Component catalog and details
DATA-MODELS.md - Data schemas and relationships

For Large+ Projects (Recommended for 50k+ LOC)

CONFIGURATION.md - Configuration options (if project has extensive configuration)
TESTING.md - Testing strategy and guidelines

For Very Large Projects (Required for 100k+ LOC)

TROUBLESHOOTING.md - Common issues and solutions

Document Generation Decision Matrix:

Project Size → Documents Generated
─────────────────────────────────────
Small (< 10k) → README.md, ARCHITECTURE.md, DEVELOPMENT.md (+ API.md if applicable)
Medium (10k-50k) → Core 5 docs: README, ARCHITECTURE, API (if exists), DEVELOPMENT, DEPLOYMENT
              → Add: COMPONENTS.md, DATA-MODELS.md
Large (50k-100k) → Core 5 docs + COMPONENTS.md, DATA-MODELS.md
              → Add: CONFIGURATION.md, TESTING.md
Very Large (>100k) → All 10 documents (5 core + 5 specialized)

Documentation Best Practices for Large Projects:

Use Progressive Disclosure - Start with overview, then dive deeper
Cross-Reference Extensively - Link related documents
Include Code Examples - Show actual usage patterns
Add Mermaid Diagrams - Visualize architecture and relationships
Provide Multiple Levels - Overview → Detailed → Reference

Each document is a separate markdown file, cross-referenced where appropriate.

Prompt Engineering Patterns

First Iteration Prompt Template

You are an expert code analyst examining the {repo_type} repository: {repo_url}.
You are conducting a multi-turn Deep Research process to thoroughly investigate the specific topic in the user's query.
Your goal is to provide detailed, focused information EXCLUSIVELY about this topic.

<guidelines>
- This is the first iteration of a multi-turn research process
- Start your response with "## Research Plan"
- Outline your approach to investigating this specific topic
- Identify the key aspects you'll need to research
- Provide initial findings based on the information available
- End with "## Next Steps" indicating what you'll investigate next
- Do NOT provide a final conclusion yet
- Focus EXCLUSIVELY on the specific topic being researched

Intermediate Iteration Prompt Template

You are in iteration {N} of a Deep Research process focused EXCLUSIVELY on the user's query.
Your goal is to build upon previous research iterations and go deeper into this specific topic.

<guidelines>
- CAREFULLY review the conversation history to understand what has been researched
- Your response MUST build on previous research - do not repeat information
- Identify gaps or areas that need further exploration
- Focus on one specific aspect that needs deeper investigation
- Start your response with "## Research Update {N}"
- Provide new insights that weren't covered in previous iterations

Final Iteration Prompt Template

You are in the final iteration of a Deep Research process.
Your goal is to synthesize all previous findings and provide a comprehensive conclusion.

<guidelines>
- CAREFULLY review the entire conversation history
- Synthesize ALL findings from previous iterations
- Start with "## Final Conclusion"
- Your conclusion MUST directly address the original question
- Include specific code references and implementation details
- Provide a complete and definitive answer to the original question

Code Reading Strategy

Top-Down Approach:

1. Start with README and documentation
2. Examine configuration files
3. Look at main entry points
4. Study directory structure
5. Dive into specific modules
6. Trace execution flows

What to Focus On:

Priority order:
- Public APIs and interfaces
- Core business logic
- Data models and schemas
- Configuration and setup
- Utilities and helpers
- Tests (for usage examples)

Pattern Recognition

Web Application Indicators:

- package.json with React/Vue/Angular
- static/ or public/ directory
- routes/, controllers/, views/
- middleware/, templates/

API Service Indicators:

- api/, routes/ directories
- FastAPI, Express, Flask imports
- OpenAPI/Swagger specs
- HTTP method decorators

Library/Package Indicators:

- lib/, src/ directories
- setup.py, pyproject.toml
- minimal main/entry point
- extensive __init__.py files

Microservice Indicators:

- docker-compose.yml
- service-specific directories
- API gateway or proxy
- message queue integration

Document Generation Structure

Document Collection Organization

Create a docs/ directory with separate markdown files:

docs/
├── README.md                 # Project overview (entry point)
├── ARCHITECTURE.md           # System architecture
├── API.md                    # API documentation
├── COMPONENTS.md             # Component catalog
├── DATA-MODELS.md            # Data schemas
├── DEVELOPMENT.md            # Development guide
├── DEPLOYMENT.md             # Deployment guide
├── CONFIGURATION.md          # Configuration
├── TESTING.md                # Testing guide
└── TROUBLESHOOTING.md        # Common issues

Cross-References

Link between documents:

See [ARCHITECTURE.md](ARCHITECTURE.md) for system design.
See [API.md](API.md) for endpoint documentation.
See [DEVELOPMENT.md](DEVELOPMENT.md) for setup instructions.

Document Templates

Use templates from assets/document_templates/ for each document type.

Core Documents (5 templates):

README.md - Project overview template
ARCHITECTURE.md - Architecture template
API.md - API documentation template
DEVELOPMENT.md - Development guide template
DEPLOYMENT.md - Deployment guide template

Specialized Documents (5 templates):

COMPONENTS.md - Component catalog template
DATA-MODELS.md - Data models and schemas template
CONFIGURATION.md - Configuration guide template
TESTING.md - Testing guide template
TROUBLESHOOTING.md - Troubleshooting guide template

Template Usage Guidelines

For Small Projects (<10k LOC):

Use core templates: README, ARCHITECTURE, DEVELOPMENT
Add API template if project has API endpoints
Keep documentation concise and focused

For Medium Projects (10k-50k LOC):

Use all 5 core templates
Add COMPONENTS and DATA-MODELS for better organization
Include moderate level of detail

For Large Projects (50k-100k LOC):

Use all 5 core templates + COMPONENTS + DATA-MODELS
Add CONFIGURATION and TESTING for comprehensive coverage
Include detailed explanations with multiple code examples

For Very Large Projects (>100k LOC):

Use all 10 templates (5 core + 5 specialized)
Maximum detail with extensive cross-references
Include multiple levels: overview → detailed → reference
Add Mermaid diagrams for visualization

Diagram Guidelines

Create Mermaid diagrams for:

System architecture (graph TD)
Component relationships (graph LR)
Data flow (sequenceDiagram)
Entity relationships (erDiagram)
State machines (stateDiagram-v2)

Diagram Best Practices:

Keep diagrams simple (max 10-15 nodes)
Use consistent styling
Label edges clearly
Group related components
Use subgraphs for layers

Handling Different Project Types

Web Applications

Focus on:
- Frontend framework and routing
- Backend API structure
- Database schema
- Authentication/authorization
- State management

API Services

Focus on:
- Endpoint definitions
- Request/response models
- Middleware and validation
- Authentication
- Rate limiting

Libraries/Packages

Focus on:
- Public API surface
- Core functionality
- Usage examples
- Configuration options
- Extension points

Data Projects

Focus on:
- Data pipeline stages
- Processing logic
- Storage schemas
- ETL/ELT flows
- Quality checks

Common Pitfalls to Avoid

Don't:

List every file exhaustively
Include generated/minified files
Get lost in implementation details
Ignore configuration files
Skip the README
Assume without verifying

Do:

Focus on understanding over listing
Identify patterns and conventions
Trace key execution paths
Document non-obvious decisions
Cross-reference with documentation
Verify assumptions with code

Quality Checklist

Before finalizing documentation, ensure:

Reference Resources

Analysis Framework: references/analysis_framework.md - Detailed analysis methodology
Code Patterns: references/code_patterns.md - Common patterns recognition
Documentation Standards: references/documentation_standards.md - Writing guidelines
Document Templates: assets/document_templates/ - Document-specific templates

Tips for Effective Analysis

Speed vs Depth:

Quick overview: 30-60 minutes
Detailed analysis: 2-4 hours
Comprehensive: 4-8 hours

When to Deep Dive:

Complex business logic
Unusual architecture
Critical algorithms
Security-sensitive code

When to Stay High-Level:

Standard patterns
Boilerplate code
Well-documented libraries
Configuration files

When to Use DeepResearch:

User requests comprehensive analysis
Topic requires multiple aspects to be covered
Question is complex or multi-faceted
User asks for "deep dive" or "thorough analysis"

Working with Large Projects

For large projects (>1000 files or >100k LOC):

Strategy 1: Modular Deep-Dive

First Pass - Architecture Overview
- Use Explore agent to understand overall structure
- Identify major subsystems and components
- Map out component relationships
- Generate ARCHITECTURE.md first
Second Pass - Core Functionality
- Focus on main entry points (e.g., main.py, index.ts, app.py)
- Trace request flow from entry point to completion
- Identify core business logic
- Generate COMPONENTS.md for core components
Third Pass - Specialized Systems
- Investigate specific subsystems based on project type:
  - API projects: endpoints, middleware, authentication
  - Web apps: routing, state management, data flow
  - Libraries: public API, extension points, examples
- Generate specialized documents (API.md, DATA-MODELS.md, etc.)
Fourth Pass - Supporting Systems
- Configuration management
- Testing infrastructure
- Deployment setup
- Troubleshooting guides

Strategy 2: Layer-by-Layer Analysis