Agent Skill
2/7/2026

claude-code

Use when users ask about Claude Code features, setup, configuration, troubleshooting, slash commands, MCP servers, Agent Skills, hooks, plugins, CI/CD integration, or enterprise deployment. Activate for questions like 'How do I use Claude Code?', 'What slash commands are available?', 'How to set up MCP?', 'Create a skill', 'Fix Claude Code issues', or 'Deploy Claude Code in enterprise'.

khang61
npx skills add Khang61/data-crawler-bot

SKILL.md

Name: claude-code
Description: Use when users ask about Claude Code features, setup, configuration, troubleshooting, slash commands, MCP servers, Agent Skills, hooks, plugins, CI/CD integration, or enterprise deployment. Activate for questions like 'How do I use Claude Code?', 'What slash commands are available?', 'How to set up MCP?', 'Create a skill', 'Fix Claude Code issues', or 'Deploy Claude Code in enterprise'.

Facebook Group Crawler

MVP crawler to extract posts and comments from Facebook Groups you've joined.

Features

  • Cookie-based session persistence (login once, reuse session)
  • Anti-detection measures (random delays, user-agent rotation)
  • Extracts: posts, authors, timestamps, media URLs, reactions, comments
  • Export to JSON (full data) and CSV (flat structure)

Requirements

  • Python 3.11+
  • Facebook account with access to target groups

Quick Start

1. Setup Environment

# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (Linux/Mac)
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Install Playwright browser
playwright install chromium

2. Configure

# Copy example config
cp .env.example .env

# Edit .env with your credentials

.env contents:

FB_EMAIL=your_email@example.com
FB_PASSWORD=your_password
TARGET_GROUP_URL=https://www.facebook.com/groups/your-group-id
HEADLESS=false
TARGET_POSTS=20
MAX_SCROLL_ATTEMPTS=50
REQUEST_DELAY_MIN=2
REQUEST_DELAY_MAX=5
MAX_COMMENT_EXPANDS=20
MAX_COMMENTS_EXTRACT=100
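
As an illustration of how these variables might be consumed, here is a minimal sketch of a settings loader. It assumes the values have already been exported into the process environment (for example by a dotenv loader, which this README does not confirm). The Settings class and the load_settings and parse_bool helpers are hypothetical names, not the project's actual config.py, and the fallback values are the defaults listed under Configuration Options.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables shown in .env."""
    fb_email: str
    fb_password: str
    target_group_url: str
    headless: bool
    target_posts: int
    max_scroll_attempts: int
    request_delay_min: float
    request_delay_max: float
    max_comment_expands: int
    max_comments_extract: int


def parse_bool(value: str) -> bool:
    """Treat common truthy spellings as True; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    settings = Settings(
        fb_email=env.get("FB_EMAIL", ""),
        fb_password=env.get("FB_PASSWORD", ""),
        target_group_url=env.get("TARGET_GROUP_URL", ""),
        headless=parse_bool(env.get("HEADLESS", "false")),
        target_posts=int(env.get("TARGET_POSTS", "20")),
        max_scroll_attempts=int(env.get("MAX_SCROLL_ATTEMPTS", "50")),
        request_delay_min=float(env.get("REQUEST_DELAY_MIN", "2")),
        request_delay_max=float(env.get("REQUEST_DELAY_MAX", "5")),
        max_comment_expands=int(env.get("MAX_COMMENT_EXPANDS", "2")),
        max_comments_extract=int(env.get("MAX_COMMENTS_EXTRACT", "50")),
    )
    # A minimum above the maximum would make the delay range empty.
    if settings.request_delay_min > settings.request_delay_max:
        raise ValueError("REQUEST_DELAY_MIN must not exceed REQUEST_DELAY_MAX")
    return settings
```

Passing a plain dict instead of os.environ keeps the loader easy to exercise in isolation, e.g. load_settings({"TARGET_POSTS": "7"}).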

3. Run

python main.py

On first run:

  1. Browser opens Facebook login page
  2. Credentials are auto-filled; click the login button
  3. Complete 2FA if prompted (120s timeout)
  4. Cookies saved for future runs
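
The cookie reuse in step 4 could be sketched roughly as follows. This is an assumption about the general approach, not the project's actual auth.py: save_cookies and load_cookies are hypothetical helpers that store cookies as a JSON list of dicts, which is the shape Playwright's BrowserContext.cookies() returns and BrowserContext.add_cookies() accepts.

```python
import json
from pathlib import Path


def save_cookies(cookies: list[dict], path: Path) -> None:
    """Write the cookie list to disk, creating the folder if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cookies, indent=2), encoding="utf-8")


def load_cookies(path: Path) -> list[dict]:
    """Return saved cookies, or an empty list if the file is missing or invalid."""
    if not path.exists():
        return []
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return []
    return data if isinstance(data, list) else []
```

With a Playwright browser context named context, one would call save_cookies(context.cookies(), path) after a successful login and context.add_cookies(load_cookies(path)) before navigating on later runs. An empty result from load_cookies signals that a fresh login is needed.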

4. Output

Files saved to data/output/:

  • posts_YYYYMMDD_HHMMSS.json - Full data with comments
  • posts_YYYYMMDD_HHMMSS.csv - Flat post data
  • comments_YYYYMMDD_HHMMSS.csv - Comments with post_id reference

Project Structure

fbbot/
├── src/
│   ├── config.py       # Settings, logging
│   ├── browser.py      # Playwright management
│   ├── auth.py         # FB login, cookies
│   ├── crawler.py      # Group navigation, scrolling
│   ├── extractor.py    # HTML parsing
│   ├── exporter.py     # JSON/CSV export
│   └── models.py       # Post, Comment dataclasses
├── data/
│   ├── cookies/        # Session cookies (gitignored)
│   └── output/         # Crawled data (gitignored)
├── main.py
├── requirements.txt
└── .env.example

Configuration Options

Variable               Default   Description
FB_EMAIL               -         Facebook login email
FB_PASSWORD            -         Facebook password
TARGET_GROUP_URL       -         Group URL to crawl
HEADLESS               false     Run browser headless
TARGET_POSTS           20        Number of posts to collect (scroll until reached)
MAX_SCROLL_ATTEMPTS    50        Maximum scroll attempts to prevent infinite loop
REQUEST_DELAY_MIN      2         Min delay between actions (sec)
REQUEST_DELAY_MAX      5         Max delay between actions (sec)
MAX_COMMENT_EXPANDS    2         Comment expansion clicks (0=top only, -1=all)
MAX_COMMENTS_EXTRACT   50        Max comments to extract per post
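
The interplay of TARGET_POSTS, MAX_SCROLL_ATTEMPTS and MAX_COMMENT_EXPANDS can be illustrated with a small sketch. It is one plausible reading of the descriptions above, not the project's crawler.py: collect_posts, may_expand and the scroll_once callback (assumed to return the post IDs visible after one scroll) are hypothetical.

```python
from typing import Callable, Iterable


def collect_posts(
    scroll_once: Callable[[], Iterable[str]],
    target_posts: int = 20,
    max_scroll_attempts: int = 50,
) -> list[str]:
    """Scroll until target_posts unique IDs are seen or attempts run out."""
    seen: dict[str, None] = {}  # dict keeps insertion order and drops duplicates
    for _ in range(max_scroll_attempts):
        for post_id in scroll_once():
            seen.setdefault(post_id, None)
        if len(seen) >= target_posts:
            break  # target reached; no need to keep scrolling
    return list(seen)[:target_posts]


def may_expand(clicks_done: int, max_comment_expands: int = 2) -> bool:
    """0 means never expand (top comments only); -1 means no limit."""
    if max_comment_expands < 0:
        return True
    return clicks_done < max_comment_expands
```

The attempt cap is what guarantees termination when a group has fewer posts than TARGET_POSTS: the loop then returns whatever it found instead of scrolling forever.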

Troubleshooting

Login fails:

  • Check credentials in .env
  • Try HEADLESS=false to see browser
  • Complete 2FA manually if prompted

No posts extracted:

  • Facebook may have changed DOM selectors
  • Check data/crawler.log for errors
  • Try increasing TARGET_POSTS or MAX_SCROLL_ATTEMPTS

Account blocked:

  • Use longer delays (e.g. REQUEST_DELAY_MIN=5, with REQUEST_DELAY_MAX raised above it so the delay stays random)
  • Don't run too frequently
  • Use your real account (not fake)
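
For reference, a randomized pause bounded by REQUEST_DELAY_MIN and REQUEST_DELAY_MAX might look like the sketch below. The function name polite_delay and the injectable sleep parameter are illustrative assumptions; the README does not show how the project implements its delays.

```python
import random
import time


def polite_delay(min_s: float = 2.0, max_s: float = 5.0, sleep=time.sleep) -> float:
    """Pause for a random duration in [min_s, max_s] and return the time waited."""
    if min_s < 0 or max_s < min_s:
        raise ValueError("need 0 <= min_s <= max_s")
    wait = random.uniform(min_s, max_s)
    sleep(wait)  # injectable so the pause can be skipped in tests
    return wait
```

Calling polite_delay(5, 10) between page actions spaces requests further apart than the defaults do.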

Legal Notice

This tool is for personal/research use only. Crawling Facebook may violate their Terms of Service. Use at your own risk. The authors are not responsible for any consequences.

License

MIT

Skills Info

Original Name: claude-code
Author: khang61