Agent Skill
2/7/2026

simple-pdf-skill

PDF processing skill for creating, editing, extracting, and merging PDFs using Python libraries.

aiguozhi123456
npx skills add aiguozhi123456/simple-PDF-skill

SKILL.md

name: simple-pdf-skill
description: PDF processing skill for creating, editing, extracting, and merging PDFs using Python libraries.

Use when:

  • Creating new PDFs from data (reports, invoices, receipts)
  • Editing existing PDFs (adding highlights, annotations, watermarks, images)
  • Extracting text, tables, or images from PDFs
  • Merging, splitting, or manipulating PDF pages
  • Adding Chinese text to PDFs (requires font registration)
  • Batch processing PDF documents
  • Converting PDFs to images
  • Password protection or encryption

NOT suitable for:

  • Complex HTML/CSS to PDF conversion (use WeasyPrint or pdfkit)
  • Rich media/interactive PDFs (use commercial libraries)
  • High-performance vector graphics

Simple PDF Skill

Quick guide for PDF processing using Python libraries.

Library Selection Guide

Choose the right library based on your task:

Task                 | Library                 | Guide
---------------------|-------------------------|--------------------
Create new PDFs      | reportlab               | reportlab-guide.md
Edit existing PDFs   | PyMuPDF (fitz)          | pymupdf-guide.md
Extract text/tables  | pdfplumber or PyMuPDF   | pdfplumber-guide.md
Merge/split PDFs     | PyMuPDF or pypdf        | pymupdf-guide.md
Add annotations      | PyMuPDF                 | pymupdf-guide.md
Extract images       | PyMuPDF                 | pymupdf-guide.md
Render to images     | pypdfium2               | pypdfium2-guide.md
Password protection  | pypdf                   | pypdf-guide.md
Generate charts      | matplotlib + reportlab  | chart-guide.md

Quick Start Workflow

1. Identify the Task Type

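Creating PDFs:

  • Use reportlab to generate new documents (reports, invoices, receipts)
  • For charts, combine matplotlib with reportlab
  • See reportlab-guide.md and chart-guide.md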

Editing Existing PDFs:

  • Use PyMuPDF (fitz) for any modifications
  • Common edits: highlights, annotations, watermarks, merging, splitting
  • See pymupdf-guide.md

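Extracting Content:

  • Use pdfplumber or PyMuPDF for text and tables
  • Use PyMuPDF for embedded images
  • See pdfplumber-guide.md and pymupdf-guide.md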

2. Special Considerations

Chinese Text Support:

  • CRITICAL: reportlab's built-in fonts (Helvetica, Times, Courier) have no Chinese glyphs
  • Register a Chinese font before drawing any Chinese text in reportlab
  • See reportlab-guide.md → Chinese Font Support section
  • Recommended fonts: WQY Microhei (4.4MB), Noto Sans SC (15MB)

Performance:

  • For large PDFs, process in chunks
  • Use fitz (PyMuPDF) for best performance on editing tasks
  • Use pdfplumber for reliable text extraction
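
Chunked processing can be as simple as iterating over page-index ranges. The sketch below uses only the standard library; `page_chunks` is a hypothetical helper, and the chunk size is something to tune for your documents.

```python
from typing import Iterator


def page_chunks(total_pages: int, chunk_size: int) -> Iterator[range]:
    """Yield consecutive ranges of 0-based page indices, each at most chunk_size long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_pages, chunk_size):
        yield range(start, min(start + chunk_size, total_pages))


# Example use with any library that exposes a page count and page indexing:
#   for chunk in page_chunks(page_count, 50):
#       for index in chunk:
#           ...  # load, process, and release page `index`
```

Processing one range at a time lets you write intermediate results and release page objects between chunks instead of holding the whole document's data in memory.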

3. Implementation Reference

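For implementation patterns and examples, see the per-library guides:

  • reportlab-guide.md
  • pymupdf-guide.md
  • pdfplumber-guide.md
  • pypdfium2-guide.md
  • pypdf-guide.md
  • chart-guide.md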

Installation

Install required libraries:

pip install reportlab
pip install pymupdf
pip install pdfplumber
pip install pypdf
pip install pypdfium2
pip install fonttools  # For TTF font extraction
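
Several of these packages are imported under a different name than the one passed to pip (for example, pymupdf is imported as `fitz` and fonttools as `fontTools`). A small check like the following sketch, which uses only the standard library, can report which ones are missing; the `IMPORT_NAMES` mapping and `missing_packages` helper are illustrative names.

```python
from importlib.util import find_spec

# pip package name -> top-level module name used in `import`
IMPORT_NAMES = {
    "reportlab": "reportlab",
    "pymupdf": "fitz",
    "pdfplumber": "pdfplumber",
    "pypdf": "pypdf",
    "pypdfium2": "pypdfium2",
    "fonttools": "fontTools",
}


def missing_packages(names: dict[str, str] = IMPORT_NAMES) -> list[str]:
    """Return the pip names whose module cannot be found in this environment."""
    return [pkg for pkg, module in names.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All PDF libraries are installed.")
```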

For advanced features (OCR, CLI tools):

# For OCR (scanned PDFs)
pip install pytesseract pdf2image
sudo apt-get install tesseract-ocr

# For command-line tools
sudo apt-get install poppler-utils
sudo apt-get install qpdf

Key Rules

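  • reportlab: Canvas origin (0,0) is at the bottom-left; coordinates and font sizes are plain numbers in points (1/72 inch). Use the inch, cm, or mm constants from reportlab.lib.units for positioning
  • PyMuPDF: Colors are RGB tuples of floats in the 0-1 range, not 0-255 integers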
  • Always: Call save() to finalize documents, close documents to free resources
  • Chinese fonts: ALWAYS register Chinese fonts before using Chinese text in reportlab
  • Large PDFs: Process in chunks to avoid memory issues
  • Encrypted PDFs: Handle gracefully with proper password management
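
Two of these rules come down to small conversions that are easy to get wrong. The helpers below are a standard-library sketch with hypothetical names: one converts 0-255 color values to the 0-1 floats PyMuPDF expects, the other converts a distance measured from the top of the page into a bottom-left-origin y coordinate for reportlab.

```python
def rgb255_to_unit(r: int, g: int, b: int) -> tuple[float, float, float]:
    """Convert 0-255 RGB integers to a tuple of floats in the 0-1 range."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError(f"RGB component out of range: {value}")
    return (r / 255, g / 255, b / 255)


def y_from_top(distance_from_top: float, page_height: float) -> float:
    """Convert a top-down distance (in points) to a bottom-left-origin y value."""
    return page_height - distance_from_top
```

For example, `rgb255_to_unit(255, 0, 0)` gives `(1.0, 0.0, 0.0)`, and on a page 842 points tall, a line 72 points below the top edge sits at `y_from_top(72, 842)`, which is 770.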