Agent Skill
2/7/2026

slide-generator

This skill generates professional presentation slides from any content using Google's Gemini image generation API. Use this skill when users ask to create slides, presentations, slide decks, or visual summaries of content. It analyzes source material, designs cohesive narratives with full slide content, and generates visually coherent slide images.

L
lfiaschi
0GitHub Stars
1Views
npx skills add lfiaschi/superskills

SKILL.md

Nameslide-generator
DescriptionThis skill generates professional presentation slides from any content using Google's Gemini image generation API. Use this skill when users ask to create slides, presentations, slide decks, or visual summaries of content. It analyzes source material, designs cohesive narratives with full slide content, and generates visually coherent slide images.

name: slide-generator description: This skill generates professional presentation slides from any content using Google's Gemini image generation API. Use this skill when users ask to create slides, presentations, slide decks, or visual summaries of content. It analyzes source material, designs cohesive narratives with full slide content, and generates visually coherent slide images.

Slide Generator

Generate professional slides from any content using a two-pass approach: story first, then visual design.

Before You Start

# Verify API key is set
[ -n "$GEMINI_API_KEY" ] && echo "Ready" || echo "Set GEMINI_API_KEY first"

If not set, tell the user to run export GEMINI_API_KEY='...' and stop.


Architecture

This skill uses two specialized agents in sequence, with context discovery upfront:

┌──────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                                          │
│   INPUT            CLARIFY           PASS 1             PASS 2             OUTPUT        │
│   ─────            ───────           ──────             ──────             ──────        │
│                                                                                          │
│   Source      ┌───────────┐    ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│   Material    │  Context  │    │ Story Agent  │   │ Design Agent │   │   Gemini     │   │
│   ─────────►  │  Questions│ ─► │              │ ─►│              │ ─►│   Image      │   │
│               │  (≤5)     │    │ Storyboard & │   │ Visual       │   │   Generation │   │
│               └───────────┘    │ Content      │   │ Design       │   │              │   │
│                    │           └──────────────┘   └──────────────┘   └──────────────┘   │
│                    ▼                  │                  │                  │           │
│               [User Input]       story.md         design_system.md      slides/         │
│                                                   visual_specs.md       ├─ slide_01.png │
│                                       │           prompts.json          └─ slides.pdf  │
│                                       ▼                  │                              │
│                                [User Review]      [User Review]                         │
│                                                                                          │
└──────────────────────────────────────────────────────────────────────────────────────────┘

Why this flow?

  • Context first — Understanding goals shapes everything else
  • Separation of concerns — Each agent does one thing well
  • Better iteration — Change the story without redoing visuals, or restyle without rewriting

The Process

0. CLARIFY  →  1. STORY AGENT  →  2. REVIEW  →  3. DESIGN AGENT  →  4. REVIEW  →  5. GENERATE  →  6. REVIEW
   ───────      ───────────        ──────        ────────────        ──────        ────────        ──────
   Context      Storyboard &       User          Visual system       User          Create          Fix
   questions    content            approves      & specifications    approves      images          issues
   (≤5)                            story                             design

0. Context Discovery

Goal: Understand the presentation context before creating content.

After receiving source material, do a quick scan to understand what's there, then ask the user up to 5 clarifying questions about context.

What to Scan For

Quickly identify:

  • What type of content is this? (data, research, proposal, report, etc.)
  • What's the apparent subject matter?
  • How much detail/data is available?

Questions to Ask

Ask only what you need — skip questions where the answer is obvious from context. Maximum 5 questions total.

Question bank (choose relevant ones):

CategoryQuestionWhy It Matters
Event/Setting"Where will this be presented? (conference talk, board meeting, team sync, async document, etc.)"Formality, depth, tone
Goal"What's the primary goal? (inform/educate, persuade/sell, enable a decision, document for reference)"Shapes the entire structure
Presentation Mode"Will this be presented live or read independently?"Live = bold points, speaker fills gaps. Read = detailed, self-contained
Density"Should slides be dense with details or minimal with key points?"Affects how much content per slide
Audience"Who's the audience and what do they already know about this topic?"Technical depth, context needed
Constraints"Any constraints? (number of slides, time limit, must-include elements)"Hard requirements
Tone"What tone fits best? (formal, conversational, urgent, neutral)"Voice throughout

Question Format

Before I create the storyboard, I have a few questions:

1. **[Question]**
   *If you skip this, I'll assume: [default]*

2. **[Question]**
   *If you skip this, I'll assume: [default]*

[up to 5 questions]

Feel free to answer briefly or skip any — I'll use sensible defaults.

Defaults (When User Doesn't Answer)

QuestionDefault Assumption
Event/SettingProfessional meeting
GoalInform/educate
Presentation ModePresented live
DensityBalanced (not too dense, not too sparse)
AudienceKnowledgeable but needs context
Constraints~10-15 slides
ToneProfessional, neutral

Wait for user response before proceeding to Story Agent.


1. Story Agent: Storyboard & Content

Goal: Transform source material into a well-structured storyboard, tailored to the context from Step 0.

Launch the agent:

Use the Task tool to spawn the story-agent subagent:

  • subagent_type: Reference the story-agent defined in agents/story_agent.md
  • prompt: Include:
    1. The user's source material
    2. The context answers from Step 0 (event, goal, mode, density, audience, constraints, tone)
    3. Instruction to save output to story.md

What the Story Agent does:

  1. Evidence extraction — Inventories all data, facts, and claims from the source
  2. Structure selection — Chooses approach based on the GOAL (inform → explanatory, persuade → argumentative, decide → comparative, document → comprehensive)
  3. Slide content — Writes each slide with appropriate density for the MODE
  4. Quality checks — Ensures content matches the stated goal and audience

How context shapes the output:

ContextImpact on Storyboard
Goal: InformNeutral presentation of facts, no "call to action," balanced perspective
Goal: PersuadeClear thesis, supporting evidence, addresses objections, ends with ask
Goal: DecideOptions laid out fairly, pros/cons, recommendation with reasoning
Goal: DocumentComprehensive, self-contained, detailed for future reference
Mode: PresentedBold headlines, 3-4 points max per slide, speaker notes carry detail
Mode: ReadMore text per slide, self-explanatory, annotations and context included
Density: HighData-rich, tables, detailed breakdowns
Density: LowKey messages only, one idea per slide

Story Agent output: story.md

# [Presentation Title]

## Context
- Event: [from user]
- Goal: [from user]
- Mode: [presented/read]
- Density: [high/balanced/low]
- Audience: [from user]
- Constraints: [from user]
- Tone: [from user]

## Executive Summary
[What this deck covers and why]

## Evidence Inventory
[Data and facts extracted from source]

## Structure
[Chosen approach and why it fits the goal]

## Slides
[All slides with: title, content, data points, speaker notes (if presented mode)]

Key constraint: The Story Agent writes NO visual instructions. It may note inherently visual content ("this is a comparison of 4 options") but doesn't prescribe how to show it.


2. Story Review: User Gate

This is a hard stop. Present the story to the user before proceeding.

## Story Ready for Review

I've developed a [N]-slide narrative with the following arc:

**Thesis:** [Core message in one sentence]
**Arc:** [Structure description]
**The Turn:** Slide [N] — [What shifts]
**The Peak:** Slide [N] — [Highest impact moment]

### Slide Sequence
1. [Slide 1 title] — [Purpose]
2. [Slide 2 title] — [Purpose]
...

[FULL STORY.MD CONTENT]

---

**Please review and let me know:**
1. Approve → I'll proceed to visual design
2. Content changes → Tell me what to adjust
3. Restructure → Tell me what's not working with the narrative

Do not proceed to Design Agent until user explicitly approves the story.


3. Design Agent: Visual System & Specifications

Goal: Transform approved content into detailed visual specifications.

Launch the agent:

Use the Task tool to spawn the design-agent subagent:

  • subagent_type: Reference the design-agent defined in agents/design_agent.md
  • prompt: Include:
    1. The approved story.md content
    2. Instruction to save outputs to:
      • design_system.md (visual language)
      • visual_specs.md (per-slide specifications)
      • prompts.json (ready for generation)

What the Design Agent does:

  1. Design system — Defines colors (with semantic meaning), typography personality, spacing, visual elements, signature details
  2. Per-slide design — Information hierarchy, visual concept/metaphor, composition, detailed specification
  3. Visual rhythm — Plans variety (Heavy/Medium/Light/Dramatic), ensures pacing
  4. Prompt construction — Creates generation-ready prompts with design system embedded

Design Agent outputs:

design_system.md

## Design System

### Mood & Personality
[Visual feeling description]

### Color Palette
- Primary Background: [hex + usage]
- Accent 1: [hex + semantic meaning]
- ...

### Typography
[Personality descriptions, not font names]

### Visual Elements
[Shapes, lines, icons, illustration style]

### Signature Elements
[1-2 distinctive details]

visual_specs.md

## Slide [N]: [Title]

### Information Hierarchy
1. [Primary]
2. [Secondary]
3. [Tertiary]

### Visual Concept
[Why this approach serves the content]

### Visual Specification
[Detailed description: composition, elements, colors, spacing, micro-details]

### Visual Weight
[Heavy/Medium/Light/Dramatic]

prompts.json

[
  {
    "name": "slide_01_title",
    "prompt": "Create a [MOOD] presentation slide, 16:9 aspect ratio, 4K resolution.\n\nDESIGN LANGUAGE:\n[Design system summary with hex codes]\n\nCONTENT:\nTitle: \"[Exact title]\"\n[Other content]\n\nVISUAL COMPOSITION:\n[Detailed specification]\n\nKEY DETAILS:\n[Critical elements]"
  }
]

Key principle: Each prompt carries the design system DNA (colors, typography, mood) so visual consistency is maintained even without reference images — but reference images provide additional reinforcement.


4. Design Review: User Gate

This is a hard stop. Present the visual approach to the user before generating.

## Visual Design Ready for Review

I've designed a visual system for [N] slides:

**Mood:** [Description]
**Color Palette:** [Key colors and their meanings]
**Visual Rhythm:** [How the deck flows]

### Design System Summary
[Key elements from design_system.md]

### Slide Designs

**Slide 1: [Title]**
- Visual weight: [Heavy/Medium/Light/Dramatic]
- Concept: [Brief description of approach]

**Slide 2: [Title]**
- Visual weight: [...]
- Concept: [...]

[Continue for all slides]

### Full Specifications
[VISUAL_SPECS.MD CONTENT]

---

**Please review and let me know:**
1. Approve → I'll generate the slide images
2. Specific changes → Tell me which slides to redesign
3. System changes → Tell me what to adjust in the overall visual approach

Do not proceed to Generate until user explicitly approves the design.


5. Generate: Create Slide Images

Prerequisites

Ensure you have:

  • prompts.json from the Design Agent
  • GEMINI_API_KEY environment variable set

Generate Slides

python scripts/generate_slides.py \
  --api-key "$GEMINI_API_KEY" \
  --prompts-file prompts.json \
  --output-dir ./slides \
  --model nano-banana-2 \
  --reference-strategy anchor

Model selection:

  • nano-banana-2 — Pro-level quality at flash speed, half the cost of pro (recommended default)
  • pro — Highest fidelity, best for text-heavy slides requiring maximum accuracy
  • flash — Cheapest option, use for quick drafts only

Reference strategy:

  • anchor — Slide 1 always included + previous slide (RECOMMENDED)
  • progressive — Each slide references only previous (causes drift — avoid)
  • none — No references (each slide independent)

The anchor strategy ensures visual consistency:

  • Slide 1: No reference (establishes the visual identity from the prompt)
  • Slide 2: References slide 1
  • Slide 3+: References slide 1 (style anchor) + previous slide (layout continuity)

Assemble PDF (Optional)

python scripts/generate_slides.py \
  --output-dir ./slides \
  --assemble

6. Review: Quality Check & Iteration

Visual Quality Checks

CheckWhat to Look ForFix
Text renderingIs all text legible and correct?Regenerate with pro model; simplify text in prompt
Color consistencyDoes the palette match across slides?Check reference strategy; verify hex codes in prompts
Layout accuracyDoes composition match specification?Add more explicit positioning in prompt
Visual weightDoes rhythm feel right?Adjust specifications for variety
Style driftDo later slides match earlier ones?Ensure slide 1 is always in reference chain

The Flip Test

View all slides as thumbnails:

  • Can you follow the argument from titles alone?
  • Is there visual variety within consistency?
  • Does the eye know where to go on each slide?

The Presentation Test

Click through as if presenting:

  • Does each transition feel earned?
  • Would you need to say "so anyway..." anywhere?
  • Do data visualizations communicate clearly?

Iteration

If slides need changes:

  1. Content issues → Go back to Story Agent, update story.md
  2. Visual concept issues → Go back to Design Agent, update specifications
  3. Rendering issues → Refine prompts and regenerate specific slides

To regenerate a single slide:

python scripts/generate_slides.py \
  --api-key "$GEMINI_API_KEY" \
  --prompt "[Updated prompt for single slide]" \
  --output ./slides/slide_03.png \
  --model nano-banana-2

File Structure

project/
├── input/                    # User's source material
│   └── brief.md              # Or transcript.txt, notes.docx, etc.
├── story.md                  # Pass 1 output: narrative & content
├── design_system.md          # Pass 2 output: visual language
├── visual_specs.md           # Pass 2 output: per-slide specifications
├── prompts.json              # Pass 2 output: generation-ready prompts
├── slides/                   # Generated images
│   ├── slide_01_title.png
│   ├── slide_02_problem.png
│   ├── ...
│   └── slides.pdf            # Assembled deck
└── scripts/
    └── generate_slides.py    # Image generation script

Quick Reference

Subagents

NameFilePurpose
story-agentagents/story_agent.mdAnalyzes source material, creates storyboard with slide content
design-agentagents/design_agent.mdCreates design system and visual specifications for each slide

API

Endpoint: POST https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent

Models:
- gemini-3.1-flash-image-preview (Nano Banana 2 — pro quality at flash speed, recommended)
- gemini-2.5-flash-image (fast, cheap drafts)
- gemini-3-pro-image-preview (highest fidelity, text-heavy)

Config:
generation_config = {"responseModalities": ["IMAGE"]}

Troubleshooting

IssueCauseFix
Text not renderingModel limitationUse pro model; reduce text density
Style driftReference chain brokenUse anchor strategy; always include slide 1
Colors inconsistentHex codes varyVerify exact hex codes in all prompts
Layout wrongVague specificationAdd explicit proportions and positions
Visuals too similarNo rhythm planningVary visual weight across slides
Prompts too longOver-specificationFocus on key elements; trust the model

Design System Checklist

Before generating, verify design_system.md includes:

  • Color palette with hex codes AND semantic meanings
  • Typography personality (not font names)
  • Spacing philosophy
  • Visual element style (shapes, lines, icons)
  • Signature elements for recognition

Prompt Quality Checklist

Before generating, verify each prompt in prompts.json includes:

  • Mood and aspect ratio
  • Design language summary (colors, typography, style)
  • Exact text content (verbatim from story.md)
  • Composition with proportions
  • Key details for critical elements
Skills Info
Original Name:slide-generatorAuthor:lfiaschi