Agent Skill
2/7/2026

tinybird-analytics

Tinybird real-time analytics - datasources, pipes, endpoints, and ClickHouse best practices. Use when building analytics backends or real-time data pipelines.

A
allanninal
0GitHub Stars
1Views
npx skills add allanninal/claude-code-skills

SKILL.md

Nametinybird-analytics
DescriptionTinybird real-time analytics - datasources, pipes, endpoints, and ClickHouse best practices. Use when building analytics backends or real-time data pipelines.

name: tinybird-analytics description: Tinybird real-time analytics - datasources, pipes, endpoints, and ClickHouse best practices. Use when building analytics backends or real-time data pipelines.

Tinybird Real-Time Analytics

When to Use This Skill

  • Building real-time analytics backends
  • Creating API endpoints from SQL queries
  • Processing streaming data
  • Building dashboards and metrics
  • Working with ClickHouse SQL

Project Structure

tinybird/
├── datasources/
│   ├── events.datasource       # Raw event ingestion
│   └── users.datasource        # User dimension table
├── pipes/
│   ├── events_per_day.pipe     # Aggregation pipe
│   └── user_activity.pipe      # API endpoint
├── tests/
│   └── pipes/
│       └── events_per_day.yaml # Pipe tests
├── .tinyb                      # Auth token (gitignored)
└── .gitignore

Datasources

Event Stream Datasource

# datasources/events.datasource

DESCRIPTION >
    Raw events from application

SCHEMA >
    `event_id` String,
    `event_type` String,
    `user_id` String,
    `properties` String,  # JSON string
    `timestamp` DateTime,
    `received_at` DateTime DEFAULT now()

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "timestamp, event_type, user_id"
ENGINE_TTL "timestamp + INTERVAL 90 DAY"

Dimension Table

# datasources/users.datasource

DESCRIPTION >
    User dimension table with latest state

SCHEMA >
    `user_id` String,
    `email` String,
    `name` String,
    `plan` String,
    `created_at` DateTime,
    `updated_at` DateTime

ENGINE "ReplacingMergeTree"
ENGINE_SORTING_KEY "user_id"
ENGINE_VER "updated_at"

Pipes (Transformations)

Materialized View

# pipes/daily_events.pipe

DESCRIPTION >
    Pre-aggregated daily event counts

NODE daily_aggregation
SQL >
    SELECT
        toDate(timestamp) AS date,
        event_type,
        user_id,
        count() AS event_count,
        uniqExact(event_id) AS unique_events
    FROM events
    GROUP BY date, event_type, user_id

TYPE materialized
DATASOURCE daily_events_mv
ENGINE "SummingMergeTree"
ENGINE_SORTING_KEY "date, event_type, user_id"

API Endpoint

# pipes/user_activity.pipe

DESCRIPTION >
    Get user activity for a time range

NODE endpoint
SQL >
    %
    SELECT
        toDate(timestamp) AS date,
        event_type,
        count() AS events
    FROM events
    WHERE
        user_id = {{ String(user_id, required=True) }}
        AND timestamp >= {{ DateTime(start_date, '2024-01-01 00:00:00') }}
        AND timestamp < {{ DateTime(end_date, '2024-12-31 23:59:59') }}
    GROUP BY date, event_type
    ORDER BY date DESC, events DESC

TYPE endpoint

Chained Transformations

# pipes/top_users.pipe

NODE filter_events
SQL >
    SELECT user_id, event_type, timestamp
    FROM events
    WHERE timestamp >= now() - INTERVAL 7 DAY

NODE aggregate_by_user
SQL >
    SELECT
        user_id,
        count() AS total_events,
        uniqExact(event_type) AS unique_event_types
    FROM filter_events
    GROUP BY user_id

NODE enrich_with_user_data
SQL >
    SELECT
        a.user_id,
        u.name,
        u.email,
        u.plan,
        a.total_events,
        a.unique_event_types
    FROM aggregate_by_user a
    LEFT JOIN users u ON a.user_id = u.user_id

NODE rank_users
SQL >
    SELECT
        *,
        row_number() OVER (ORDER BY total_events DESC) AS rank
    FROM enrich_with_user_data
    LIMIT {{ Int32(limit, 100) }}

TYPE endpoint

Ingestion

Events API

# Single event
curl -X POST \
  'https://api.tinybird.co/v0/events?name=events' \
  -H "Authorization: Bearer $TB_TOKEN" \
  -d '{"event_id":"123","event_type":"click","user_id":"u1","timestamp":"2024-01-15 10:30:00"}'

# Batch NDJSON
curl -X POST \
  'https://api.tinybird.co/v0/events?name=events' \
  -H "Authorization: Bearer $TB_TOKEN" \
  --data-binary @events.ndjson

SDK Integration

// TypeScript SDK
import { Tinybird } from '@chronark/zod-bird';

const tb = new Tinybird({ token: process.env.TINYBIRD_TOKEN! });

// Ingest events
await tb.ingest('events', [
  {
    event_id: crypto.randomUUID(),
    event_type: 'page_view',
    user_id: 'user_123',
    properties: JSON.stringify({ page: '/home' }),
    timestamp: new Date().toISOString(),
  },
]);

Querying Endpoints

HTTP API

# Query endpoint
curl -G \
  'https://api.tinybird.co/v0/pipes/user_activity.json' \
  -H "Authorization: Bearer $TB_TOKEN" \
  -d 'user_id=user_123' \
  -d 'start_date=2024-01-01' \
  -d 'end_date=2024-01-31'

TypeScript Client

const response = await fetch(
  `https://api.tinybird.co/v0/pipes/user_activity.json?user_id=${userId}`,
  {
    headers: {
      Authorization: `Bearer ${process.env.TINYBIRD_TOKEN}`,
    },
  }
);

const { data } = await response.json();

ClickHouse SQL Best Practices

Optimize Queries

-- Use appropriate date functions
-- BAD: String comparison
WHERE toString(timestamp) LIKE '2024-01%'

-- GOOD: Native date functions
WHERE toYYYYMM(timestamp) = 202401

-- Use PREWHERE for filtering
SELECT * FROM events
PREWHERE event_type = 'purchase'
WHERE user_id = 'user_123'

-- Limit columns in SELECT
-- BAD
SELECT * FROM events

-- GOOD
SELECT event_id, event_type, timestamp FROM events

Aggregation Patterns

-- Approximate count for large datasets
SELECT uniqHLL12(user_id) AS approx_users
FROM events

-- Window functions
SELECT
    user_id,
    event_type,
    timestamp,
    row_number() OVER (PARTITION BY user_id ORDER BY timestamp) AS event_order
FROM events

-- Array aggregation
SELECT
    user_id,
    groupArray(event_type) AS all_events,
    arrayDistinct(groupArray(event_type)) AS unique_events
FROM events
GROUP BY user_id

JSON Handling

-- Extract from JSON string
SELECT
    JSONExtractString(properties, 'page') AS page,
    JSONExtractInt(properties, 'duration') AS duration
FROM events

-- For frequent access, materialize columns
SCHEMA >
    `properties` String,
    `page` String MATERIALIZED JSONExtractString(properties, 'page')

CI/CD

GitHub Actions

# .github/workflows/deploy.yml
name: Deploy Tinybird
on:
  push:
    branches: [main]
    paths: ['tinybird/**']

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Tinybird CLI
        run: pip install tinybird-cli

      - name: Deploy to Tinybird
        run: |
          cd tinybird
          tb auth --token ${{ secrets.TINYBIRD_TOKEN }}
          tb push --force

Testing

# tests/pipes/events_per_day.yaml
- name: test_basic_aggregation
  pipe: events_per_day
  params:
    start_date: '2024-01-01'
    end_date: '2024-01-02'
  expected:
    - date: '2024-01-01'
      events: 100
# Run tests
tb test run

Best Practices

  • Use appropriate ENGINE (MergeTree, SummingMergeTree, etc.)
  • Set TTL for data retention
  • Use materialized views for common aggregations
  • Optimize sorting keys for query patterns
  • Use PREWHERE for early filtering
  • Test with production-like data volumes
  • Monitor query performance
  • Use approximate functions for large datasets

Environment Variables

# .env (gitignored)
TINYBIRD_TOKEN=p.xxx
TINYBIRD_HOST=https://api.tinybird.co

# .gitignore
.tinyb
.env
.env.local
Skills Info
Original Name:tinybird-analyticsAuthor:allanninal