Agent Skill
2/7/2026tinybird-analytics
Tinybird real-time analytics - datasources, pipes, endpoints, and ClickHouse best practices. Use when building analytics backends or real-time data pipelines.
A
allanninal
0GitHub Stars
1Views
npx skills add allanninal/claude-code-skills
SKILL.md
| Name | tinybird-analytics |
| Description | Tinybird real-time analytics - datasources, pipes, endpoints, and ClickHouse best practices. Use when building analytics backends or real-time data pipelines. |
name: tinybird-analytics description: Tinybird real-time analytics - datasources, pipes, endpoints, and ClickHouse best practices. Use when building analytics backends or real-time data pipelines.
Tinybird Real-Time Analytics
When to Use This Skill
- Building real-time analytics backends
- Creating API endpoints from SQL queries
- Processing streaming data
- Building dashboards and metrics
- Working with ClickHouse SQL
Project Structure
tinybird/
├── datasources/
│ ├── events.datasource # Raw event ingestion
│ └── users.datasource # User dimension table
├── pipes/
│ ├── events_per_day.pipe # Aggregation pipe
│ └── user_activity.pipe # API endpoint
├── tests/
│ └── pipes/
│ └── events_per_day.yaml # Pipe tests
├── .tinyb # Auth token (gitignored)
└── .gitignore
Datasources
Event Stream Datasource
# datasources/events.datasource
DESCRIPTION >
Raw events from application
SCHEMA >
`event_id` String,
`event_type` String,
`user_id` String,
`properties` String, # JSON string
`timestamp` DateTime,
`received_at` DateTime DEFAULT now()
ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "timestamp, event_type, user_id"
ENGINE_TTL "timestamp + INTERVAL 90 DAY"
Dimension Table
# datasources/users.datasource
DESCRIPTION >
User dimension table with latest state
SCHEMA >
`user_id` String,
`email` String,
`name` String,
`plan` String,
`created_at` DateTime,
`updated_at` DateTime
ENGINE "ReplacingMergeTree"
ENGINE_SORTING_KEY "user_id"
ENGINE_VER "updated_at"
Pipes (Transformations)
Materialized View
# pipes/daily_events.pipe
DESCRIPTION >
Pre-aggregated daily event counts
NODE daily_aggregation
SQL >
SELECT
toDate(timestamp) AS date,
event_type,
user_id,
count() AS event_count,
uniqExact(event_id) AS unique_events
FROM events
GROUP BY date, event_type, user_id
TYPE materialized
DATASOURCE daily_events_mv
ENGINE "SummingMergeTree"
ENGINE_SORTING_KEY "date, event_type, user_id"
API Endpoint
# pipes/user_activity.pipe
DESCRIPTION >
Get user activity for a time range
NODE endpoint
SQL >
%
SELECT
toDate(timestamp) AS date,
event_type,
count() AS events
FROM events
WHERE
user_id = {{ String(user_id, required=True) }}
AND timestamp >= {{ DateTime(start_date, '2024-01-01 00:00:00') }}
AND timestamp < {{ DateTime(end_date, '2024-12-31 23:59:59') }}
GROUP BY date, event_type
ORDER BY date DESC, events DESC
TYPE endpoint
Chained Transformations
# pipes/top_users.pipe
NODE filter_events
SQL >
SELECT user_id, event_type, timestamp
FROM events
WHERE timestamp >= now() - INTERVAL 7 DAY
NODE aggregate_by_user
SQL >
SELECT
user_id,
count() AS total_events,
uniqExact(event_type) AS unique_event_types
FROM filter_events
GROUP BY user_id
NODE enrich_with_user_data
SQL >
SELECT
a.user_id,
u.name,
u.email,
u.plan,
a.total_events,
a.unique_event_types
FROM aggregate_by_user a
LEFT JOIN users u ON a.user_id = u.user_id
NODE rank_users
SQL >
SELECT
*,
row_number() OVER (ORDER BY total_events DESC) AS rank
FROM enrich_with_user_data
LIMIT {{ Int32(limit, 100) }}
TYPE endpoint
Ingestion
Events API
# Single event
curl -X POST \
'https://api.tinybird.co/v0/events?name=events' \
-H "Authorization: Bearer $TB_TOKEN" \
-d '{"event_id":"123","event_type":"click","user_id":"u1","timestamp":"2024-01-15 10:30:00"}'
# Batch NDJSON
curl -X POST \
'https://api.tinybird.co/v0/events?name=events' \
-H "Authorization: Bearer $TB_TOKEN" \
--data-binary @events.ndjson
SDK Integration
// TypeScript SDK
import { Tinybird } from '@chronark/zod-bird';
const tb = new Tinybird({ token: process.env.TINYBIRD_TOKEN! });
// Ingest events
await tb.ingest('events', [
{
event_id: crypto.randomUUID(),
event_type: 'page_view',
user_id: 'user_123',
properties: JSON.stringify({ page: '/home' }),
timestamp: new Date().toISOString(),
},
]);
Querying Endpoints
HTTP API
# Query endpoint
curl -G \
'https://api.tinybird.co/v0/pipes/user_activity.json' \
-H "Authorization: Bearer $TB_TOKEN" \
-d 'user_id=user_123' \
-d 'start_date=2024-01-01' \
-d 'end_date=2024-01-31'
TypeScript Client
const response = await fetch(
`https://api.tinybird.co/v0/pipes/user_activity.json?user_id=${userId}`,
{
headers: {
Authorization: `Bearer ${process.env.TINYBIRD_TOKEN}`,
},
}
);
const { data } = await response.json();
ClickHouse SQL Best Practices
Optimize Queries
-- Use appropriate date functions
-- BAD: String comparison
WHERE toString(timestamp) LIKE '2024-01%'
-- GOOD: Native date functions
WHERE toYYYYMM(timestamp) = 202401
-- Use PREWHERE for filtering
SELECT * FROM events
PREWHERE event_type = 'purchase'
WHERE user_id = 'user_123'
-- Limit columns in SELECT
-- BAD
SELECT * FROM events
-- GOOD
SELECT event_id, event_type, timestamp FROM events
Aggregation Patterns
-- Approximate count for large datasets
SELECT uniqHLL12(user_id) AS approx_users
FROM events
-- Window functions
SELECT
user_id,
event_type,
timestamp,
row_number() OVER (PARTITION BY user_id ORDER BY timestamp) AS event_order
FROM events
-- Array aggregation
SELECT
user_id,
groupArray(event_type) AS all_events,
arrayDistinct(groupArray(event_type)) AS unique_events
FROM events
GROUP BY user_id
JSON Handling
-- Extract from JSON string
SELECT
JSONExtractString(properties, 'page') AS page,
JSONExtractInt(properties, 'duration') AS duration
FROM events
-- For frequent access, materialize columns
SCHEMA >
`properties` String,
`page` String MATERIALIZED JSONExtractString(properties, 'page')
CI/CD
GitHub Actions
# .github/workflows/deploy.yml
name: Deploy Tinybird
on:
push:
branches: [main]
paths: ['tinybird/**']
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install Tinybird CLI
run: pip install tinybird-cli
- name: Deploy to Tinybird
run: |
cd tinybird
tb auth --token ${{ secrets.TINYBIRD_TOKEN }}
tb push --force
Testing
# tests/pipes/events_per_day.yaml
- name: test_basic_aggregation
pipe: events_per_day
params:
start_date: '2024-01-01'
end_date: '2024-01-02'
expected:
- date: '2024-01-01'
events: 100
# Run tests
tb test run
Best Practices
- Use appropriate ENGINE (MergeTree, SummingMergeTree, etc.)
- Set TTL for data retention
- Use materialized views for common aggregations
- Optimize sorting keys for query patterns
- Use PREWHERE for early filtering
- Test with production-like data volumes
- Monitor query performance
- Use approximate functions for large datasets
Environment Variables
# .env (gitignored)
TINYBIRD_TOKEN=p.xxx
TINYBIRD_HOST=https://api.tinybird.co
# .gitignore
.tinyb
.env
.env.local
Skills Info
Original Name:tinybird-analyticsAuthor:allanninal
Download