name: daily-calibration-report
description: Automated daily health check of all model components for early warning of degradation and drift. Use when setting up daily reports, debugging performance issues, or reviewing model health after deployment. Covers PnL attribution, calibration metrics, signal decay, regime analysis, and alerting.
requires:
  - calibration-analysis
disable-model-invocation: true
context: fork
agent: analytics

Daily Calibration Report Skill

Purpose

Automated daily health check of all model components. This is your early warning system for:

Model degradation
Regime shifts
Parameter drift
Unexpected losses

When to Use

Run automatically every day at market close or 6am
Manually when debugging performance issues
After deploying model changes
When something "feels off"

Prerequisites

measurement-infrastructure collecting data
calibration-analysis for metrics computation
At least 24 hours of production data

Report Sections

1. Executive Summary

╔════════════════════════════════════════════════════════════════════╗
║             DAILY CALIBRATION REPORT - 2024-01-15                  ║
╠════════════════════════════════════════════════════════════════════╣
║  Overall Status: WARNING                                           ║
║                                                                    ║
║  PnL:           $347.82  (+12% vs 7d avg)                         ║
║  Sharpe (1d):   2.1                                               ║
║  Fill Count:    847 fills                                         ║
║  Uptime:        99.7%                                             ║
║                                                                    ║
║  Issues:                                                           ║
║  - Adverse selection model IR dropped to 0.92 (threshold: 1.0)    ║
║  - Cascade regime PnL: -$45 (investigate)                         ║
╚════════════════════════════════════════════════════════════════════╝

2. PnL Attribution

Breaks daily PnL into four components: spread capture, adverse selection, inventory cost, and fees. The breakdown is compared against the 7-day rolling average. An hourly PnL histogram supports time-of-day analysis.
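
The attribution arithmetic can be sketched as follows. This is a minimal illustration, not the DailyReportGenerator from references/implementation.md: the `PnlAttribution` container, the `pct_vs_average` helper, the sign convention (costs stored as negative numbers), and all figures are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PnlAttribution:
    """Daily PnL split into the four components (hypothetical container)."""
    spread_capture: float
    adverse_selection: float  # assumed sign convention: costs are negative
    inventory_cost: float     # assumed sign convention: costs are negative
    fees: float               # negative when paid, positive when rebated

    def total(self) -> float:
        return (self.spread_capture + self.adverse_selection
                + self.inventory_cost + self.fees)


def pct_vs_average(today: float, trailing: list[float]) -> float:
    """Percent change of today's value versus the mean of the trailing values."""
    avg = sum(trailing) / len(trailing)
    if avg == 0:
        raise ValueError("trailing average is zero; percent change is undefined")
    return 100.0 * (today - avg) / abs(avg)


# Illustrative numbers only.
today = PnlAttribution(spread_capture=520.0, adverse_selection=-110.0,
                       inventory_cost=-40.0, fees=-20.0)
last_7d_totals = [300.0, 320.0, 280.0, 310.0, 330.0, 290.0, 270.0]
change = pct_vs_average(today.total(), last_7d_totals)
print(f"total={today.total():.2f} ({change:+.1f}% vs 7d avg)")
```

The same `pct_vs_average` helper can be applied per component, which shows whether a change in total PnL comes from spread capture or from one of the cost terms.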

3. Model Calibration

Per-model metrics: Brier Score, Information Ratio (IR), and the 7-day IR trend. Models tracked: Fill Prediction (1s, 10s), Adverse Selection, Direction (1s), Regime Detection. Includes a calibration curve visualization (predicted vs realized).
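
As a rough sketch of two of these outputs, the Brier score is the mean squared difference between predicted probability and the 0/1 outcome, and a calibration curve compares mean predicted probability with realized frequency per probability bin. The function names and the equal-width binning are assumptions for illustration. The IR computation is left out because this document does not give its formula.

```python
def brier_score(predicted: list[float], realized: list[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not predicted or len(predicted) != len(realized):
        raise ValueError("inputs must be non-empty and the same length")
    return sum((p - y) ** 2 for p, y in zip(predicted, realized)) / len(predicted)


def calibration_curve(predicted: list[float], realized: list[int],
                      n_bins: int = 10) -> list[tuple[float, float, int]]:
    """Return (mean predicted, realized frequency, count) per non-empty bin.

    Bins are equal-width over [0, 1]; a prediction of exactly 1.0 goes
    into the last bin.
    """
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(predicted, realized):
        idx = min(int(p * n_bins), n_bins - 1)
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in sums if n > 0]


# Illustrative data: a sharp, well-calibrated toy model.
preds = [0.1, 0.1, 0.9, 0.9]
outcomes = [0, 0, 1, 1]
print(brier_score(preds, outcomes))
print(calibration_curve(preds, outcomes, n_bins=2))
```

For reference, a forecast that always outputs 0.5 scores exactly 0.25, so a Brier score near 0.25 on a roughly balanced outcome means the model is doing no better than a coin flip.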

4. Signal Health

Compares each signal's mutual information (MI) with its value 7 days ago. Flags signals with >15% MI decay for investigation. Tracks lead-lag R-squared for cross-exchange signals.
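
The decay flag can be sketched as below, assuming decay is measured as the relative drop from the 7-day-old MI value. The signal names, MI values, and function names are hypothetical. How MI itself is estimated is outside this sketch.

```python
def mi_decay_pct(mi_now: float, mi_7d_ago: float) -> float:
    """Relative drop in mutual information versus 7 days ago, in percent."""
    if mi_7d_ago <= 0:
        raise ValueError("baseline MI must be positive")
    return 100.0 * (mi_7d_ago - mi_now) / mi_7d_ago


def flag_decayed_signals(mi_now: dict[str, float],
                         mi_7d_ago: dict[str, float],
                         threshold_pct: float = 15.0) -> list[str]:
    """Names of signals whose MI fell by more than threshold_pct."""
    flagged = []
    for name, baseline in mi_7d_ago.items():
        if name in mi_now and mi_decay_pct(mi_now[name], baseline) > threshold_pct:
            flagged.append(name)
    return sorted(flagged)


# Illustrative values only.
baseline_mi = {"book_imbalance": 0.040, "trade_flow": 0.030, "cross_venue_lead": 0.020}
current_mi = {"book_imbalance": 0.038, "trade_flow": 0.024, "cross_venue_lead": 0.020}
print(flag_decayed_signals(current_mi, baseline_mi))
```

Signals missing from today's snapshot are skipped here rather than flagged. A production job would probably want to report those separately.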

5. Regime Distribution

Time fraction, PnL, PnL/hour, fill count, and adverse selection (AS) rate per regime (Quiet/Trending/Volatile/Cascade). Cascade losses are highlighted for investigation.
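
One way to build this table is to aggregate labelled time intervals per regime. The sketch below assumes each interval already carries a regime label, and it assumes AS rate means the fraction of fills marked adverse. Both are illustrative assumptions, as are the `RegimeInterval` type and the numbers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RegimeInterval:
    """A contiguous stretch of time spent in one regime (hypothetical type)."""
    regime: str
    seconds: float
    pnl: float
    fills: int
    adverse_fills: int  # assumed: fills later judged adversely selected


def summarize_regimes(intervals: list[RegimeInterval]) -> dict[str, dict[str, float]]:
    """Aggregate intervals into the per-regime columns of the report."""
    totals = defaultdict(lambda: {"seconds": 0.0, "pnl": 0.0, "fills": 0, "adverse": 0})
    for iv in intervals:
        t = totals[iv.regime]
        t["seconds"] += iv.seconds
        t["pnl"] += iv.pnl
        t["fills"] += iv.fills
        t["adverse"] += iv.adverse_fills

    observed_seconds = sum(t["seconds"] for t in totals.values())
    summary = {}
    for regime, t in totals.items():
        hours = t["seconds"] / 3600.0
        summary[regime] = {
            "time_fraction": t["seconds"] / observed_seconds if observed_seconds else 0.0,
            "pnl": t["pnl"],
            "pnl_per_hour": t["pnl"] / hours if hours else 0.0,
            "fills": t["fills"],
            "as_rate": t["adverse"] / t["fills"] if t["fills"] else 0.0,
        }
    return summary


# Illustrative data only.
day = [
    RegimeInterval("Quiet", 7200, 100.0, 200, 40),
    RegimeInterval("Cascade", 1800, -45.0, 50, 30),
    RegimeInterval("Quiet", 9000, 80.0, 150, 30),
]
for regime, row in summarize_regimes(day).items():
    print(regime, row)
```

Reporting PnL per hour next to raw PnL matters because a regime that occupies 10% of the day can still dominate losses.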

6. Conditional Calibration Issues

Slices calibration by volatility regime, funding state, and time of day. Flags conditions where model IR drops below 1.0; in those conditions the model adds noise rather than information.
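
The slicing step can be illustrated with a simpler stand-in metric. Because this document does not define how IR is computed, the sketch below uses the calibration gap (mean predicted probability minus realized frequency) per slice instead of IR. The record layout, the `vol_regime` key, and the data are assumptions for the example.

```python
from collections import defaultdict


def calibration_gap_by_slice(records: list[dict], slice_key: str) -> dict[str, float]:
    """Mean predicted probability minus realized frequency, per slice value.

    Each record is a dict with 'predicted' (probability), 'realized' (0 or 1),
    and the slicing field named by slice_key.
    """
    buckets = defaultdict(list)
    for r in records:
        buckets[r[slice_key]].append(r)
    gaps = {}
    for label, rows in buckets.items():
        mean_pred = sum(r["predicted"] for r in rows) / len(rows)
        hit_rate = sum(r["realized"] for r in rows) / len(rows)
        gaps[label] = mean_pred - hit_rate
    return gaps


# Illustrative data: the model predicts 0.5 everywhere.
records = (
    [{"predicted": 0.5, "realized": y, "vol_regime": "low"} for y in (1, 1, 1, 0)]
    + [{"predicted": 0.5, "realized": y, "vol_regime": "high"} for y in (0, 0, 0, 1)]
)
print(calibration_gap_by_slice(records, "vol_regime"))
```

In this toy data the aggregate gap is zero (four hits in eight predictions of 0.5), while the low-volatility slice under-predicts by 0.25 and the high-volatility slice over-predicts by 0.25. That is the kind of issue an aggregate metric hides.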

7. Actionable Items

Prioritized list generated from all sections:

HIGH: Model IR < 1.0, daily loss exceeding limit, AS > 50% of spread capture
MEDIUM: Cascade losses, signal decay > 15%, conditional calibration issues
LOW: Elevated inventory cost, minor parameter drift
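
A minimal sketch of how the HIGH and MEDIUM rules above could be turned into a sorted list follows. It is not the action item logic from references/implementation.md. The `DailyMetrics` fields, the `daily_loss_limit` parameter, and the sample values are hypothetical, and the LOW rules are omitted because this document gives no numeric cutoffs for them.

```python
from dataclasses import dataclass

PRIORITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass
class DailyMetrics:
    """Inputs the rules need (hypothetical container)."""
    model_ir: dict[str, float]
    daily_pnl: float
    spread_capture: float
    adverse_selection_cost: float  # assumed: stored as a positive cost
    cascade_pnl: float
    signal_mi_decay_pct: dict[str, float]


def action_items(m: DailyMetrics, daily_loss_limit: float) -> list[tuple[str, str]]:
    """Return (priority, message) pairs, highest priority first."""
    items = []
    for model, ir in m.model_ir.items():
        if ir < 1.0:
            items.append(("HIGH", f"{model}: IR {ir:.2f} is below 1.0"))
    if m.daily_pnl < -daily_loss_limit:
        items.append(("HIGH", f"Daily loss {-m.daily_pnl:.2f} exceeds limit"))
    if m.spread_capture > 0 and m.adverse_selection_cost > 0.5 * m.spread_capture:
        items.append(("HIGH", "Adverse selection exceeds 50% of spread capture"))
    if m.cascade_pnl < 0:
        items.append(("MEDIUM", f"Cascade regime PnL {m.cascade_pnl:.2f}"))
    for signal, decay in m.signal_mi_decay_pct.items():
        if decay > 15.0:
            items.append(("MEDIUM", f"{signal}: MI decayed {decay:.0f}%"))
    # sorted() is stable, so items keep insertion order within a priority.
    return sorted(items, key=lambda item: PRIORITY_ORDER[item[0]])


# Illustrative values only.
snapshot = DailyMetrics(
    model_ir={"fill_prediction_1s": 1.3, "adverse_selection": 0.92},
    daily_pnl=347.82,
    spread_capture=520.0,
    adverse_selection_cost=110.0,
    cascade_pnl=-45.0,
    signal_mi_decay_pct={"trade_flow": 8.0},
)
for priority, message in action_items(snapshot, daily_loss_limit=200.0):
    print(priority, message)
```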

See references/implementation.md for the DailyReportGenerator struct, generate() pipeline, action item generation, and status determination logic.

Alert Thresholds

Metric	Warning	Critical	Action
Model IR	< 1.0	< 0.8	Retrain or disable
Brier Score	> 0.20	> 0.25	Investigate calibration
Daily Loss	> $200	> $500	Reduce size, investigate
Cascade Loss	> $50	> $100	Tighten liquidation detection
Signal MI Decay	> 15%/week	> 30%/week	Remove signal
Lead-Lag R-squared	< 0.20	< 0.15	Reduce reliance
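
The table above maps onto a small lookup, sketched here under two assumptions: the metric keys are hypothetical names, and the overall report status is taken to be the worst severity across readings. The actual status logic lives in references/implementation.md and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float
    breach_when: str  # "below" or "above": which side of the limit is bad
    action: str


# Values copied from the alert table; the key names are hypothetical.
THRESHOLDS = {
    "model_ir": Threshold(1.0, 0.8, "below", "Retrain or disable"),
    "brier_score": Threshold(0.20, 0.25, "above", "Investigate calibration"),
    "daily_loss_usd": Threshold(200.0, 500.0, "above", "Reduce size, investigate"),
    "cascade_loss_usd": Threshold(50.0, 100.0, "above", "Tighten liquidation detection"),
    "signal_mi_decay_pct_per_week": Threshold(15.0, 30.0, "above", "Remove signal"),
    "lead_lag_r2": Threshold(0.20, 0.15, "below", "Reduce reliance"),
}

SEVERITY = ["OK", "WARNING", "CRITICAL"]


def classify(metric: str, value: float) -> str:
    """Return OK, WARNING, or CRITICAL for a single reading."""
    t = THRESHOLDS[metric]
    if t.breach_when == "below":
        if value < t.critical:
            return "CRITICAL"
        if value < t.warning:
            return "WARNING"
    else:
        if value > t.critical:
            return "CRITICAL"
        if value > t.warning:
            return "WARNING"
    return "OK"


def overall_status(readings: dict[str, float]) -> str:
    """Assumed rule: the report status is the worst individual severity."""
    levels = (classify(name, value) for name, value in readings.items())
    return max(levels, key=SEVERITY.index, default="OK")


print(overall_status({"model_ir": 0.92, "cascade_loss_usd": 45.0}))
```

With the numbers from the sample Executive Summary (IR 0.92, cascade loss $45), this yields WARNING, which matches the status shown there.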

See references/alerting.md for the AlertManager implementation and alert routing (Slack/Email/PagerDuty).

Dashboard Integration

Export metrics to Grafana/Prometheus for real-time monitoring. Key panels: PnL Waterfall, Model IR Trend, Signal MI Heatmap, Regime Timeline, Calibration Curve.
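
For orientation, Prometheus scrapes metrics in a plain-text exposition format, which the sketch below renders by hand. This is not the export_metrics() code from references/dashboard.md. The `render_prometheus_text` helper and the metric and label names are hypothetical, and the sketch does not escape special characters in label values, which a real exporter must do.

```python
def render_prometheus_text(gauges: dict[str, list[tuple[dict[str, str], float]]]) -> str:
    """Render gauges as Prometheus text exposition lines.

    gauges maps a metric name to a list of (labels, value) samples.
    Label values are assumed to contain no quotes, backslashes, or newlines.
    """
    lines = []
    for name in sorted(gauges):
        lines.append(f"# TYPE {name} gauge")
        for labels, value in gauges[name]:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            suffix = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"


# Illustrative metric names and values.
payload = render_prometheus_text({
    "calib_model_ir": [({"model": "adverse_selection"}, 0.92)],
    "calib_daily_pnl_usd": [({}, 347.82)],
})
print(payload, end="")
```

In practice an official Prometheus client library would normally handle formatting and serving. The hand-rolled version is shown only to make the wire format visible.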

See references/dashboard.md for export_metrics() code and panel definitions.

Operational Checklist

Report job scheduled (cron, systemd timer, etc.)
Slack webhook configured
Email alerts configured
PagerDuty integration for critical alerts
Dashboard panels created
Historical reports archived
Weekly review meeting scheduled

Dependencies

Requires: measurement-infrastructure, calibration-analysis, all model components
Enables: Automated monitoring, early warning, performance tracking

Common Mistakes

Not running daily: Problems compound if undetected
Ignoring warnings: "It's probably fine" leads to losses
Too many alerts: Alert fatigue defeats the purpose
No historical comparison: Need 7d context for anomaly detection
Missing conditional analysis: Aggregate metrics hide regime-specific issues

Supporting Files

references/implementation.md — DailyReportGenerator, generate() pipeline, action items, status logic
references/alerting.md — AlertManager, Slack/Email/PagerDuty routing
references/dashboard.md — Grafana export, metric panels
