testing
Comprehensive testing skill covering unit, integration, and E2E testing with pytest, Jest, Cypress, and Playwright. Use for writing tests, improving coverage, debugging test failures, and setting up testing infrastructure.
Supercharge Microsoft Fabric
Casino & Gaming Industry POC + Federal Expansions
Transform your casino operations with enterprise-grade analytics powered by Microsoft Fabric
Real-time insights • Medallion Architecture • Regulatory Compliance • Direct Lake BI
Documentation • Quick Start • Docker • Tutorials • Architecture • POC Agenda
</div>
Navigation
| Section | Description |
|---|---|
| Overview | What this POC delivers |
| Target Audience | Who should use this |
| Quick Start | Get up and running |
| 5-Minute Quick Start | Fastest path to first results |
| Cheat Sheet | Quick reference & commands |
| Docker Support | Container-based deployment |
| Dev Container | One-click development setup |
| Power BI Reports | Pre-built report templates |
| Cost Estimation | Azure cost planning |
| Sample Data | Pre-generated datasets |
| Architecture | Technical deep-dive |
| Data Domains | Gaming-specific domains |
| Repository Structure | What's included |
| POC Agenda | Workshop schedule |
| Tutorials | Learning path |
| Documentation Site | Full docs with search |
| Compliance | Regulatory coverage |
| Phase 7 Expansions | Federal, streaming, analytics expansions |
| Phase 9-10 New Fabric Experience | 40+ new feature docs, best practices, Bicep modules |
Overview
This repository provides a complete, production-ready proof-of-concept environment for Microsoft Fabric, purpose-built for the casino and gaming industry.
<table> <tr> <td width="50%">Key Features
| Feature | Description |
|---|---|
| Medallion Architecture | Bronze/Silver/Gold Lakehouse |
| Real-Time Intelligence | Casino floor monitoring |
| Direct Lake | Sub-second Power BI analytics |
| Microsoft Purview | Data governance & compliance |
| Infrastructure as Code | Bicep/ARM deployment |
| Step-by-Step Tutorials | Hands-on learning path |
Value Proposition
- REAL-TIME SLOT TELEMETRY
- TABLE GAME ANALYTICS
- PLAYER 360 INSIGHTS
- FINANCIAL COMPLIANCE
- SECURITY & SURVEILLANCE
- REGULATORY REPORTING
</td>
</tr>
</table>
Target Audience
<table> <tr> <td align="center" width="16%"> <b>Data Architects</b><br/> <sub>Evaluating Fabric</sub> </td> <td align="center" width="16%"> <b>Data Engineers</b><br/> <sub>Medallion patterns</sub> </td> <td align="center" width="16%"> <b>BI Developers</b><br/> <sub>Direct Lake solutions</sub> </td> <td align="center" width="16%"> <b>Solution Architects</b><br/> <sub>Enterprise platforms</sub> </td> <td align="center" width="16%"> <b>Gaming Industry</b><br/> <sub>Regulated operations</sub> </td> <td align="center" width="16%"> <b>Hospitality</b><br/> <sub>Guest analytics</sub> </td> </tr> </table>
Quick Start
Choose your preferred deployment method:
| Method | Best For | Time to Start |
|---|---|---|
| Docker Quick Start | Quick demos, testing data generators | ~5 minutes |
| Dev Container | Full development environment | ~10 minutes |
| Azure Deployment | Production-like POC environment | ~30 minutes |
Two Ways to Run This POC
Path A (Production-Aligned): Deploy Azure infrastructure via Bicep (infra/main.bicep), upload data to ADLS Gen2, and connect it to Fabric via OneLake shortcuts. This unlocks the governance (Purview), security (Private Endpoints), and monitoring tutorials. Cost: ~$1-3/day idle.
Path B (Quickstart): Skip Bicep entirely. Upload generated data straight into your Fabric Lakehouse via the UI and start running notebooks immediately. This is the fastest path to learning the medallion architecture, and you can upgrade to Path A at any time.
See Tutorial 00, Step 4 for details.
Docker Quick Start
The fastest way to generate sample data and explore the POC.
# Clone the repository
git clone https://github.com/fgarofalo56/Suppercharge_Microsoft_Fabric.git
cd Suppercharge_Microsoft_Fabric
# Generate demo data (1000 events, 7 days)
docker-compose run --rm demo-generator
# Generate full dataset (30 days, all domains)
docker-compose run --rm data-generator
# Output will be in ./output directory
Pro Tip: Use the demo generator for quick testing (it finishes in about 2 minutes), or the full data generator for realistic POC scenarios with 30 days of data.
See Docker Support for more options.
Dev Container Quick Start
One-click development environment with all tools pre-configured.
VS Code:
- Install the Dev Containers extension
- Open this repository in VS Code
- Click "Reopen in Container" when prompted (or use Ctrl+Shift+P > "Dev Containers: Reopen in Container")
GitHub Codespaces:
- Click the green "Code" button on GitHub
- Select "Codespaces" tab
- Click "Create codespace on main"
Pro Tip: GitHub Codespaces provides a cloud-based development environment with no local installation required, which suits team collaboration and workshops.
See Dev Container for configuration details.
Azure Deployment
Prerequisites: Complete the full Prerequisites Guide before starting deployment. This includes Azure subscription setup, tool installation, and resource provider registration.
Prerequisites Checklist
- Azure subscription with Owner or Contributor access
- Microsoft Fabric capacity (F64 recommended for POC)
- Azure CLI 2.50+ with Bicep extension
- PowerShell 7+ or Bash
- Git installed
- Docker (optional, for data generation)
Step-by-Step Deployment
<table> <tr> <td width="80">1
</td> <td>Clone the Repository
git clone https://github.com/fgarofalo56/Suppercharge_Microsoft_Fabric.git
cd Suppercharge_Microsoft_Fabric
</td>
</tr>
<tr>
<td>
2
</td> <td>Configure Environment
cp .env.sample .env
# Edit .env with your Azure subscription and tenant details
</td> </tr> <tr> <td>Warning: Ensure all required environment variables are populated. Missing values will cause deployment failures.
3
</td> <td>Login to Azure
az login
az account set --subscription "<your-subscription-id>"
</td>
</tr>
<tr>
<td>
4
</td> <td>Deploy Infrastructure
az deployment sub create \
--location eastus2 \
--template-file infra/main.bicep \
--parameters infra/environments/dev/dev.bicepparam
</td> </tr> <tr> <td>Pro Tip: Run <code>az deployment sub what-if</code> first to preview all resource changes before the actual deployment.
5
</td> <td>Start Learning
Begin with Tutorial 00: Environment Setup
</td> </tr> </table>
Architecture
High-Level Data Flow
flowchart TB
subgraph Sources["Data Sources"]
RT[/"Real-Time<br/>Casino Floor"/]
BATCH[/"Batch<br/>Systems"/]
EXT[/"External<br/>APIs"/]
end
subgraph Ingestion["Ingestion Layer"]
ES[Eventstreams]
DF[Dataflows Gen2]
MR[Database Mirroring]
end
subgraph Medallion["Medallion Architecture"]
subgraph Bronze["BRONZE"]
BL[(Bronze Lakehouse<br/>Raw Data)]
end
subgraph Silver["SILVER"]
SL[(Silver Lakehouse<br/>Cleansed)]
end
subgraph Gold["GOLD"]
GL[(Gold Lakehouse<br/>Business Ready)]
end
end
subgraph Analytics["Analytics & Governance"]
DL[Direct Lake<br/>Semantic Model]
EH[Eventhouse<br/>KQL Analytics]
PV[Microsoft Purview<br/>Governance]
end
subgraph Consumption["Consumption"]
PBI[Power BI<br/>Reports]
RTD[Real-Time<br/>Dashboards]
API[REST APIs]
end
RT --> ES
BATCH --> DF
EXT --> MR
ES --> BL
DF --> BL
MR --> BL
BL --> SL
SL --> GL
GL --> DL
GL --> EH
GL -.-> PV
SL -.-> PV
BL -.-> PV
DL --> PBI
EH --> RTD
GL --> API
style Bronze fill:#cd7f32,color:#fff
style Silver fill:#c0c0c0,color:#000
style Gold fill:#ffd700,color:#000
Architecture Highlights
<details> <summary><b>Bronze Layer - Raw Ingestion</b></summary>
- Purpose: Land raw data with minimal transformation
- Pattern: Schema-on-read, append-only
- Format: Delta Lake tables
- Retention: Configurable (default 90 days)
- Key Feature: Full historical lineage preserved
</details>
<details> <summary><b>Silver Layer</b></summary>
- Purpose: Business rules and data quality
- Pattern: Slowly Changing Dimensions (SCD Type 2)
- Transformations: Deduplication, validation, standardization
- Data Quality: Great Expectations integration
- Key Feature: Audit-ready data lineage
</details>
<details> <summary><b>Gold Layer</b></summary>
- Purpose: Aggregations, KPIs, and business metrics
- Pattern: Star/Snowflake schema
- Optimization: Partitioned by date, optimized for queries
- Refresh: Incremental or scheduled
- Key Feature: Direct Lake semantic model integration
</details>
<details> <summary><b>Real-Time Intelligence</b></summary>
- Eventstreams: Apache Kafka-compatible streaming
- Eventhouse: KQL-based analytics database
- Latency: Sub-second to seconds
- Use Cases: Slot monitoring, player alerts, anomaly detection
</details>
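The layer responsibilities above can be illustrated with a minimal in-memory sketch. It uses plain Python lists rather than Delta Lake tables, and the field names (`event_id`, `machine_id`, `coin_in`), the validity rule, and the functions `to_silver` and `to_gold` are hypothetical; none of them are taken from the repository's notebooks.

```python
from collections import defaultdict

# Bronze: raw, append-only events exactly as they landed
# (duplicates and invalid rows are kept on purpose).
bronze = [
    {"event_id": "e1", "machine_id": "SLOT-001", "coin_in": 25.0},
    {"event_id": "e1", "machine_id": "SLOT-001", "coin_in": 25.0},  # duplicate delivery
    {"event_id": "e2", "machine_id": "SLOT-001", "coin_in": 10.0},
    {"event_id": "e3", "machine_id": "SLOT-002", "coin_in": -5.0},  # fails validation
    {"event_id": "e4", "machine_id": "SLOT-002", "coin_in": 40.0},
]


def to_silver(rows):
    """Deduplicate on event_id and drop rows that fail a simple validity rule."""
    seen = set()
    clean = []
    for row in rows:
        if row["event_id"] in seen or row["coin_in"] < 0:
            continue
        seen.add(row["event_id"])
        clean.append(row)
    return clean


def to_gold(rows):
    """Aggregate cleansed events into a per-machine KPI (total coin-in)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["machine_id"]] += row["coin_in"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Bronze is never modified here; Silver and Gold are derived from it, which is what lets a layer be rebuilt if a rule changes.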
Phase 7: Industry Expansions
[!NOTE] Phase 7 Complete: 71 features delivered across 5 waves with 197/197 tests passing and zero regressions.
Phase 7 expanded the Casino/Gaming POC to cover federal agencies, migration paths, streaming connectors, analytics pipelines, tribal healthcare, and DOT/FAA transportation.
flowchart TD
subgraph Core["Casino/Gaming POC (Phases 1-6)"]
C[Reference Implementation<br/>92/100 Audit Score]
end
subgraph W1["Wave 1: Federal Agencies"]
USDA[USDA<br/>Crop & Food Safety]
SBA[SBA<br/>PPP & 7a Loans]
NOAA[NOAA<br/>Weather & Storms]
EPA[EPA<br/>Air & Water Quality]
DOI[DOI<br/>Earthquakes & Land]
DOJ[DOJ<br/>Crime & Antitrust]
end
subgraph W2["Wave 2: Migration & Streaming"]
MIG[Migration Tutorials<br/>Snowflake · DB2 · Teradata]
STR[8 Streaming Notebooks<br/>CDC · IoT · Kafka]
end
subgraph W3["Wave 3: Analytics"]
VID[Video Security<br/>YOLO · DeepSORT]
MOV[People Movement<br/>30 Zones · Queue Detection]
GEO[Geolocation<br/>H3 · Geofencing]
end
subgraph W4["Wave 4: Expansions"]
TH[Tribal Healthcare<br/>HIPAA · IHS · FHIR]
DOT[DOT/FAA<br/>FedRAMP · Aviation]
end
C --> W1
C --> W2
C --> W3
W1 --> W4
W2 --> W4
W3 --> W4
style Core fill:#ffd700,color:#000
style W1 fill:#4a90d9,color:#fff
style W2 fill:#50c878,color:#000
style W3 fill:#ff6b6b,color:#fff
style W4 fill:#9b59b6,color:#fff
| Wave | Scope | Features | Tests | Status |
|---|---|---|---|---|
| Wave 1 | Federal Agencies (USDA, SBA, NOAA, EPA, DOI) | 26 | 54 | Complete |
| Wave 2 | Migration & Streaming | 19 | 20 | Complete |
| Wave 3 | Video, Movement, Geolocation Analytics | 12 | 30 | Complete |
| Wave 4 | Tribal Healthcare + DOT/FAA | 15 | - | Complete |
| Wave 5 | Final Regression | 1 | 197 | Complete |
| Total | | 71 | 197 | All Complete |
Phase 9-10: New Fabric Experience
[!NOTE] Phases 9-10 Complete: Full coverage of the new Microsoft Fabric experience (July 2025 to April 2026 GA wave) with 40+ new documents, 8 Bicep modules, and 269/269 tests passing.
Phases 9 and 10 modernize the POC for the new Fabric experience, covering every major feature and enterprise best practice.
New Feature Documentation (22 features)
| Category | Features |
|---|---|
| AI & Intelligence | Fabric IQ, AI Copilot, Data Agents, AutoML & Model Endpoints, Fabric MCP |
| Data Integration | Mirroring (Oracle/SAP/BigQuery/MySQL), Copy Job CDC, dbt Integration |
| Analytics | Direct Lake, Real-Time Intelligence, Semantic Link, Eventhouse Vector DB |
| Platform | Fabric SQL Database, API for GraphQL, Translytical Task Flows, Digital Twin Builder |
| Governance | OneLake Security, OneLake Catalog, Workspace Monitoring, Data Mesh, Iceberg Interop |
| Performance | Materialized Lake Views, Lakehouse Schemas, Shortcut Transformations |
Enterprise Best Practices (16 guides)
| Category | Guides |
|---|---|
| Operations | Capacity Planning & Cost Optimization, Monitoring & Observability, Testing Strategies |
| Security | Network Security (PE/VNet/IP Firewall), Identity & RBAC, Customer-Managed Keys, Outbound Access Protection |
| Architecture | Medallion Deep Dive, Multi-Tenant Workspace, Data Sharing & Federation, Migration Patterns |
| Data Engineering | Incremental Refresh & CDC, fabric-cicd CI/CD, Spark Runtime Migration, SQL Audit Logs |
| Resilience | Disaster Recovery & BCDR, Alerting & Data Activator |
Infrastructure (Bicep)
| Module | Purpose |
|---|---|
| fabric-warehouse.bicep | Fabric Warehouse configuration metadata |
| fabric-sql-database.bicep | Fabric SQL Database with DDM & CMK |
| fabric-pipeline.bicep | Data Factory Pipeline with scheduling |
| alerts-and-budgets.bicep | Capacity alerts & budget management |
| workspace-identity.bicep | Workspace Identity (GA 2026) |
See Feature Documentation and Best Practices for the complete guides.
Docker Support
Run the data generators and validation tools without installing any dependencies.
Available Services
| Service | Command | Description |
|---|---|---|
| data-generator | docker-compose run --rm data-generator | Generate full dataset (30 days) |
| demo-generator | docker-compose run --rm demo-generator | Quick demo dataset (7 days, smaller volumes) |
| streaming-generator | docker-compose up streaming-generator | Real-time streaming to Event Hub |
| data-validator | docker-compose run --rm data-validator | Validate generated data |
Common Commands
# Build the Docker image
docker-compose build
# Generate all data with custom parameters
docker-compose run --rm data-generator --all --days 14 --format parquet
# Generate specific data types
docker-compose run --rm data-generator --slots 50000 --players 1000
# Stream events to Azure Event Hub (requires configuration)
EVENTHUB_CONNECTION_STRING="your-connection-string" \
EVENTHUB_NAME="slot-telemetry" \
docker-compose up streaming-generator
# Run validation on generated data
docker-compose run --rm data-validator
Environment Variables
| Variable | Default | Description |
|---|---|---|
| DATA_FORMAT | parquet | Output format (parquet, csv, json) |
| DATA_DAYS | 30 | Days of historical data to generate |
| EVENTHUB_CONNECTION_STRING | - | Azure Event Hub connection string |
| EVENTHUB_NAME | slot-telemetry | Event Hub name for streaming |
| STREAMING_RATE | 10 | Events per second for streaming |
For detailed Docker documentation, see docker/README.md.
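As an illustration of how the variables in the table could be consumed, here is a minimal sketch that reads them with the documented defaults. The `GeneratorConfig` class and `load_config` function are hypothetical; how the repository's generators actually parse their environment is not shown in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class GeneratorConfig:
    data_format: str
    data_days: int
    eventhub_connection_string: Optional[str]
    eventhub_name: str
    streaming_rate: int


def load_config(env: Mapping[str, str] = os.environ) -> GeneratorConfig:
    """Read the documented variables, falling back to the documented defaults."""
    data_format = env.get("DATA_FORMAT", "parquet")
    if data_format not in {"parquet", "csv", "json"}:
        raise ValueError(f"Unsupported DATA_FORMAT: {data_format!r}")
    return GeneratorConfig(
        data_format=data_format,
        data_days=int(env.get("DATA_DAYS", "30")),
        # No default: streaming cannot work without a real connection string.
        eventhub_connection_string=env.get("EVENTHUB_CONNECTION_STRING"),
        eventhub_name=env.get("EVENTHUB_NAME", "slot-telemetry"),
        streaming_rate=int(env.get("STREAMING_RATE", "10")),
    )


# Pass an explicit mapping so the example does not depend on the real environment.
config = load_config({"DATA_DAYS": "14"})
print(config)
```

Validating `DATA_FORMAT` up front surfaces a typo before any data is generated rather than partway through a long run.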
Dev Container
The Dev Container provides a complete, pre-configured development environment with all necessary tools.
Included Tools
| Tool | Version | Purpose |
|---|---|---|
| Python | 3.11 | Data generation, notebooks |
| Azure CLI | Latest | Azure resource management |
| Bicep | Latest | Infrastructure as Code |
| Git | Latest | Version control |
| PowerShell | 7.x | Scripting |
| Docker CLI | Latest | Container management |
VS Code Extensions (Pre-installed)
- Azure Account
- Bicep
- Python
- Jupyter
- Docker
- GitHub Copilot (if licensed)
- Power BI (preview)
Features
- Automatic Python environment: Virtual environment created on container start
- Azure CLI authentication: Sign in once, stay authenticated
- Port forwarding: Automatic forwarding for Jupyter and other services
- GitHub Codespaces ready: Same experience in the cloud
Configuration Files
.devcontainer/
├── devcontainer.json    # Main configuration
├── Dockerfile           # Container image definition
└── post-create.sh       # Post-creation setup script
For customization options, see the Dev Containers documentation.
Power BI Reports
Pre-built Power BI report templates and semantic model definitions for quick deployment.
Available Reports
| Report | Description | Key Visuals |
|---|---|---|
| Casino Executive Dashboard | High-level KPIs and trends | Revenue trends, floor performance, player metrics |
| Slot Performance Analysis | Machine-level analytics | Hold percentage, utilization, jackpot frequency |
| Player 360 View | Customer analytics | Player segments, lifetime value, visit patterns |
| Compliance Monitoring | Regulatory reporting | CTR/SAR status, W-2G tracking, audit trails |
| Real-Time Floor Monitor | Live casino floor status | Machine status, alerts, occupancy |
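Hold percentage, listed among the Slot Performance visuals, is conventionally the share of wagered money that a machine retains. The sketch below works through that arithmetic. The `coin_in` and `coin_out` meter names and the sample totals are hypothetical, and the repository's own DAX measures may define the metric differently.

```python
def hold_percentage(coin_in: float, coin_out: float) -> float:
    """Return the percentage of wagers retained by the machine."""
    if coin_in <= 0:
        # No play recorded, so hold is undefined; this sketch reports 0.
        return 0.0
    return (coin_in - coin_out) / coin_in * 100.0


# Hypothetical daily meter totals per machine: (coin_in, coin_out)
meters = {
    "SLOT-001": (10_000.0, 9_200.0),
    "SLOT-002": (0.0, 0.0),
}
holds = {machine: round(hold_percentage(*totals), 2) for machine, totals in meters.items()}
print(holds)
```

For SLOT-001 the machine kept 800 of 10,000 wagered, an 8% hold.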
Report Locations
reports/
├── report-definitions/     # Power BI report definition files
│   ├── executive-dashboard/
│   ├── slot-performance/
│   └── player-360/
└── semantic-model/         # Direct Lake semantic model
    ├── tables/             # Table definitions
    └── measures/           # DAX measures
How to Import
- Connect to Fabric Workspace: Open Power BI Desktop and connect to your Fabric workspace
- Import Semantic Model: Use the definitions in reports/semantic-model/
- Import Reports: Open .pbip files from reports/report-definitions/
- Configure Data Source: Point to your Gold layer Lakehouse
For detailed instructions, see reports/README.md.
Cost Estimation
Understand Azure costs before deployment with our comprehensive cost guide.
Quick Reference
| Environment | Fabric SKU | Monthly Estimate | Notes |
|---|---|---|---|
| Development | F4 | $450 - $650 | 8 hrs/day weekdays |
| Staging | F16 | $1,800 - $2,500 | 12 hrs/day weekdays |
| Production POC | F64 | $9,500 - $12,500 | 24/7 operation |
| Production Pilot | F64 Reserved | $6,500 - $9,000 | 1-year reserved |
Cost Breakdown (Production POC)
| Component | Monthly Cost | % of Total |
|---|---|---|
| Fabric Capacity (F64) | ~$8,500 | 75-80% |
| ADLS Gen2 Storage | ~$500 | 4-5% |
| Microsoft Purview | ~$800 | 7-8% |
| Log Analytics | ~$300 | 2-3% |
| Key Vault | ~$10 | <1% |
| Networking | ~$200 | 1-2% |
Cost Optimization Tips
- Pause capacity during off-hours (saves up to 76%)
- Use reserved capacity for production (saves 25-30%)
- Implement storage lifecycle policies (move cold data to Cool tier)
- Set up Azure Cost Management alerts
Pro Tip: Enable auto-pause on dev/staging environments to automatically suspend compute during idle periods. This can reduce costs by up to 76% for non-production workloads.
For detailed cost scenarios and optimization strategies, see docs/COST_ESTIMATION.md.
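The "up to 76%" figure matches running capacity only 8 hours a day on weekdays, as in the Development row. A short worked sketch, assuming cost scales linearly with running hours (actual Fabric billing may not be strictly proportional); `pause_savings` is a hypothetical helper:

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours of always-on capacity


def pause_savings(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of always-on cost avoided when capacity runs only part of the week."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


dev = pause_savings(8, 5)       # 40 of 168 hours running
staging = pause_savings(12, 5)  # 60 of 168 hours running
print(f"dev: {dev:.0%}, staging: {staging:.0%}")
```

Under that assumption the development schedule avoids about 76% of the always-on cost and the staging schedule about 64%.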
Sample Data
Pre-generated sample datasets for quick exploration without running data generators.
Available Datasets
| Dataset | Records | Format | Size | Location |
|---|---|---|---|---|
| Slot Telemetry (7 days) | 10,000 | CSV/Parquet | ~10 MB | sample-data/bronze/ |
| Player Profiles | 500 | CSV/Parquet | ~1 MB | sample-data/bronze/ |
| Table Games | 2,000 | CSV/Parquet | ~2 MB | sample-data/bronze/ |
| Financial Transactions | 1,000 | CSV/Parquet | ~1 MB | sample-data/bronze/ |
Quick Exploration
# View sample data structure
ls sample-data/bronze/
# Load into Pandas (Python)
import pandas as pd
df = pd.read_parquet('sample-data/bronze/slot_telemetry_sample.parquet')
df.head()
# View schemas
ls sample-data/schemas/
Schema Definitions
Sample data includes matching schema definitions in sample-data/schemas/ that document:
- Column names and data types
- Business descriptions
- Valid value ranges
- PII handling requirements
Pro Tip: Sample data lets you explore and test notebooks without waiting for data generation. Use it to validate your environment setup before generating full datasets.
For generating larger custom datasets, see data_generation/README.md.
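To show how schema definitions of this kind (types, valid ranges, PII flags) could be applied to a record, here is a minimal sketch. The `SCHEMA` layout, column names, and both helper functions are hypothetical; the files in sample-data/schemas/ may use a different format.

```python
# Hypothetical schema layout, for illustration only.
SCHEMA = {
    "machine_id": {"type": str, "pii": False},
    "player_id": {"type": str, "pii": True},
    "coin_in": {"type": float, "min": 0.0, "max": 100_000.0},
}


def validate_record(record: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for column, rule in schema.items():
        if column not in record:
            problems.append(f"missing column: {column}")
            continue
        value = record[column]
        if not isinstance(value, rule["type"]):
            problems.append(f"{column}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            problems.append(f"{column}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{column}: above maximum {rule['max']}")
    return problems


def pii_columns(schema: dict) -> list:
    """List columns flagged as PII so they can be masked before sharing."""
    return [name for name, rule in schema.items() if rule.get("pii")]


good = {"machine_id": "SLOT-001", "player_id": "P-42", "coin_in": 12.5}
bad = {"machine_id": "SLOT-001", "coin_in": -1.0}
print(validate_record(good, SCHEMA))
print(validate_record(bad, SCHEMA))
print(pii_columns(SCHEMA))
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a row at once.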
Casino/Gaming Data Domains
<table> <tr> <th width="15%">Domain</th> <th width="40%">Description</th> <th width="25%">Key Entities</th> <th width="15%">Compliance</th> </tr> <tr> <td><b>Slot Machines</b></td> <td>Telemetry, meter readings, jackpot events, machine performance analytics</td> <td>Machines, Meters, Jackpots, Sessions</td> <td>NIGC MICS</td> </tr> <tr> <td><b>Table Games</b></td> <td>Hand results, chip tracking, table performance, dealer analytics</td> <td>Tables, Games, Hands, Chips</td> <td>NIGC MICS</td> </tr> <tr> <td><b>Player/Loyalty</b></td> <td>Player profiles, rewards programs, activity tracking, Player 360</td> <td>Players, Tiers, Points, Offers</td> <td>PCI-DSS, PII</td> </tr> <tr> <td><b>Financial/Cage</b></td> <td>Transactions, fills, credits, cash management, cage operations</td> <td>Transactions, Fills, Drops</td> <td>FinCEN BSA</td> </tr> <tr> <td><b>Security</b></td> <td>Surveillance integration, access control, incident tracking</td> <td>Events, Incidents, Access Logs</td> <td>State Regs</td> </tr> <tr> <td><b>Compliance</b></td> <td>CTR/SAR reporting, W-2G tax forms, regulatory filings</td> <td>CTRs, SARs, W-2Gs, Audits</td> <td>Federal/State</td> </tr> </table>
Repository Structure
Suppercharge_Microsoft_Fabric/
│
├── .devcontainer/                      # Dev Container configuration
│   ├── devcontainer.json               # VS Code/Codespaces config
│   └── Dockerfile                      # Container image definition
│
├── .vscode/                            # VS Code settings
│   ├── settings.json                   # Workspace settings
│   ├── extensions.json                 # Recommended extensions
│   └── launch.json                     # Debug configurations
│
├── docker/                             # Docker configurations
│   ├── entrypoint.sh                   # Container entrypoint
│   └── generate-all.sh                 # Data generation script
│
├── scripts/                            # Automation scripts
│   ├── deploy.ps1                      # Deployment automation
│   ├── generate-data.ps1               # Data generation wrapper
│   └── validate.ps1                    # Validation runner
│
├── infra/                              # Infrastructure as Code (Bicep)
│   ├── main.bicep                      # Root orchestration template
│   ├── modules/                        # Reusable Bicep modules
│   └── environments/                   # Environment-specific parameters
│       ├── dev/                        # Development configuration
│       ├── staging/                    # Staging configuration
│       └── prod/                       # Production configuration
│
├── docs/                               # Documentation
│   ├── ARCHITECTURE.md                 # Detailed architecture guide
│   ├── DEPLOYMENT.md                   # Deployment procedures
│   ├── SECURITY.md                     # Security & compliance guide
│   ├── PREREQUISITES.md                # Setup requirements
│   └── COST_ESTIMATION.md              # Azure cost planning
│
├── tutorials/                          # Step-by-step tutorials
│   ├── 00-environment-setup/           # Initial setup
│   ├── 01-bronze-layer/                # Bronze implementation
│   ├── 02-silver-layer/                # Silver transformations
│   ├── 03-gold-layer/                  # Gold aggregations
│   ├── 04-real-time-analytics/         # Streaming analytics
│   ├── 05-direct-lake-powerbi/         # Power BI integration
│   ├── 06-data-pipelines/              # Pipeline orchestration
│   ├── 07-governance-purview/          # Data governance
│   ├── 08-database-mirroring/          # SQL mirroring
│   ├── 09-advanced-ai-ml/              # Machine learning
│   ├── 10-teradata-migration/          # Teradata modernization
│   ├── 24-snowflake-to-fabric/         # Snowflake migration
│   ├── 25-ibm-db2-source/              # IBM DB2 connectivity
│   ├── 26-multi-source-streaming/      # CDC & IoT streaming
│   ├── 27-video-security-analytics/    # AI video pipeline
│   ├── 28-people-movement-analytics/   # Foot traffic analytics
│   ├── 29-geolocation-analytics/       # H3 & geofencing
│   ├── 30-tribal-healthcare/           # HIPAA-compliant IHS
│   └── 31-federal-dot-faa/             # FedRAMP aviation
│
├── sample-data/                        # Pre-generated sample data
│   ├── bronze/                         # Bronze layer samples
│   └── schemas/                        # Schema definitions
│
├── reports/                            # Power BI templates
│   ├── report-definitions/             # Report .pbip files
│   └── semantic-model/                 # Direct Lake model definitions
│       ├── tables/                     # Table definitions
│       └── measures/                   # DAX measures
│
├── poc-agenda/                         # 3-Day workshop materials
├── data_generation/                    # Synthetic data generators
├── notebooks/                          # Fabric-importable notebooks
├── validation/                         # Testing & data quality
│
├── Dockerfile                          # Data generator Docker image
├── docker-compose.yml                  # Multi-service orchestration
└── CHANGELOG.md                        # Version history
3-Day POC Agenda
A structured workshop to experience the full Microsoft Fabric platform:
| Day | Theme | Focus Areas | Key Deliverables |
|---|---|---|---|
| Day 1 | Foundation | Environment setup, Bronze & Silver layers | Working Lakehouse, data ingestion pipeline |
| Day 2 | Transformation | Gold layer, Real-time analytics | Business-ready datasets, streaming dashboard |
| Day 3 | Intelligence | Direct Lake, Power BI, Purview | Semantic model, reports, governance catalog |
Day 1: Medallion Foundation (8 hours)
- Morning: Environment provisioning, workspace setup
- Afternoon: Bronze layer implementation, batch ingestion
- Wrap-up: Silver layer transformations, data quality
Day 2: Transformations & Real-Time (8 hours)
- Morning: Gold layer aggregations, star schema
- Afternoon: Eventstreams, Eventhouse, KQL queries
- Wrap-up: Real-time dashboard prototyping
Day 3: BI & Governance (8 hours)
- Morning: Direct Lake semantic model creation
- Afternoon: Power BI reports, Purview integration
- Wrap-up: Review, Q&A, next steps
See POC Agenda for complete schedules and materials.
Tutorials
Learning Path
flowchart LR
subgraph L1["Level 1: Foundation"]
T00[00-Setup]
T01[01-Bronze]
end
subgraph L2["Level 2: Core"]
T02[02-Silver]
T03[03-Gold]
end
subgraph L3["Level 3: Advanced"]
T04[04-Real-Time]
T05[05-Direct Lake]
end
subgraph L4["Level 4: Enterprise"]
T06[06-Pipelines]
T07[07-Governance]
T08[08-Mirroring]
T09[09-AI/ML]
end
subgraph L5["Migration & Streaming"]
T10[10-Teradata]
T24[24-Snowflake]
T25[25-DB2]
T26[26-Streaming]
end
subgraph L6["Analytics & Expansions"]
T27[27-Video]
T28[28-Movement]
T29[29-Geo]
T30[30-Healthcare]
T31[31-DOT/FAA]
end
T00 --> T01 --> T02 --> T03 --> T04 --> T05
T05 --> T06 --> T07 --> T08 --> T09
T09 --> T10 --> T24 --> T25 --> T26
T26 --> T27 --> T28 --> T29 --> T30 --> T31
<table>
<tr>
<th>Level</th>
<th>Tutorial</th>
<th>Description</th>
<th>Duration</th>
</tr>
<tr>
<td rowspan="2"><b>Foundation</b><br/><sub>Start here</sub></td>
<td><a href="tutorials/00-environment-setup/README.md"><b>00 - Environment Setup</b></a></td>
<td>Azure & Fabric workspace provisioning</td>
<td><code>~1 hour</code></td>
</tr>
<tr>
<td><a href="tutorials/01-bronze-layer/README.md"><b>01 - Bronze Layer</b></a></td>
<td>Raw data ingestion patterns</td>
<td><code>~2 hours</code></td>
</tr>
<tr>
<td rowspan="2"><b>Core</b><br/><sub>Essential skills</sub></td>
<td><a href="tutorials/02-silver-layer/README.md"><b>02 - Silver Layer</b></a></td>
<td>Data cleansing & validation</td>
<td><code>~2 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/03-gold-layer/README.md"><b>03 - Gold Layer</b></a></td>
<td>Business aggregations & KPIs</td>
<td><code>~2 hours</code></td>
</tr>
<tr>
<td rowspan="2"><b>Advanced</b><br/><sub>Real-time & BI</sub></td>
<td><a href="tutorials/04-real-time-analytics/README.md"><b>04 - Real-Time Analytics</b></a></td>
<td>Eventstreams & Eventhouse</td>
<td><code>~3 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/05-direct-lake-powerbi/README.md"><b>05 - Direct Lake & Power BI</b></a></td>
<td>Semantic models & reports</td>
<td><code>~2 hours</code></td>
</tr>
<tr>
<td rowspan="4"><b>Enterprise</b><br/><sub>Production-ready</sub></td>
<td><a href="tutorials/06-data-pipelines/README.md"><b>06 - Data Pipelines</b></a></td>
<td>Orchestration & scheduling</td>
<td><code>~2 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/07-governance-purview/README.md"><b>07 - Governance & Purview</b></a></td>
<td>Data catalog & lineage</td>
<td><code>~2 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/08-database-mirroring/README.md"><b>08 - Database Mirroring</b></a></td>
<td>SQL Server replication</td>
<td><code>~1 hour</code></td>
</tr>
<tr>
<td><a href="tutorials/09-advanced-ai-ml/README.md"><b>09 - Advanced AI/ML</b></a></td>
<td>Machine learning integration</td>
<td><code>~3 hours</code></td>
</tr>
<tr>
<td rowspan="4"><b>Migration</b><br/><sub>Platform migration</sub></td>
<td><a href="tutorials/10-teradata-migration/README.md"><b>10 - Teradata Migration</b></a></td>
<td>Teradata to Fabric modernization</td>
<td><code>~3 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/24-snowflake-to-fabric/README.md"><b>24 - Snowflake to Fabric</b></a></td>
<td>Snowflake migration & cost comparison</td>
<td><code>~3 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/25-ibm-db2-source/README.md"><b>25 - IBM DB2 Source</b></a></td>
<td>DB2 connectivity & CDC patterns</td>
<td><code>~3 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/26-multi-source-streaming/README.md"><b>26 - Multi-Source Streaming</b></a></td>
<td>8 CDC & IoT streaming connectors</td>
<td><code>~3 hours</code></td>
</tr>
<tr>
<td rowspan="5"><b>Analytics & Expansions</b><br/><sub>Industry verticals</sub></td>
<td><a href="tutorials/27-video-security-analytics/README.md"><b>27 - Video Security</b></a></td>
<td>AI video pipeline & edge processing</td>
<td><code>~2.5 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/28-people-movement-analytics/README.md"><b>28 - People Movement</b></a></td>
<td>Foot traffic & queue detection</td>
<td><code>~2 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/29-geolocation-analytics/README.md"><b>29 - Geolocation Analytics</b></a></td>
<td>H3 indexing & geofencing</td>
<td><code>~2.5 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/30-tribal-healthcare/README.md"><b>30 - Tribal Healthcare</b></a></td>
<td>HIPAA-compliant IHS analytics</td>
<td><code>~3 hours</code></td>
</tr>
<tr>
<td><a href="tutorials/31-federal-dot-faa/README.md"><b>31 - Federal DOT/FAA</b></a></td>
<td>FedRAMP aviation analytics</td>
<td><code>~2.5 hours</code></td>
</tr>
</table>
Documentation Site
This repository includes a full MkDocs Material documentation site with search, dark mode, and comprehensive navigation.
Local Preview
# Install documentation dependencies
pip install -r requirements-docs.txt
# Start local documentation server
mkdocs serve
Then open http://127.0.0.1:8000 in your browser.
Quick References
| Resource | Description |
|---|---|
| 5-Minute Quick Start | Fastest path to generating data and exploring the POC |
| Cheat Sheet | Commands, shortcuts, and quick reference for all components |
Build Documentation
# Build static site
mkdocs build
# Deploy to GitHub Pages
mkdocs gh-deploy
Live Site: Coming soon via GitHub Pages
Compliance Frameworks
This POC addresses regulatory requirements across gaming jurisdictions:
<table> <tr> <td align="center" width="25%"> <h3>NIGC MICS</h3> <sub>Minimum Internal Control Standards</sub><br/> <sub>Gaming machine & table game controls</sub> </td> <td align="center" width="25%"> <h3>FinCEN BSA</h3> <sub>Bank Secrecy Act</sub><br/> <sub>CTR/SAR reporting thresholds</sub> </td> <td align="center" width="25%"> <h3>PCI-DSS</h3> <sub>Payment Card Industry</sub><br/> <sub>Card data security standards</sub> </td> <td align="center" width="25%"> <h3>State Gaming</h3> <sub>Jurisdiction Requirements</sub><br/> <sub>State-specific regulations</sub> </td> </tr> </table>
[!TIP] Phase 7 also addresses HIPAA (Tribal Healthcare), FedRAMP (DOT/FAA), 42 CFR Part 2 (Behavioral Health), and FISMA/NIST 800-53 compliance requirements.
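CTR reporting depends on aggregating a patron's cash activity over a gaming day. The sketch below flags player-days whose cash-in total exceeds a threshold. The $10,000 value reflects the commonly cited Bank Secrecy Act currency-transaction threshold and is hard-coded here as an assumption to verify against current FinCEN guidance; the record layout and the `ctr_candidates` function are hypothetical and not taken from the repository.

```python
from collections import defaultdict
from decimal import Decimal

# Assumption: confirm the threshold and aggregation rules with current FinCEN guidance.
CTR_THRESHOLD = Decimal("10000")


def ctr_candidates(transactions):
    """Sum cash-in per (player, gaming day) and return totals above the threshold."""
    totals = defaultdict(Decimal)
    for tx in transactions:
        if tx["direction"] == "cash_in":
            totals[(tx["player_id"], tx["gaming_day"])] += Decimal(tx["amount"])
    return {key: total for key, total in totals.items() if total > CTR_THRESHOLD}


transactions = [
    {"player_id": "P-1", "gaming_day": "2025-01-15", "direction": "cash_in", "amount": "6000"},
    {"player_id": "P-1", "gaming_day": "2025-01-15", "direction": "cash_in", "amount": "4500"},
    {"player_id": "P-2", "gaming_day": "2025-01-15", "direction": "cash_in", "amount": "10000"},
    {"player_id": "P-1", "gaming_day": "2025-01-15", "direction": "cash_out", "amount": "9000"},
]
flagged = ctr_candidates(transactions)
print(flagged)
```

Amounts are kept as `Decimal` so that totals near the threshold are not distorted by floating-point rounding. In this sample only P-1 is flagged: two cash-in transactions total 10,500, while P-2 sits exactly at the threshold and is not over it.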
Completed Expansions
Phase 7 delivered industry expansions beyond the core Casino/Gaming POC:
| Expansion | Compliance | Key Capabilities | Tutorial |
|---|---|---|---|
| USDA | NASS, FSIS | Crop production, food safety recalls | Tutorial 32 |
| SBA | PPP, 7(a) | Loan analytics, 20 NAICS codes | Tutorial 33 |
| NOAA | CDO API | Weather observations, storm events | Tutorial 34 |
| EPA | AirNow, TRI | Air quality (AQI), water quality (MCL) | Tutorial 35 |
| DOI | USGS, BLM | Earthquakes, land use management | Tutorial 36 |
| Tribal Healthcare | HIPAA, 42 CFR | IHS encounters, PHI masking, FHIR | Tutorial 30 |
| DOT/FAA | FedRAMP, FISMA | Flight ops, safety, carrier analytics | Tutorial 31 |
| DOJ | FBI NIBRS, USSC | Crime stats, sentencing, antitrust, DEA | Tutorial 38 |
| Video Analytics | - | YOLO/DeepSORT, 50 cameras, 8 event types | Tutorial 27 |
| People Movement | - | 30 zones, queue detection, heat maps | Tutorial 28 |
| Geolocation | - | H3 indexing, geofencing, proximity triggers | Tutorial 29 |
Contributing
We welcome contributions! Please read our Contributing Guide before submitting pull requests.
<table> <tr> <td>Ways to Contribute:
- Report bugs and issues
- Suggest new features
- Improve documentation
- Submit pull requests
Get Started:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.