Agent Skill
2/7/2026

healthsim-populationsim

PopulationSim provides population-level intelligence using public data sources. Use this skill for ANY request involving: (1) population demographics or profiles, (2) geographic health patterns or disparities, (3) social determinants of health (SDOH), (4) SVI or ADI analysis, (5) cohort definition or specification, (6) clinical trial feasibility, site selection, or enrollment projection, (7) service area analysis, (8) health equity assessment, (9) census data or ACS variables, (10) CDC PLACES health indicators.

mark64oswald
npx skills add mark64oswald/healthsim-workspace

SKILL.md


name: populationsim
description: >
  PopulationSim provides population-level intelligence using public data sources. Use this skill for ANY request involving: (1) population demographics or profiles, (2) geographic health patterns or disparities, (3) social determinants of health (SDOH), (4) SVI or ADI analysis, (5) cohort definition or specification, (6) clinical trial feasibility, site selection, or enrollment projection, (7) service area analysis, (8) health equity assessment, (9) census data or ACS variables, (10) CDC PLACES health indicators.

PopulationSim - Population Intelligence & Cohort Generation

Overview

PopulationSim provides population-level intelligence using public data (Census ACS, CDC PLACES, SVI, ADI) for:

  1. Standalone Analysis: Geographic profiling, health disparities, population comparisons
  2. Cross-Product Integration: Cohort specs driving generation in PatientSim, MemberSim, RxMemberSim, TrialSim

Key Differentiator: PopulationSim analyzes real population characteristics and creates specifications — it does not generate synthetic records itself.

Quick Reference

| I want to... | Use This Skill | Key Triggers |
|---|---|---|
| **Data Access (v2.0)** | | |
| Look up exact data values | `data-access/data-lookup.md` | "what is the exact", "look up", "from PLACES" |
| Resolve FIPS codes | `data-access/geography-lookup.md` | "FIPS for", "which county is", "list counties in MSA" |
| Aggregate geographic data | `data-access/data-aggregation.md` | "aggregate tracts", "metro total", "combine counties" |
| **Geographic Intelligence** | | |
| Profile a county or region | `geographic/county-profile.md` | "county profile", "demographics for", "health indicators" |
| Analyze census tracts | `geographic/census-tract-analysis.md` | "tract level", "granular", "hotspots" |
| Profile a metro area | `geographic/metro-area-profile.md` | "metro", "MSA", "metropolitan" |
| Define custom region | `geographic/custom-region-builder.md` | "service area", "combine", "custom region" |
| **Health Patterns** | | |
| Analyze disease prevalence | `health-patterns/chronic-disease-prevalence.md` | "diabetes rate", "prevalence", "CDC PLACES" |
| Analyze health behaviors | `health-patterns/health-behavior-patterns.md` | "smoking rate", "obesity", "physical activity" |
| Assess healthcare access | `health-patterns/healthcare-access-analysis.md` | "uninsured", "provider ratio", "access" |
| Identify health disparities | `health-patterns/health-outcome-disparities.md` | "disparities", "equity", "by race" |
| **SDOH Analysis** | | |
| Analyze SVI | `sdoh/svi-analysis.md` | "SVI", "social vulnerability", "vulnerable" |
| Analyze ADI | `sdoh/adi-analysis.md` | "ADI", "area deprivation", "deprived" |
| Analyze economics | `sdoh/economic-indicators.md` | "poverty", "income", "unemployment" |
| Analyze community factors | `sdoh/community-factors.md` | "housing", "transportation", "food access" |
| **Cohort Definition** | | |
| Define a cohort | `cohorts/cohort-specification.md` | "define cohort", "cohort spec", "population segment" |
| Build demographics | `cohorts/demographic-distribution.md` | "age distribution", "demographics for cohort" |
| Build clinical profile | `cohorts/clinical-prevalence-profile.md` | "comorbidity rates", "clinical profile" |
| Build SDOH profile | `cohorts/sdoh-profile-builder.md` | "SDOH profile", "Z-code rates" |
| **Trial Support** | | |
| Estimate trial feasibility | `trial-support/feasibility-estimation.md` | "feasibility", "eligible population" |
| Select trial sites | `trial-support/site-selection-support.md` | "site selection", "best locations" |
| Project enrollment | `trial-support/enrollment-projection.md` | "enrollment timeline", "recruitment rate" |
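
The trigger-to-skill mapping in the Quick Reference can be pictured as a keyword lookup. The sketch below is only an illustration of that idea, not the skill's actual dispatch logic: the `route` function, the `TRIGGERS` dictionary, and the whole-word matching rule are all assumptions, and only four rows are included.

```python
import re

# Hypothetical sketch: a few trigger phrases from the Quick Reference,
# keyed by the skill file they point to. Not the real routing mechanism.
TRIGGERS = {
    "data-access/geography-lookup.md": ["fips for", "which county is", "list counties in msa"],
    "geographic/county-profile.md": ["county profile", "demographics for", "health indicators"],
    "sdoh/svi-analysis.md": ["svi", "social vulnerability", "vulnerable"],
    "trial-support/feasibility-estimation.md": ["feasibility", "eligible population"],
}


def route(request: str) -> list[str]:
    """Return every skill file with a trigger phrase that appears as whole words."""
    text = request.lower()
    return [
        skill
        for skill, phrases in TRIGGERS.items()
        if any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases)
    ]


print(route("Give me a county profile for San Diego"))
print(route("What is the FIPS for Harris County?"))
```

Whole-word matching is used here because short triggers such as "SVI" would otherwise match inside unrelated words (for example "Louisville").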

Safety Guardrails

All Generated Data is Synthetic

PopulationSim never outputs real patient records. Its profiles and cohort specs are derived from aggregated public statistics, and any records that downstream products generate from them are synthetic, fictional, and simulated. None of this output may be treated as real patient data.

Do NOT:

  • Present synthetic data as actual patient records
  • Make clinical recommendations (e.g., "you should prescribe," "the patient needs") based on generated data
  • Pull from or reference real patient databases — use only public reference data

Do:

  • Remind users that all generated data is synthetic test data
  • Use real, valid codes from standard medical code systems: ICD-10 (diagnoses), CPT/HCPCS (procedures), LOINC (labs), SNOMED (clinical terms), RxNorm/NDC (medications), NPI (providers)
  • Use real public reference data (Census ACS, CDC PLACES, SVI, ADI) for population characteristics

Negative Examples — What PopulationSim Must NOT Do

| Scenario | Wrong Response | Correct Response |
|---|---|---|
| "What should this patient take?" | "I recommend starting them on metformin" | "This is synthetic test data; PopulationSim does not provide clinical recommendations." |
| "Generate individual patient records" | Emit named patient rows | Route to PatientSim: PopulationSim produces population-level profiles and cohort specs, not individual records |
| "Show me real patient data from the database" | Pull from a patient database | "All PopulationSim data is synthetic. Real reference data (Census, PLACES) is population-level only." |
| Population prevalence as individual risk | "This patient has a 28% chance of obesity" | "The county obesity prevalence is 28.0% (CDC PLACES 2024)"; population rates are not individual probabilities |

Edge Cases

  • Missing FIPS: Validate inputs; return a clear error if a FIPS code is not found in the crosswalk files
  • Partial data: Some tracts lack PLACES or SVI coverage; flag these gaps rather than imputing zeros
  • Invalid codes: Emit only codes that exist in recognized systems (ICD-10, CPT, LOINC, RxNorm, NDC)
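
The first two edge cases can be sketched as follows. This is a minimal illustration under stated assumptions: `COUNTY_CROSSWALK` is a hypothetical in-memory stand-in for the crosswalk files, the tract IDs in the usage lines are made up, and the mean is unweighted (a real aggregation would likely weight by population).

```python
import re
from typing import Optional

# Hypothetical stand-in for the crosswalk files (two real county FIPS codes).
COUNTY_CROSSWALK = {"06073": "San Diego County, CA", "48201": "Harris County, TX"}


def resolve_county(fips: str) -> str:
    """Validate a 5-digit county FIPS code and look it up in the crosswalk."""
    if not re.fullmatch(r"\d{5}", fips):
        raise ValueError(f"Invalid county FIPS {fips!r}: expected exactly 5 digits")
    if fips not in COUNTY_CROSSWALK:
        raise KeyError(f"County FIPS {fips} not found in crosswalk")
    return COUNTY_CROSSWALK[fips]


def mean_with_gaps(values: dict[str, Optional[float]]) -> tuple[Optional[float], list[str]]:
    """Average the available values and list the tracts that had no data.

    Missing values stay missing; they are never counted as 0.0.
    """
    present = [v for v in values.values() if v is not None]
    gaps = [tract for tract, v in values.items() if v is None]
    mean = sum(present) / len(present) if present else None
    return mean, gaps


print(resolve_county("06073"))
# Made-up tract IDs and rates, for illustration only.
print(mean_with_gaps({"06073000100": 0.10, "06073000201": None, "06073000202": 0.12}))
```

Returning the list of gaps alongside the mean lets the caller report which tracts were excluded instead of silently shrinking the denominator.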

Output Types

PopulationProfile

Geographic entity with demographics, health indicators, and SDOH indices:

{
  "geography": {
    "type": "county",
    "fips": "06073",
    "name": "San Diego County",
    "state": "CA",
    "region": "Pacific"
  },
  "demographics": {
    "total_population": 3286069,
    "median_age": 37.1,
    "age_distribution": {
      "0-17": 0.21,
      "18-64": 0.62,
      "65+": 0.17
    },
    "race_ethnicity": {
      "white_nh": 0.43,
      "hispanic": 0.34,
      "asian": 0.12,
      "black": 0.05,
      "other": 0.06
    },
    "median_household_income": 102285,
    "poverty_rate": 0.103
  },
  "health_indicators": {
    "source": "CDC_PLACES_2024",
    "diabetes_prevalence": 0.095,
    "obesity_prevalence": 0.280,
    "hypertension_prevalence": 0.285,
    "depression_prevalence": 0.195,
    "smoking_prevalence": 0.098
  },
  "sdoh_indices": {
    "svi_overall": 0.42,
    "svi_themes": {
      "socioeconomic": 0.38,
      "household_composition": 0.45,
      "minority_language": 0.52,
      "housing_transportation": 0.35
    },
    "adi_national_rank": 35
  },
  "healthcare_access": {
    "uninsured_rate": 0.071,
    "pcp_per_100k": 82.4,
    "insurance_mix": {
      "employer": 0.52,
      "medicare": 0.15,
      "medicaid": 0.18,
      "individual": 0.08,
      "uninsured": 0.07
    }
  }
}
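
As a rough illustration of how a consumer might use a PopulationProfile, the sketch below converts a prevalence rate into an estimated adult count. It assumes the prevalence applies to adults aged 18 and over (as most PLACES measures do) and that the adult share can be taken from the `age_distribution` brackets; the `estimated_adults` helper is hypothetical and not part of the skill.

```python
import json

# Trimmed copy of the PopulationProfile example above.
profile = json.loads("""
{
  "demographics": {
    "total_population": 3286069,
    "age_distribution": {"0-17": 0.21, "18-64": 0.62, "65+": 0.17}
  },
  "health_indicators": {
    "diabetes_prevalence": 0.095,
    "obesity_prevalence": 0.280
  }
}
""")


def estimated_adults(profile: dict, indicator: str) -> int:
    """Estimate how many adults (18+) a prevalence rate corresponds to."""
    demo = profile["demographics"]
    # Sum every bracket except children to get the adult share.
    adult_share = sum(
        share for bracket, share in demo["age_distribution"].items() if bracket != "0-17"
    )
    adults = demo["total_population"] * adult_share
    return round(adults * profile["health_indicators"][indicator])


print(estimated_adults(profile, "diabetes_prevalence"))
```

The result is a population-level estimate only; it says nothing about any individual's risk.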

CohortSpecification

Generation input for other HealthSim products:

{
  "cohort_id": "houston_diabetics_2024",
  "name": "Houston Metro Diabetic Adults",
  "target_size": 10000,
  "geography": {
    "type": "msa",
    "cbsa_code": "26420",
    "name": "Houston-The Woodlands-Sugar Land, TX"
  },
  "demographics": {
    "age": {
      "min": 18, "max": 85, "mean": 58.4,
      "brackets": { "18-44": 0.18, "45-64": 0.42, "65-74": 0.28, "75+": 0.12 }
    },
    "sex": { "male": 0.48, "female": 0.52 },
    "race_ethnicity": { "white_nh": 0.28, "black": 0.22, "hispanic": 0.38, "asian": 0.08 }
  },
  "clinical_profile": {
    "primary_condition": { "icd10": "E11", "name": "Type 2 Diabetes" },
    "comorbidities": {
      "I10": { "name": "Hypertension", "rate": 0.71 },
      "E78": { "name": "Hyperlipidemia", "rate": 0.68 },
      "E66": { "name": "Obesity", "rate": 0.62 }
    }
  },
  "sdoh_profile": {
    "poverty_rate": 0.18,
    "uninsured_rate": 0.16,
    "food_insecurity": 0.15,
    "svi_mean": 0.58
  },
  "z_code_rates": {
    "Z59.6": { "name": "Low income", "rate": 0.18 },
    "Z59.41": { "name": "Food insecurity", "rate": 0.15 }
  },
  "insurance_mix": {
    "medicare": 0.38, "medicaid": 0.22, "commercial": 0.32, "uninsured": 0.08
  }
}
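
Before a CohortSpecification is handed to a generator, it may be worth checking that each share distribution sums to 1. The sketch below is one hypothetical way to do that (the `unallocated_share` helper is not part of the skill). Run against the example above, it reports that the `race_ethnicity` categories sum to 0.96, so a consumer would need to decide how to treat the remaining 0.04; the example does not say.

```python
import math

# Distributions copied from the CohortSpecification example above.
cohort = {
    "age_brackets": {"18-44": 0.18, "45-64": 0.42, "65-74": 0.28, "75+": 0.12},
    "sex": {"male": 0.48, "female": 0.52},
    "race_ethnicity": {"white_nh": 0.28, "black": 0.22, "hispanic": 0.38, "asian": 0.08},
    "insurance_mix": {"medicare": 0.38, "medicaid": 0.22, "commercial": 0.32, "uninsured": 0.08},
}


def unallocated_share(distribution: dict[str, float]) -> float:
    """Return the share not covered by the listed categories (0.0 if they sum to 1)."""
    remainder = 1.0 - sum(distribution.values())
    # Treat floating-point noise as an exact match.
    return 0.0 if math.isclose(remainder, 0.0, abs_tol=1e-9) else round(remainder, 4)


for name, distribution in cohort.items():
    print(name, unallocated_share(distribution))
```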

Cross-Product Integration

Integration Flow

                    ┌─────────────────────┐
                    │   PopulationSim     │
                    │  CohortSpecification│
                    └──────────┬──────────┘
                               │
           ┌───────────────────┼───────────────────┐
           │                   │                   │
           ▼                   ▼                   ▼
    ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
    │ PatientSim  │     │ MemberSim   │     │  TrialSim   │
    │ - patients  │     │ - members   │     │ - subjects  │
    │ - diagnoses │     │ - claims    │     │ - diversity │
    │ - SDOH codes│     │ - plans     │     │ - sites     │
    └──────┬──────┘     └──────┬──────┘     └─────────────┘
           │                   │
           └─────────┬─────────┘
                     ▼
              ┌─────────────┐
              │ RxMemberSim │
              │ - Rx claims │
              │ - adherence │
              └─────────────┘

Integration Patterns

| Output | Receiver | Result |
|---|---|---|
| CohortSpecification | PatientSim | Patients matching demographic/clinical profile |
| CohortSpecification | MemberSim | Members with realistic plan/utilization mix |
| CohortSpecification | TrialSim | Diverse trial subjects meeting FDA guidance |
| PopulationProfile | NetworkSim | Service area provider network design |

Data Sources

Reference data with nationwide US coverage, accessible via healthsim MCP tools (individual tracts may still lack PLACES or SVI values; see Edge Cases):

| Source | Table | Records | Data Year |
|---|---|---|---|
| CDC PLACES (County) | `population.places_county` (via `healthsim_query_reference`) | 3,143 | 2022 BRFSS |
| CDC PLACES (Tract) | `population.places_tract` (via `healthsim_query_reference`) | 83,522 | 2022 BRFSS |
| SVI (County) | `population.svi_county` (via `healthsim_query_reference`) | 3,144 | 2018-2022 ACS |
| SVI (Tract) | `population.svi_tract` (via `healthsim_query_reference`) | 84,120 | 2018-2022 ACS |
| ADI (Block Group) | `population.adi_blockgroup` (via `healthsim_query_reference`) | 242,336 | 2019-2023 ACS |
| Geography Crosswalks | geography crosswalks (via `healthsim_query`) | Various | 2023 Census |

DuckDB Reference Tables

Reference data also available in DuckDB:

| Table | Source | Purpose |
|---|---|---|
| `population.places_tract` | CDC PLACES | Tract-level health indicators |
| `population.places_county` | CDC PLACES | County-level health indicators |
| `population.svi_tract` | CDC SVI | Tract-level vulnerability |
| `population.svi_county` | CDC SVI | County-level vulnerability |
| `population.adi_blockgroup` | ADI | Block group deprivation |
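
A join across two of these tables might look like the query below. Treat it as a sketch under heavy assumptions: the column names (`county_fips`, `diabetes_pct`, `svi_overall`) are hypothetical because the actual schemas are not shown here, and Python's built-in `sqlite3` stands in for DuckDB purely so the example runs without extra dependencies. Against the real tables, only the SQL text would carry over, with column names adjusted.

```python
import sqlite3

# Stand-in database: sqlite3 is used only so the sketch runs without DuckDB.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS population")
# Hypothetical columns; the real table schemas may differ.
conn.execute("CREATE TABLE population.places_county (county_fips TEXT, diabetes_pct REAL)")
conn.execute("CREATE TABLE population.svi_county (county_fips TEXT, svi_overall REAL)")
# Illustrative row using the San Diego County values from this document's example.
conn.execute("INSERT INTO population.places_county VALUES ('06073', 9.5)")
conn.execute("INSERT INTO population.svi_county VALUES ('06073', 0.42)")

QUERY = """
SELECT p.county_fips, p.diabetes_pct, s.svi_overall
FROM population.places_county AS p
JOIN population.svi_county AS s ON p.county_fips = s.county_fips
WHERE p.county_fips = ?
"""

rows = conn.execute(QUERY, ("06073",)).fetchall()
print(rows)
```

Passing the FIPS code as a bound parameter rather than formatting it into the string keeps leading zeros intact and avoids quoting problems.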

See Data Architecture for details.

Directory Structure

skills/populationsim/
├── SKILL.md                           # This file - master router
├── README.md                          # Product overview
├── population-intelligence-domain.md  # Core domain knowledge
│
├── data/                              # Embedded Data Package (v2.0)
│   ├── README.md                      # Data dictionary
│   ├── county/                        # County-level files
│   ├── tract/                         # Tract-level files
│   ├── block_group/                   # Block group files (ADI)
│   └── crosswalks/                    # FIPS and CBSA mappings
│
├── data-access/                       # Data Access Skills (v2.0)
│   ├── README.md                      # Category overview
│   ├── data-lookup.md                 # Direct value lookups
│   ├── geography-lookup.md            # FIPS code resolution
│   └── data-aggregation.md            # Geographic aggregation
│
├── geographic/                        # Geographic Intelligence
│   ├── README.md                      # Category overview
│   ├── county-profile.md              # County-level profiles
│   ├── census-tract-analysis.md       # Tract-level analysis
│   ├── metro-area-profile.md          # MSA/CBSA profiles
│   └── custom-region-builder.md       # Custom region aggregation
│
├── health-patterns/                   # Health Analysis
│   ├── README.md                      # Category overview
│   ├── chronic-disease-prevalence.md  # Disease burden analysis
│   ├── health-behavior-patterns.md    # Risk factor analysis
│   ├── healthcare-access-analysis.md  # Coverage and access
│   └── health-outcome-disparities.md  # Disparity analysis
│
├── sdoh/                              # Social Determinants
│   ├── README.md                      # SDOH framework overview
│   ├── svi-analysis.md                # Social Vulnerability Index
│   ├── adi-analysis.md                # Area Deprivation Index
│   ├── economic-indicators.md         # Income, poverty, employment
│   └── community-factors.md           # Housing, transportation, food
│
├── cohorts/                           # Cohort Definition
│   ├── README.md                      # CohortSpecification schema
│   ├── cohort-specification.md        # Complete cohort definition
│   ├── demographic-distribution.md    # Age, sex, race distributions
│   ├── clinical-prevalence-profile.md # Comorbidity patterns
│   └── sdoh-profile-builder.md        # SDOH and Z-code rates
│
└── trial-support/                     # Clinical Trial Support
    ├── README.md                      # Trial integration overview
    ├── feasibility-estimation.md      # Eligible population modeling
    ├── site-selection-support.md      # Site network optimization
    └── enrollment-projection.md       # Timeline projections

Quick Examples

Example 1: County Health Profile

Request: "Profile San Diego County health indicators"

Response:

San Diego County, CA (FIPS 06073) Population Profile:

Demographics (ACS 2022 5-year):

  • Total Population: 3,286,069
  • Median Age: 37.1 years
  • Race/Ethnicity: 43% White NH, 34% Hispanic, 12% Asian, 5% Black, 6% Other

Health Indicators (CDC PLACES 2024):

  • Diabetes: 9.5% (vs 10.1% national)
  • Obesity: 28.0% (vs 32.1% national)
  • Hypertension: 28.5% (vs 32.4% national)

SDOH Indices:

  • SVI Overall: 0.42 (moderate vulnerability)
  • ADI National Rank: 35th percentile

Example 2: Disparities by SVI Quartile

Request: "How do health outcomes vary across SVI quartiles?"

Response: Stratification by SVI quartile shows clear outcome variation:

| SVI Quartile | Diabetes | Obesity | Uninsured |
|---|---|---|---|
| Q1 (lowest vulnerability) | 8.2% | 26% | 5.1% |
| Q4 (highest vulnerability) | 13.8% | 38% | 14.2% |

Disparity: Relative to Q1, Q4 tracts show about 1.7x the diabetes prevalence (13.8% vs 8.2%), 1.5x the obesity prevalence (38% vs 26%), and 2.8x the uninsured rate (14.2% vs 5.1%). See health-patterns/health-outcome-disparities.md.
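
The Q4-versus-Q1 comparison can be expressed as a ratio (relative disparity) and a gap in percentage points (absolute disparity). The sketch below computes both from the quartile values in this example; the `disparity` helper is hypothetical, and which of the two measures to report is a choice the example leaves open.

```python
# Values copied from the SVI quartile table above (percent).
Q1 = {"diabetes": 8.2, "obesity": 26.0, "uninsured": 5.1}
Q4 = {"diabetes": 13.8, "obesity": 38.0, "uninsured": 14.2}


def disparity(low: dict[str, float], high: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Return (ratio, gap in percentage points) of high vs low for each indicator."""
    return {
        name: (round(high[name] / low[name], 2), round(high[name] - low[name], 1))
        for name in low
    }


for name, (ratio, gap) in disparity(Q1, Q4).items():
    print(f"{name}: {ratio:.2f}x, +{gap:.1f} pp")
```

The two measures can rank indicators differently: here the uninsured rate has the largest ratio while obesity has the largest absolute gap.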

Example 3: Cohort for PatientSim

Request: "Define a cohort of high-risk diabetics in underserved California"

Response: Creates CohortSpecification with:

  • Geography: CA census tracts with SVI ≥ 0.70
  • Demographics: Hispanic 58%, mean age 58.4
  • Comorbidities: HTN 71%, obesity 62%
  • SDOH: Rx cost barrier 31%, food insecurity 22%

Example 4: Trial Feasibility

Request: "Feasibility for T2DM trial: age 40-70, HbA1c 8-11%"

Response:

| Stage | Population | Conversion |
|---|---|---|
| T2DM Prevalent | 34.2M | - |
| Age 40-70 | 24.8M | 72.5% |
| HbA1c 8-11% | 7.4M | 29.8% |
| After exclusions | 4.2M | - |

Top Metros: Houston (128K), Miami (115K), Los Angeles (108K)
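
The Conversion column is each stage's population divided by the previous stage's. The sketch below recomputes it from the figures in this example; `step_conversions` is a hypothetical helper. It also yields a rate for the "After exclusions" row, which the table leaves blank, so treat that last figure as derived arithmetic rather than something the example reports.

```python
# Funnel stages copied from the table above (population in millions).
funnel = [
    ("T2DM Prevalent", 34.2),
    ("Age 40-70", 24.8),
    ("HbA1c 8-11%", 7.4),
    ("After exclusions", 4.2),
]


def step_conversions(stages: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Return (stage, percent of the previous stage retained) for each step."""
    return [
        (name, round(100 * size / prev_size, 1))
        for (_, prev_size), (name, size) in zip(stages, stages[1:])
    ]


for name, pct in step_conversions(funnel):
    print(f"{name}: {pct}% of previous stage")

# Share of the starting population that remains at the end of the funnel.
overall = round(100 * funnel[-1][1] / funnel[0][1], 1)
print(f"Overall: {overall}%")
```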

Related Products

Domain Knowledge

See Population Intelligence Domain for geographic hierarchy, census data, and SDOH frameworks.


Generative Framework Integration

PopulationSim feeds the Generative Framework via CohortSpecifications that drive synthetic generation in PatientSim, MemberSim, and TrialSim.
