Agent Skill
2/7/2026

lifesciences-graph-builder

Orchestrates life sciences APIs to build knowledge graphs using the Fuzzy-to-Fact protocol, combining MCPs for nodes and curl for edges, then persisting to Graphiti. This skill should be used when the user asks to "build knowledge graphs", "find biological connections", "explore drug repurposing", "validate drug targets", or mentions traversing gene→protein→pathway→drug→disease paths, multi-API orchestration, or graph persistence workflows.

donbr
npx skills add donbr/lifesciences-research

SKILL.md


Life Sciences Graph Builder

Orchestrate multi-API graph construction using the Fuzzy-to-Fact protocol.

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                         GRAPH CONSTRUCTION KIT                          │
├─────────────────────────────────────────────────────────────────────────┤
│  TIER 1: MCP TOOLS (Verified Nodes)                                     │
│  ├── HGNC: search_genes, get_gene                                       │
│  ├── UniProt: search_proteins, get_protein                              │
│  ├── ChEMBL: search_compounds, get_compound                             │
│  ├── STRING: search_proteins, get_interactions                          │
│  ├── Open Targets: search_targets, get_associations                     │
│  └── WikiPathways: get_pathways_for_gene, get_pathway_components        │
├─────────────────────────────────────────────────────────────────────────┤
│  TIER 2: CURL COMMANDS (Relationship Edges)                             │
│  ├── ChEMBL /mechanism: Drug → Target                                   │
│  ├── ChEMBL /drug_indication: Drug → Disease                            │
│  ├── ChEMBL /activity: Drug → Target (with Ki/IC50)                     │
│  ├── Ensembl /homology: Gene → Orthologs                                │
│  ├── STRING /enrichment: Protein Set → GO/KEGG terms                    │
│  └── NCBI elink: Gene → PubMed                                          │
├─────────────────────────────────────────────────────────────────────────┤
│  TIER 3: GRAPHITI (Persistence)                                         │
│  └── add_memory: Persist validated subgraph as JSON episode             │
└─────────────────────────────────────────────────────────────────────────┘

Workflow: Fuzzy-to-Fact Protocol

Phase 1: Anchor Node (Naming)

Resolve fuzzy user input to canonical identifier.

# MCP: HGNC
candidates = hgnc.search_genes("p53")  # fuzzy search → ranked candidates, e.g. TP53 (HGNC:11998)
gene = hgnc.get_gene("HGNC:11998")     # strict lookup → cross_references: UniProt, Ensembl, Entrez

Phase 2: Enrich Node (Functional)

Decorate node with metadata and cross-references.

# MCP: UniProt
protein = uniprot.get_protein("UniProtKB:P04637")
# → function text reveals interactors: BAX, BCL2, FAS

Phase 3: Expand Edges (Interactions)

Build adjacency list from interaction databases.

# MCP: STRING
interactions = string.get_interactions("STRING:9606.ENSP00000269305")
# → MDM2 (0.999), SIRT1 (0.999), ATM (0.995)

# Curl: Open Targets (gene-disease)
curl -s -X POST "https://api.platform.opentargets.org/api/v4/graphql" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ target(ensemblId: \"ENSG00000141510\") { associatedDiseases(page: {index: 0, size: 5}) { rows { disease { name } score } } } }"}'
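
The GraphQL response still has to be turned into graph edges. The following is a minimal Python sketch, assuming the response nests rows under `data.target.associatedDiseases.rows` as the query shape suggests. The function name `rows_to_edges` is hypothetical and not part of the skill. Because the query requests only the disease name, the sketch falls back to the name when no `id` field is present; requesting `id` as well would give a canonical disease identifier.

```python
def rows_to_edges(gene_curie: str, response: dict) -> list[dict]:
    """Convert associatedDiseases rows into ASSOCIATED_WITH edge dicts.

    Assumes the response shape data.target.associatedDiseases.rows;
    missing or null levels yield an empty list instead of an exception.
    """
    target = (response.get("data") or {}).get("target") or {}
    rows = (target.get("associatedDiseases") or {}).get("rows") or []
    edges = []
    for row in rows:
        disease = row["disease"]
        edges.append(
            {
                "source": gene_curie,
                # Prefer a canonical ID when the query asked for one.
                "target": disease.get("id") or disease["name"],
                "type": "ASSOCIATED_WITH",
                "score": row["score"],
            }
        )
    return edges
```

Passing the parsed JSON from the curl call together with `"HGNC:11998"` would yield one edge per returned row.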

Phase 4: Target Traversal (Pharma)

Follow edges to actionable targets.

# MCP: HGNC (resolve downstream effector)
bcl2 = hgnc.search_genes("BCL2")  # → HGNC:990

# MCP: ChEMBL (find inhibitors)
venetoclax = chembl.search_compounds("Venetoclax")  # → CHEMBL:3137309

# Curl: ChEMBL mechanism (Drug → Target edge)
curl -s "https://www.ebi.ac.uk/chembl/api/data/mechanism?molecule_chembl_id=CHEMBL3137309&format=json" \
  | jq '.mechanisms[] | {action: .action_type, target: .target_chembl_id}'
# → INHIBITOR → CHEMBL4860 (BCL2)
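
The ChEMBL REST API uses identifiers without a colon (`CHEMBL3137309`), while the graph uses the CURIE form (`CHEMBL:3137309`). The sketch below shows one way to bridge the two and emit Drug → Target edges from a mechanism response. It assumes only the fields already used in the jq filter above plus `molecule_chembl_id`; the helper names `to_curie` and `mechanisms_to_edges` are hypothetical.

```python
import re


def to_curie(chembl_id: str) -> str:
    """Normalize CHEMBL3137309 or CHEMBL:3137309 to the CURIE form CHEMBL:3137309."""
    match = re.fullmatch(r"CHEMBL:?(\d+)", chembl_id)
    if match is None:
        raise ValueError(f"not a ChEMBL identifier: {chembl_id!r}")
    return f"CHEMBL:{match.group(1)}"


def mechanisms_to_edges(payload: dict) -> list[dict]:
    """Build Drug -> Target edges from a /mechanism JSON payload."""
    edges = []
    for mech in payload.get("mechanisms", []):
        # Skip records that carry no resolved target.
        if not mech.get("target_chembl_id"):
            continue
        edges.append(
            {
                "source": to_curie(mech["molecule_chembl_id"]),
                "target": to_curie(mech["target_chembl_id"]),
                "type": mech.get("action_type") or "UNKNOWN",
            }
        )
    return edges
```

Records with no `action_type` are labeled `UNKNOWN` here; whether to keep or drop them is a design choice left to the caller.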

Phase 5: Persist Graph

Store validated subgraph in Graphiti.

# MCP: Graphiti
graphiti.add_memory(
    name="TP53-BCL2-Venetoclax pathway",
    episode_body=json.dumps({
        "nodes": [
            {"id": "HGNC:11998", "type": "Gene", "symbol": "TP53"},
            {"id": "HGNC:990", "type": "Gene", "symbol": "BCL2"},
            {"id": "CHEMBL:3137309", "type": "Compound", "name": "Venetoclax"}
        ],
        "edges": [
            {"source": "HGNC:11998", "target": "HGNC:990", "type": "REGULATES"},
            {"source": "CHEMBL:3137309", "target": "HGNC:990", "type": "INHIBITOR"}
        ]
    }),
    source="json",
    group_id="drug-repurposing"
)
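
Before persisting, it can help to confirm that every edge points at a node that is actually in the subgraph. This is a small Python sketch of such a check, assuming the `nodes`/`edges` layout used in the episode body above; `find_dangling_edges` and `to_episode_body` are hypothetical helpers, not Graphiti functions.

```python
import json


def find_dangling_edges(subgraph: dict) -> list[dict]:
    """Return edges whose source or target is not among the subgraph's node IDs."""
    node_ids = {node["id"] for node in subgraph["nodes"]}
    return [
        edge
        for edge in subgraph["edges"]
        if edge["source"] not in node_ids or edge["target"] not in node_ids
    ]


def to_episode_body(subgraph: dict) -> str:
    """Serialize a subgraph to JSON, refusing to serialize dangling edges."""
    dangling = find_dangling_edges(subgraph)
    if dangling:
        raise ValueError(f"{len(dangling)} edge(s) reference unknown nodes")
    return json.dumps(subgraph, sort_keys=True)
```

The returned string can be passed as `episode_body` in place of the inline `json.dumps(...)` call.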

Quick Edge Discovery Commands

| Edge Type | Curl Command |
|-----------|--------------|
| Drug → Target | `curl -s "https://www.ebi.ac.uk/chembl/api/data/mechanism?molecule_chembl_id={ID}&format=json"` |
| Target → Drugs | `curl -s "https://www.ebi.ac.uk/chembl/api/data/mechanism?target_chembl_id={ID}&format=json"` |
| Drug → Disease | `curl -s "https://www.ebi.ac.uk/chembl/api/data/drug_indication?molecule_chembl_id={ID}&format=json"` |
| Gene → Disease | Open Targets GraphQL (see Phase 3) |
| Gene → Orthologs | `curl -s "https://rest.ensembl.org/homology/id/human/{ENSG}?type=orthologues&content-type=application/json"` |
| Protein Set → GO | `curl -s "https://string-db.org/api/json/enrichment?identifiers={IDs}&species=9606"` |
| Gene → PubMed | `curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=gene&db=pubmed&id={ID}&retmode=json"` |
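
When these commands are generated from a script, building the query string programmatically avoids quoting mistakes. The sketch below covers the ChEMBL and STRING rows only; the helper names are hypothetical, and joining multiple STRING identifiers with a carriage return (encoded as `%0D`) is an assumption based on STRING's documented separator.

```python
from urllib.parse import urlencode

CHEMBL_BASE = "https://www.ebi.ac.uk/chembl/api/data"
STRING_ENRICHMENT = "https://string-db.org/api/json/enrichment"


def chembl_edge_url(resource: str, **filters: str) -> str:
    """Build a ChEMBL URL such as /mechanism?molecule_chembl_id=...&format=json."""
    params = {**filters, "format": "json"}
    return f"{CHEMBL_BASE}/{resource}?{urlencode(params)}"


def string_enrichment_url(identifiers: list[str], species: int = 9606) -> str:
    """Build a STRING enrichment URL for a set of protein identifiers."""
    # Assumption: STRING separates multiple identifiers with %0D (carriage return).
    params = {"identifiers": "\r".join(identifiers), "species": species}
    return f"{STRING_ENRICHMENT}?{urlencode(params)}"
```

For example, `chembl_edge_url("mechanism", target_chembl_id="CHEMBL4860")` produces the Target → Drugs URL from the table.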

Example: Drug Repurposing Graph

Build a complete subgraph for drug repurposing analysis:

# Step 1: Anchor - Resolve gene
# MCP: hgnc.search_genes("TP53") → HGNC:11998

# Step 2: Get protein context
# MCP: uniprot.get_protein("UniProtKB:P04637")
# → function mentions BCL2

# Step 3: Find BCL2 inhibitors
curl -s "https://www.ebi.ac.uk/chembl/api/data/mechanism?target_chembl_id=CHEMBL4860&format=json" \
  | jq '.mechanisms[] | {drug: .molecule_chembl_id, action: .action_type}'

# Step 4: Get drug indications
curl -s "https://www.ebi.ac.uk/chembl/api/data/drug_indication?molecule_chembl_id=CHEMBL3137309&format=json" \
  | jq '.drug_indications[:3][] | {disease: .mesh_heading, phase: .max_phase_for_ind}'

# Step 5: Find clinical trials
curl -s "https://clinicaltrials.gov/api/v2/studies?query.intr=venetoclax&filter.overallStatus=RECRUITING&pageSize=3&format=json" \
  | jq '.studies[] | {nct: .protocolSection.identificationModule.nctId}'

# Step 6: Persist to Graphiti
# MCP: graphiti.add_memory(...)
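
Step 5 returns raw study records, which still need to become Trial nodes before Step 6. A minimal Python sketch follows, assuming the `protocolSection.identificationModule.nctId` path used in the jq filter above and the `NCT:` CURIE form used for Trial nodes in this skill; `studies_to_trial_nodes` is a hypothetical helper.

```python
def studies_to_trial_nodes(payload: dict) -> list[dict]:
    """Convert a ClinicalTrials.gov v2 /studies payload into Trial node dicts."""
    nodes = []
    for study in payload.get("studies", []):
        nct_id = (
            study.get("protocolSection", {})
            .get("identificationModule", {})
            .get("nctId")
        )
        # Skip records without a usable NCT number.
        if not nct_id or not nct_id.startswith("NCT"):
            continue
        # NCT00461032 becomes the CURIE NCT:00461032.
        nodes.append({"id": f"NCT:{nct_id[3:]}", "type": "Trial"})
    return nodes
```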

Node Types (Canonical CURIEs)

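| Type | CURIE Pattern | Example |
|------|---------------|---------|
| Gene | `HGNC:\d+` | HGNC:11998 |
| Protein | `UniProtKB:[A-Z0-9]+` | UniProtKB:P04637 |
| Compound | `CHEMBL:\d+` | CHEMBL:3137309 |
| Target | `CHEMBL:\d+` | CHEMBL:4860 |
| Disease | `EFO_\d+` or `MONDO_\d+` | EFO_0000574 |
| Pathway | `WP:WP\d+` | WP:WP1742 |
| Trial | `NCT:\d+` | NCT:00461032 |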
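
These patterns can be enforced mechanically before a node enters the graph. The sketch below copies the patterns from the table into a lookup and checks a CURIE against the pattern for its node type; `CURIE_PATTERNS` and `is_valid_curie` are hypothetical names.

```python
import re

# Patterns copied from the node type table.
CURIE_PATTERNS = {
    "Gene": r"HGNC:\d+",
    "Protein": r"UniProtKB:[A-Z0-9]+",
    "Compound": r"CHEMBL:\d+",
    "Target": r"CHEMBL:\d+",
    "Disease": r"(?:EFO|MONDO)_\d+",
    "Pathway": r"WP:WP\d+",
    "Trial": r"NCT:\d+",
}


def is_valid_curie(node_type: str, curie: str) -> bool:
    """Return True when the CURIE fully matches the pattern for its node type."""
    pattern = CURIE_PATTERNS.get(node_type)
    return pattern is not None and re.fullmatch(pattern, curie) is not None
```

Note that an API-style ChEMBL ID such as `CHEMBL3137309` (no colon) is rejected, which catches identifiers copied straight from a REST response.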

Edge Types

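| Edge | Source | Target | Properties |
|------|--------|--------|------------|
| ENCODES | Gene | Protein | - |
| REGULATES | Gene | Gene | direction: activation/repression |
| INTERACTS | Protein | Protein | score, evidence_type |
| INHIBITOR | Compound | Target | Ki, IC50 |
| AGONIST | Compound | Target | EC50 |
| TREATS | Compound | Disease | max_phase |
| ASSOCIATED_WITH | Gene | Disease | score, evidence_sources |
| MEMBER_OF | Gene | Pathway | - |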
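
The source and target columns amount to a small schema that edges can be checked against. The following Python sketch, with the hypothetical names `EDGE_SCHEMA` and `edge_type_errors`, reports edges whose endpoint node types do not match the table. It assumes each node dict carries `id` and `type` keys.

```python
# (source type, target type) per edge type, copied from the edge table.
EDGE_SCHEMA = {
    "ENCODES": ("Gene", "Protein"),
    "REGULATES": ("Gene", "Gene"),
    "INTERACTS": ("Protein", "Protein"),
    "INHIBITOR": ("Compound", "Target"),
    "AGONIST": ("Compound", "Target"),
    "TREATS": ("Compound", "Disease"),
    "ASSOCIATED_WITH": ("Gene", "Disease"),
    "MEMBER_OF": ("Gene", "Pathway"),
}


def edge_type_errors(nodes: list[dict], edges: list[dict]) -> list[str]:
    """Return one message per edge that is unknown or connects the wrong node types."""
    node_types = {node["id"]: node["type"] for node in nodes}
    errors = []
    for edge in edges:
        expected = EDGE_SCHEMA.get(edge["type"])
        if expected is None:
            errors.append(f"unknown edge type: {edge['type']}")
            continue
        actual = (node_types.get(edge["source"]), node_types.get(edge["target"]))
        if actual != expected:
            errors.append(
                f"{edge['type']}: expected {expected[0]} -> {expected[1]}, "
                f"got {actual[0]} -> {actual[1]}"
            )
    return errors
```

An empty list means every edge conforms; a strict pipeline could refuse to persist otherwise, while an exploratory one might only log the messages.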

Query Best Practices

Gene Discovery (Human-Centric)

  • Default to species=9606 (human) for gene/protein searches
  • Use page_size=10 for exploration, page_size=50 for batch operations
  • Use slim=True for batch operations to reduce token usage
  • Only use organism=null for comparative genomics across species

Drug Discovery vs Repurposing

  • Drug repurposing: Use max_phase≥2 (clinical validation, shorter approval path)
  • General discovery: No phase filter (include preclinical tools, mechanism probes)
  • Check mechanisms before bioactivity data
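
The phase rule for repurposing can be applied to a `drug_indication` response after it is fetched. This is a minimal sketch, assuming each record carries the `max_phase_for_ind` field used elsewhere in this skill and that the value may arrive as a number, a numeric string, or null; `filter_by_phase` is a hypothetical helper.

```python
def filter_by_phase(indications: list[dict], min_phase: float = 2.0) -> list[dict]:
    """Keep indication records whose max_phase_for_ind is at least min_phase."""
    kept = []
    for record in indications:
        try:
            phase = float(record.get("max_phase_for_ind"))
        except (TypeError, ValueError):
            # Missing or non-numeric phase: treat as not clinically validated.
            continue
        if phase >= min_phase:
            kept.append(record)
    return kept
```

For general discovery, skip the filter entirely rather than lowering `min_phase`, so that records without a phase value are still included.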

Clinical Landscape

  • Default status=RECRUITING for active research
  • Use phase filter only for specific analysis:
    • PHASE3+ for commercialization analysis
    • PHASE1/2 for early pipeline
    • No filter for full landscape

See Also

  • lifesciences-genomics: Ensembl, NCBI, HGNC endpoints
  • lifesciences-proteomics: UniProt, STRING, BioGRID endpoints
  • lifesciences-pharmacology: ChEMBL, PubChem, IUPHAR endpoints
  • lifesciences-clinical: Open Targets, ClinicalTrials.gov endpoints