Agent Skill
2/7/2026

bio-annotation

Functional annotation and taxonomy inference from sequence homology.

F
fmschulz
0GitHub Stars
1Views
npx skills add fmschulz/omics-skills

SKILL.md

Namebio-annotation
DescriptionFunctional annotation and taxonomy inference from sequence homology.

name: bio-annotation description: Functional annotation and taxonomy inference from sequence homology.

Bio Annotation

Functional annotation and taxonomy inference from sequence homology.

Instructions

  1. Run InterProScan for domain/family annotation.
  2. Run eggnog-mapper for orthology-based annotation.
  3. Run DIAMOND and resolve taxonomy with TaxonKit.

Quick Reference

TaskAction
Run workflowFollow the steps in this skill and capture outputs.
Validate inputsConfirm required inputs and reference data exist.
Review outputsInspect reports and QC gates before proceeding.
Tool docsSee docs/README.md.
References- See ../bio-skills-references.md

Input Requirements

Prerequisites:

  • Tools available in the active environment (Pixi/conda/system). See docs/README.md for expected tools.
  • Reference DB root: set BIO_DB_ROOT (default /media/shared-expansion/db/ on WSU).
  • Input FASTA and reference DBs are readable. Inputs:
  • proteins.faa (FASTA protein sequences).
  • reference_db/ (eggNOG, InterPro, DIAMOND databases + taxdump).

Output

  • results/bio-annotation/annotations.parquet
  • results/bio-annotation/taxonomy.parquet
  • results/bio-annotation/annotation_report.md
  • results/bio-annotation/logs/

Quality Gates

  • Annotation hit rate and taxonomy rank coverage meet project thresholds.
  • On failure: retry with alternative parameters; if still failing, record in report and exit non-zero.
  • Verify proteins.faa is non-empty and amino acid encoded.
  • Verify required reference DBs exist under the reference root.

Examples

Example 1: Expected input layout

proteins.faa (FASTA protein sequences).
reference_db/ (eggNOG, InterPro, DIAMOND databases + taxdump).

Troubleshooting

Issue: Missing inputs or reference databases Solution: Verify paths and permissions before running the workflow.

Issue: Low-quality results or failed QC gates Solution: Review reports, adjust parameters, and re-run the affected step.

Skills Info
Original Name:bio-annotationAuthor:fmschulz