scrnametaboliclandscape
Comprehensive metabolic landscape analysis pipeline for scRNA-seq data. This is an all-in-one process group performing complete metabolic pathway analysis including expression imputation, feature selection, pathway activity calculation, and heterogeneity analysis. Based on methodology from Xiao et al.
ScrnaMetabolicLandscape Process Configuration
Purpose
Comprehensive metabolic landscape analysis pipeline for scRNA-seq data. This is an all-in-one process group performing complete metabolic pathway analysis including expression imputation, feature selection, pathway activity calculation, and heterogeneity analysis. Based on methodology from Xiao et al. (2019) Nature Communications.
Key difference from individual Metabolic* processes: ScrnaMetabolicLandscape runs the related metabolic analysis steps as one coordinated workflow. Use it for the complete metabolic analysis pipeline; use the individual processes (MetabolicInput, MetabolicExprImputation, MetabolicFeatures, MetabolicPathwayActivity, MetabolicPathwayHeterogeneity) when you need fine-grained control or only specific steps.
When to Use
- Complete metabolic analysis workflow: Need all metabolic landscape analyses (imputation → activity → heterogeneity → features)
- Comprehensive metabolic profiling: Study metabolic heterogeneity across cell clusters or conditions
- Pathway-centric analysis: Have metabolic pathway gene sets (GMT files) and want to explore pathway activity
- Comparative metabolic analysis: Compare metabolic states across different cell types, treatments, or disease conditions
- Alternative to individual Metabolic* processes: Prefer this group for simplicity and the individual processes for customization
Configuration Structure
Process Enablement
[ScrnaMetabolicLandscape]
cache = true
Input Specification
[ScrnaMetabolicLandscape.in]
srtobj = ["SeuratClustering"] # Input from upstream clustering process
Note: Input is automatically wired from CombinedInput. The metafile argument (used in standalone biopipen) is set via pipeline configuration.
Environment Variables
[ScrnaMetabolicLandscape.envs]
# Core configuration
gmtfile = "path/to/metabolic_pathways.gmt" # Required: GMT file with metabolic pathways
group_by = "seurat_clusters" # Required: Column to group cells (e.g., "cluster", "seurat_clusters")
subset_by = "treatment" # Optional: Subset data by metadata column
# Imputation settings
noimpute = false # Skip imputation if set to true
# Metadata transformations
mutaters = {} # dict - Add new columns using R expressions
# Example (TOML inline table): { timepoint = "if_else(treatment == 'control', 'pre', 'post')" }
# Performance
ncores = 1 # Number of cores for parallelization (inherited by all sub-processes)
GMT file sources:
- Bader Lab: https://download.baderlab.org/EM_Genesets/current_release/Human/symbol/
- Original repository: https://github.com/LocasaleLab/Single-Cell-Metabolic-Landscape/blob/master/Data/KEGG_metabolism.gmt
- Custom GMT files can be used - ensure gene symbols match your Seurat object
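Before pointing gmtfile at a custom file, it can help to check how many pathway genes actually appear in your data. The following is a minimal sketch, not part of biopipen: parse_gmt and pathway_coverage are hypothetical helper names, and the gene list is assumed to have been exported from the Seurat object beforehand (for example with rownames() in R).

```python
from typing import Dict, Iterable, List


def parse_gmt(text: str) -> Dict[str, List[str]]:
    """Parse GMT text: each line is name<TAB>description<TAB>gene1<TAB>gene2..."""
    pathways: Dict[str, List[str]] = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and lines without any gene
        pathways[fields[0]] = [gene for gene in fields[2:] if gene]
    return pathways


def pathway_coverage(
    pathways: Dict[str, List[str]], object_genes: Iterable[str]
) -> Dict[str, float]:
    """Fraction of each pathway's genes found in the expression object."""
    present = set(object_genes)  # matching is exact and case-sensitive
    return {
        name: (sum(gene in present for gene in genes) / len(genes)) if genes else 0.0
        for name, genes in pathways.items()
    }
```

Pathways with coverage near zero usually indicate an identifier mismatch (for example, human symbols in the GMT file and mouse symbols in the object) rather than a biological result.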
Individual Process Configuration
MetabolicExprImputation (Dropout Imputation)
[ScrnaMetabolicLandscape.MetabolicExprImputation.envs]
tool = "alra" # Choice: "alra", "scimpute", "rmagic"
alra_args = {} # Additional RunALRA() parameters
Imputation tools: alra (fast, recommended), scimpute (accurate, slow), rmagic (diffusion-based).
MetabolicFeatures (Pathway Enrichment)
[ScrnaMetabolicLandscape.MetabolicFeatures.envs]
prerank_method = "signal_to_noise" # Gene ranking: signal_to_noise, abs_signal_to_noise, t_test, ratio_of_classes, diff_of_classes, log2_ratio_of_classes
comparisons = [] # Specific group comparisons (empty = each group compared against all other groups)
fgsea_args = {} # Additional fgsea parameters, e.g. { minSize = 15, maxSize = 500 }
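To make the signal_to_noise ranking option concrete, here is a minimal sketch of the metric as commonly defined for GSEA: the difference of group means divided by the sum of group standard deviations. It is an illustration under that assumption, not biopipen's implementation. The function names are hypothetical, and the minimum-standard-deviation adjustment some GSEA implementations apply is left out.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple


def signal_to_noise(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """(mean_a - mean_b) / (sd_a + sd_b); each group needs at least two values."""
    sd_sum = stdev(group_a) + stdev(group_b)
    if sd_sum == 0:
        return 0.0  # assumption: treat zero-variance genes as uninformative
    return (mean(group_a) - mean(group_b)) / sd_sum


def rank_genes(
    expr: Dict[str, List[float]], labels: Sequence[str], target: str
) -> List[Tuple[str, float]]:
    """Rank genes for `target` versus all other cells, highest score first."""
    scores = {}
    for gene, values in expr.items():
        inside = [v for v, lab in zip(values, labels) if lab == target]
        outside = [v for v, lab in zip(values, labels) if lab != target]
        scores[gene] = signal_to_noise(inside, outside)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The ranked list is what a preranked GSEA consumes: genes at the top are higher in the target group, genes at the bottom are higher elsewhere.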
MetabolicPathwayActivity (Pathway Scores)
[ScrnaMetabolicLandscape.MetabolicPathwayActivity.envs]
ntimes = 5000 # Number of permutations for p-value estimation
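The role of ntimes can be illustrated with a generic label-permutation test. This is a sketch of the general technique only: the statistic used here (difference in mean score between one group and the rest) and the function names are assumptions for illustration, not the statistic biopipen computes.

```python
import random
from statistics import mean
from typing import Sequence


def group_difference(values: Sequence[float], labels: Sequence[str], target: str) -> float:
    """Mean of the target group minus mean of all other cells."""
    inside = [v for v, lab in zip(values, labels) if lab == target]
    outside = [v for v, lab in zip(values, labels) if lab != target]
    return mean(inside) - mean(outside)


def permutation_pvalue(
    values: Sequence[float],
    labels: Sequence[str],
    target: str,
    ntimes: int = 5000,
    seed: int = 0,
) -> float:
    """Two-sided p-value obtained by shuffling group labels `ntimes` times."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = abs(group_difference(values, labels, target))
    shuffled = list(labels)
    hits = 0
    for _ in range(ntimes):
        rng.shuffle(shuffled)
        if abs(group_difference(values, shuffled, target)) >= observed:
            hits += 1
    # The +1 correction keeps the estimate strictly above zero.
    return (hits + 1) / (ntimes + 1)
```

With this estimator the smallest attainable p-value is 1 / (ntimes + 1), so more permutations buy finer resolution at the cost of runtime and memory.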
MetabolicPathwayHeterogeneity (Heterogeneity Analysis)
[ScrnaMetabolicLandscape.MetabolicPathwayHeterogeneity.envs]
select_pcs = 0.8 # Proportion of variance to select PCs
pathway_pval_cutoff = 0.01 # P-value cutoff for enriched pathways
fgsea_args = { scoreType = "std", nproc = 1 }
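One plausible reading of select_pcs = 0.8 is "keep the leading principal components until they explain 80% of the variance". The sketch below implements that cumulative-variance rule; treating select_pcs as a cumulative threshold is an assumption, and n_pcs_for_variance is a hypothetical helper name.

```python
from itertools import accumulate
from typing import Sequence


def n_pcs_for_variance(variances: Sequence[float], proportion: float = 0.8) -> int:
    """Smallest number of leading PCs whose cumulative variance reaches `proportion`.

    `variances` must be ordered from the first PC to the last.
    """
    total = sum(variances)
    if total <= 0:
        raise ValueError("variances must sum to a positive value")
    for k, cumulative in enumerate(accumulate(variances), start=1):
        if cumulative / total >= proportion:
            return k
    return len(variances)  # only reached through floating-point rounding
```

For example, with per-PC variances of 6, 3, and 1, the first two components explain 90% of the total, so a threshold of 0.8 keeps two PCs.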
Configuration Examples
Minimal Configuration
[ScrnaMetabolicLandscape]
[ScrnaMetabolicLandscape.in]
srtobj = ["SeuratClustering"]
[ScrnaMetabolicLandscape.envs]
gmtfile = "pathways/KEGG_metabolism.gmt"
group_by = "seurat_clusters"
Complete Metabolic Analysis
[ScrnaMetabolicLandscape]
cache = true
[ScrnaMetabolicLandscape.in]
srtobj = ["SeuratClustering"]
[ScrnaMetabolicLandscape.envs]
gmtfile = "https://download.baderlab.org/EM_Genesets/current_release/Human/symbol/KEGG_2021_Human_symbol.gmt"
group_by = "seurat_clusters"
subset_by = "treatment"
mutaters = { "timepoint" = "if_else(treatment == 'control', 'pre', 'post')" }
ncores = 4
noimpute = false
[ScrnaMetabolicLandscape.MetabolicExprImputation.envs]
tool = "alra"
[ScrnaMetabolicLandscape.MetabolicPathwayActivity.envs]
ntimes = 10000
[ScrnaMetabolicLandscape.MetabolicFeatures.envs]
prerank_method = "log2_ratio_of_classes"
fgsea_args = { minSize = 15, maxSize = 500 }
comparisons = ["0", "1"]
Multiple Analysis Cases
[ScrnaMetabolicLandscape]
[ScrnaMetabolicLandscape.envs]
gmtfile = "pathways/KEGG_metabolism.gmt"
group_by = "seurat_clusters"
# Case 1: Treatment comparison
[ScrnaMetabolicLandscape.MetabolicPathwayActivity.envs.cases.Treatment]
subset_by = "treatment"
group_by = "seurat_clusters"
# Case 2: Response comparison
[ScrnaMetabolicLandscape.MetabolicFeatures.envs.cases.Response]
subset_by = "response"
group_by = "seurat_clusters"
prerank_method = "signal_to_noise"
Common Patterns
Pattern 1: Standard Metabolic Workflow
All metabolic analysis steps with minimal customization:
[ScrnaMetabolicLandscape]
[ScrnaMetabolicLandscape.in]
srtobj = ["SeuratClustering"]
[ScrnaMetabolicLandscape.envs]
gmtfile = "pathways/KEGG_metabolism.gmt"
group_by = "seurat_clusters"
ncores = 4
Pattern 2: Focused Pathway Analysis
Compare only specific groups:
[ScrnaMetabolicLandscape]
[ScrnaMetabolicLandscape.envs]
gmtfile = "pathways/KEGG_metabolism.gmt"
group_by = "seurat_clusters"
[ScrnaMetabolicLandscape.MetabolicFeatures.envs]
comparisons = ["0", "1"] # Compare cluster 0 against the others, and cluster 1 against the others
Pattern 3: Skip Imputation
When you don't want to impute dropout values:
[ScrnaMetabolicLandscape]
[ScrnaMetabolicLandscape.envs]
gmtfile = "pathways/KEGG_metabolism.gmt"
group_by = "seurat_clusters"
noimpute = true
Metabolic Analysis Pipeline Steps
- MetabolicInput: Passes Seurat object to downstream processes
- MetabolicExprImputation: Imputes missing expression values (ALRA/scImpute/MAGIC)
- MetabolicPathwayActivity: Calculates pathway activity scores with heatmaps and violin plots
- MetabolicPathwayHeterogeneity: Analyzes pathway heterogeneity using permutation-based NES
- MetabolicFeatures: Detailed GSEA enrichment analysis with summary and enrichment plots
Dependencies
- Upstream: CombinedInput (requires SeuratClustering or equivalent)
- Downstream: None (terminal analysis group)
- Data requirements: Seurat object with normalized expression and metadata columns
Validation Rules
- GMT file: Must be valid GMT format with gene symbols matching Seurat object gene names
- group_by column: Must exist in Seurat object metadata
- subset_by column (if specified): Must exist in Seurat object metadata; cells with NA values in this column are removed
- Expression data: Seurat object must have normalized expression data (typically after SeuratClustering)
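The rules above can be checked before launching the pipeline. The following is a minimal pre-flight sketch, not a biopipen feature: check_envs is a hypothetical helper, the envs dict mirrors the [ScrnaMetabolicLandscape.envs] table, and the metadata column names are assumed to have been exported from the Seurat object. It also assumes that columns created by mutaters are available to group_by and subset_by.

```python
from typing import Dict, Iterable, List


def check_envs(envs: Dict[str, object], metadata_columns: Iterable[str]) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []
    columns = set(metadata_columns)
    # Assumption: mutaters add their columns before grouping and subsetting.
    columns |= set(envs.get("mutaters") or {})

    if not envs.get("gmtfile"):
        problems.append("gmtfile is required")

    group_by = envs.get("group_by")
    if not group_by:
        problems.append("group_by is required")
    elif group_by not in columns:
        problems.append(f"group_by column '{group_by}' not found in metadata")

    subset_by = envs.get("subset_by")
    if subset_by and subset_by not in columns:
        problems.append(f"subset_by column '{subset_by}' not found in metadata")
    return problems
```

The check covers column names only; it does not verify the GMT format or that the expression data are normalized.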
Troubleshooting
Issue: Gene name mismatch in GMT file
Symptom: No pathways enriched or warning about missing genes
Solution: Ensure the GMT file uses the same gene identifier type as your Seurat object (e.g., HGNC symbols for human, MGI symbols for mouse).
Issue: Imputation takes too long
Symptom: MetabolicExprImputation process runs for hours
Solution: Use tool = "alra" (fastest) or skip imputation with noimpute = true.
Issue: No significant pathways
Symptom: All pathways have high p-values or no enrichment
Solution: Check fgsea_args (adjust minSize/maxSize), try a different prerank_method, and verify that the groups defined by the group_by column actually differ in expression.
Issue: Out of memory errors
Symptom: Process fails during permutation or GSEA
Solution: Reduce ntimes (default 5000 → 1000) or reduce ncores to limit parallel memory usage.
Issue: Subset has no cells after filtering
Symptom: Warning about empty subsets or missing groups
Solution: Check subset_by column for NA values or mismatched categories. Use mutaters to clean metadata.
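Empty subsets are easier to diagnose with a count of cells per subset and group. The sketch below assumes the cell metadata has been exported as a list of dicts (one per cell) with missing values stored as None; cells_per_case is a hypothetical helper, not part of biopipen.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def cells_per_case(
    metadata: List[Dict[str, Optional[str]]], subset_by: str, group_by: str
) -> Counter:
    """Count cells for every (subset, group) pair, skipping cells with a missing subset."""
    counts: Counter = Counter()
    for row in metadata:
        subset = row.get(subset_by)
        if subset is None:
            continue  # mirrors the removal of NA values in subset_by
        key: Tuple[str, Optional[str]] = (subset, row.get(group_by))
        counts[key] += 1
    return counts
```

Any (subset, group) pair that is absent from the result, or has only a handful of cells, is a candidate cause of the warning.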
External References
Original Paper
Xiao, Zhengtao, Ziwei Dai, and Jason W. Locasale. "Metabolic landscape of tumor microenvironment at single cell resolution." Nature communications 10.1 (2019): 1-12. https://www.nature.com/articles/s41467-019-11738-0
GMT File Sources
- Bader Lab: https://download.baderlab.org/EM_Genesets/current_release/Human/symbol/
- KEGG pathways: https://www.genome.jp/kegg/
- Reactome: https://reactome.org/
- GSEA MSigDB: https://www.gsea-msigdb.org/gsea/msigdb/
Tools
- fgsea: https://rdrr.io/bioc/fgsea/man/fgsea.html - Fast preranked GSEA
- Seurat ALRA: https://satijalab.org/seurat/reference/runalra - Low-rank approximation imputation
- scImpute: https://github.com/vvnathan/scImpute - Cell-specific imputation
- MAGIC: https://github.com/KrishnaswamyLab/MAGIC - Diffusion-based imputation
biopipen Documentation
- Scrna metabolic pipeline: https://pwwang.github.io/biopipen/pipelines/scrna_metabolic/
- Process API: https://pwwang.github.io/biopipen/api/biopipen.ns.scrna_metabolic_landscape/
- Plotthis: https://pwwang.github.io/plotthis/reference/