pathology-fibrosis-pipeline

A computational pathology pipeline for quantifying tissue fibrosis from whole-slide images (WSIs) using foundation model embeddings.

Overview

This pipeline extracts patch-level features from H&E-stained WSIs using vision foundation models (e.g., UNI2-h), clusters tissue morphologies, and computes a fibrosis composite score that integrates:

  • Tissue-level features: cluster proportions from unsupervised patch clustering
  • Cell-level features: nuclear morphometry (eccentricity, area, solidity) via Cellpose segmentation
  • Spatial features: Moran's I, hotspot detection, neighborhood enrichment
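As an illustration of the spatial features listed above, global Moran's I can be sketched with NumPy alone. This is a minimal sketch, not the code in spatial_analysis.py: the function names `morans_i` and `rook_weights`, the binary 4-neighbor weighting, and the regular patch grid are all assumptions made for the example. It uses the standard definition I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2), where z holds the mean-centered values and S0 is the sum of all weights.

```python
import numpy as np


def rook_weights(rows, cols):
    """Binary adjacency matrix for a rows x cols patch grid (4-neighborhood).

    Hypothetical helper: real WSI patch layouts may be irregular.
    """
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if r + 1 < rows:  # neighbor below
                j = (r + 1) * cols + c
                w[i, j] = w[j, i] = 1.0
            if c + 1 < cols:  # neighbor to the right
                j = r * cols + c + 1
                w[i, j] = w[j, i] = 1.0
    return w


def morans_i(values, weights):
    """Global Moran's I of `values` under the spatial weight matrix `weights`."""
    x = np.asarray(values, dtype=float).ravel()
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()              # mean-centered values
    denom = (z ** 2).sum()
    s0 = w.sum()                  # sum of all weights
    if denom == 0 or s0 == 0:
        return float("nan")       # undefined for constant values or no neighbors
    return (x.size / s0) * (z @ w @ z) / denom


# Toy 4 x 4 grid: a per-patch score that is high on the left half only.
w = rook_weights(4, 4)
clustered = np.array([[1, 1, 0, 0]] * 4)
checkerboard = np.indices((4, 4)).sum(axis=0) % 2

i_clustered = morans_i(clustered, w)        # positive: similar values are adjacent
i_checkerboard = morans_i(checkerboard, w)  # negative: neighbors always differ
```

Positive values indicate that similar patch scores sit next to each other, values near zero indicate no spatial structure, and negative values indicate alternation.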

The pipeline supports batch correction (ComBat), multi-cohort statistical testing, permutation/bootstrap validation, supervised transfer learning, and cross-modal integration with single-cell RNA-seq data.
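The permutation validation mentioned above can be illustrated with a two-sided permutation test on the difference in cohort means. This is a sketch under stated assumptions rather than the implementation in stats.py: the function name `permutation_test`, its parameters, and the choice of mean difference as the test statistic are hypothetical, and the scores are made-up numbers.

```python
import numpy as np


def permutation_test(a, b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed_difference, p_value). Cohort labels are shuffled
    n_permutations times to build the null distribution.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    extreme = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)
        diff = shuffled[: a.size].mean() - shuffled[a.size:].mean()
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up per-slide scores for two cohorts (illustrative only).
infected = [10, 11, 12, 13, 14, 15]
control = [0, 1, 2, 3, 4, 5]
observed, p_value = permutation_test(infected, control)
```

With only a few slides per cohort, the smallest achievable p-value is bounded by the number of distinct label assignments, so small cohorts cannot yield arbitrarily small p-values regardless of effect size.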

Architecture

pathofib/                  # Pip-installable Python package
├── patch_extraction.py    # Tissue segmentation + patching
├── feature_extraction.py  # Foundation model feature extraction
├── clustering.py          # PCA + K-means clustering pipeline
├── cell_analysis.py       # Cellpose segmentation + morphometry
├── spatial_analysis.py    # Moran's I, Gi*, neighborhood enrichment
├── supervised.py          # Annotation-based supervised classification
├── stats.py               # MWU, permutation, bootstrap, effect sizes
├── batch_correction.py    # ComBat wrapper (calls R/sva)
├── visualization.py       # Heatmaps, cluster overlays, plots
├── interpretation.py      # Discriminative patch extraction
├── stain_normalization.py # Macenko/Reinhard normalization
└── config.py              # PipelineConfig dataclass

applications/              # Study-specific implementations
├── mouse_lung_covid/      # SARS-CoV-2 mouse lung study
└── _template/             # Template for new studies

Installation

# Clone
git clone https://github.com/princello/pathology-fibrosis-pipeline.git
cd pathology-fibrosis-pipeline

# Install (editable mode recommended)
pip install -e .

# Optional: ComBat batch correction requires R >= 4.0 with the sva package.
# From an R session: install.packages("BiocManager"); BiocManager::install("sva")

Dependencies

  • Python >= 3.9
  • PyTorch >= 2.0
  • OpenSlide (system library + openslide-python)
  • Cellpose >= 3.0
  • scikit-learn, scipy, pandas, numpy, matplotlib, seaborn
  • Optional: R + sva (for ComBat batch correction)
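Before running the pipeline, it can help to confirm that the dependencies above are importable. The following is a small standard-library sketch, not part of the package: the function name `check_environment` is hypothetical, and it checks only that each module can be found, not that its version meets the minimum listed above.

```python
import importlib.util
import sys

# Import names for the Python dependencies listed above.
REQUIRED_MODULES = [
    "torch", "openslide", "cellpose", "sklearn",
    "scipy", "pandas", "numpy", "matplotlib", "seaborn",
]


def check_environment(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


python_ok = sys.version_info >= (3, 9)
status = check_environment()
missing = sorted(name for name, found in status.items() if not found)

print(f"Python >= 3.9: {python_ok}")
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Note that `openslide` being importable does not guarantee that the OpenSlide system library is installed; opening a slide is the more reliable check.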

Quick Start

from pathofib import PipelineConfig, PatchClusteringPipeline

# Configure
config = PipelineConfig(
    slides_dir="/path/to/svs/files",
    features_dir="/path/to/features",
    output_dir="/path/to/results",
    k_values=[5, 8, 10],
    n_pca=100,
)

# Run clustering
pipeline = PatchClusteringPipeline(config)
pipeline.load_features()
pipeline.fit_pca()
pipeline.cluster()
pipeline.save_results()

See docs/getting_started.md for the full walkthrough and applications/_template/ for adapting the pipeline to your own study.
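To make the Quick Start steps concrete, the sketch below reproduces the general idea (PCA reduction, K-means at several values of k, then per-slide cluster proportions) directly with scikit-learn on synthetic data. It does not show how `PatchClusteringPipeline` is implemented: the helper names `cluster_patches` and `cluster_proportions`, the random features, and the slide IDs are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def cluster_patches(features, k_values, n_pca, seed=0):
    """Reduce patch features with PCA, then run K-means once per k."""
    # PCA cannot return more components than samples or input dimensions.
    n_components = min(n_pca, features.shape[0], features.shape[1])
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(features)
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(reduced)
        for k in k_values
    }


def cluster_proportions(labels, slide_ids, k):
    """Fraction of each slide's patches assigned to each of the k clusters."""
    slide_ids = np.asarray(slide_ids)
    proportions = {}
    for slide in np.unique(slide_ids):
        counts = np.bincount(labels[slide_ids == slide], minlength=k)
        proportions[str(slide)] = counts / counts.sum()
    return proportions


# Synthetic stand-in for patch embeddings: 300 patches, 64 dimensions.
rng = np.random.default_rng(0)
features = rng.normal(size=(300, 64))
slide_ids = np.repeat(["slide_a", "slide_b", "slide_c"], 100)

labels_by_k = cluster_patches(features, k_values=[5, 8, 10], n_pca=100)
props = cluster_proportions(labels_by_k[5], slide_ids, k=5)
```

Each entry of `props` is a length-k vector summing to 1, which is the form of the tissue-level cluster proportions described in the Overview.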

Adapting for Your Study

  1. Copy the template: cp -r applications/_template/ applications/my_study/
  2. Edit config.py with your paths, cohort definitions, and parameters
  3. Run feature extraction on your WSIs
  4. Execute the pipeline scripts

See applications/_template/README.md for detailed instructions.
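As a rough picture of what a study configuration might hold, here is a hypothetical dataclass. It is not the package's `PipelineConfig`: the class name `StudyConfig`, the `cohorts` field, and the validation rules are assumptions, and only the path, `k_values`, and `n_pca` fields mirror the Quick Start example.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class StudyConfig:
    """Hypothetical study configuration; see the template for the real one."""

    slides_dir: Path
    features_dir: Path
    output_dir: Path
    k_values: List[int] = field(default_factory=lambda: [5, 8, 10])
    n_pca: int = 100
    # Assumed layout: cohort name -> list of slide IDs.
    cohorts: Dict[str, List[str]] = field(default_factory=dict)

    def __post_init__(self):
        # Accept plain strings for paths and normalize them.
        self.slides_dir = Path(self.slides_dir)
        self.features_dir = Path(self.features_dir)
        self.output_dir = Path(self.output_dir)
        if self.n_pca < 1:
            raise ValueError("n_pca must be at least 1")
        if any(k < 2 for k in self.k_values):
            raise ValueError("every k in k_values must be at least 2")


config = StudyConfig(
    slides_dir="/path/to/svs/files",
    features_dir="/path/to/features",
    output_dir="/path/to/results",
    cohorts={"control": ["slide_001"], "infected": ["slide_002"]},
)
```

Validating parameters at construction time surfaces a mistyped value before any slide is processed.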

Example: Mouse Lung COVID Study

The applications/mouse_lung_covid/ directory contains a complete 9-step analysis of SARS-CoV-2 infection in humanized mouse lungs (67 slides, 307K patches). This study demonstrates the full pipeline including ComBat batch correction, fibrosis quantification, and cross-modal validation with snRNA-seq. See applications/mouse_lung_covid/README.md.

Citation

If you use this pipeline, please cite:

@software{pathology_fibrosis_pipeline,
  author = {Wang, Zicheng},
  title = {pathology-fibrosis-pipeline: Computational Pathology for Tissue Fibrosis Quantification},
  url = {https://github.com/princello/pathology-fibrosis-pipeline}
}

License

MIT License. See LICENSE.
