Projects
Computational methods for spatial biology and multimodal integration
I develop computational approaches to characterize cell state and decode its drivers. By placing otherwise incompatible assays into a shared anatomical-statistical space, my models can: describe cells and their neighborhoods in intact tissue; learn what drives cellular decisions by extracting causal signals; and design experiments that close the loop from data to action.
USHER
Transforming Foundation Model Representations for Out-of-Distribution Data
Foundation models like scGPT hold the promise of unifying data across labs, assays, and platforms, yet their representations remain fragile: even modest shifts in protocol, instrumentation, or imaging conditions can distort embeddings in ways that obscure biological structure. This instability is one of the most underappreciated barriers to deploying large pretrained models in biology.
USHER performs embedding-space alignment using fused Gromov-Wasserstein optimal transport, without retraining or modifying foundation model weights. It removes artifactual variation while preserving underlying biological structure, enabling reliable reuse of pretrained models across new datasets, sites, and technologies.
By treating foundation model deployment as a domain-adaptation problem rather than a retraining problem, USHER offers a scalable, interpretable, and modality-agnostic framework that makes foundation models genuinely interoperable across labs.
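For readers unfamiliar with fused Gromov-Wasserstein optimal transport, its standard objective combines a feature cost with a structure cost. The notation below is the generic formulation and is an assumption for illustration, not necessarily the exact objective or weighting USHER uses: $M_{ij}$ is a cross-dataset cost between embeddings $i$ and $j$, $C^{(1)}$ and $C^{(2)}$ are within-dataset distance matrices, $p$ and $q$ are sample weights, and $\alpha \in [0, 1]$ balances the two terms.

```latex
\min_{T \in \Pi(p, q)} \; (1-\alpha) \sum_{i,j} M_{ij}\, T_{ij}
\;+\; \alpha \sum_{i,j,k,l} \left( C^{(1)}_{ik} - C^{(2)}_{jl} \right)^{2} T_{ij}\, T_{kl},
\qquad
\Pi(p, q) = \{\, T \ge 0 \;:\; T\mathbf{1} = p,\; T^{\top}\mathbf{1} = q \,\}
```

A transport plan $T$ of this kind gives a soft correspondence between two sets of embeddings, which is why alignment can happen in embedding space without touching model weights.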
Read the preprint (Accepted at RECOMB’26)
SAME
Spatial Alignment of Multimodal Experiments
The first barrier to actionable tissue analysis is representation. It is often advantageous to profile serial tissue sections with different assays, but each section undergoes its own deformation, and each assay measures a distinct, non-overlapping set of analytes.
SAME aligns heterogeneous spatial data flexibly. It uses histology and cell-type cues to anchor correspondence across assays and introduces controlled “space-tearing” transforms that allow limited local deformation while preserving global tissue structure. SAME unifies protein, RNA, and metabolite data in a single framework, enabling analyses such as identifying hidden immune subpopulations and linking metabolic pathways to structural niches.
Under revision
BEELINE
Benchmarking Gene Regulatory Network Inference
During my graduate work, I developed BEELINE, a comprehensive benchmark published in Nature Methods that exposed critical limitations of popular unsupervised gene regulatory network (GRN) inference methods. BEELINE showed that many existing methods perform only marginally better than a random predictor when evaluated against ground-truth networks.
Backed by rigorously curated gold standards, highly reproducible pipelines, and reusable code, BEELINE has been cited in nearly 800 works and serves as a community standard for fair GRN evaluation.
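The comparison against a random predictor can be made concrete by dividing a method's area under the precision-recall curve by the value expected from a random ranking, which equals the density of true edges among the candidate edges. The sketch below is illustrative only: the function names, the toy edge scores, and the toy reference network are all assumptions made for this example, and it is not BEELINE's own code or pipeline.

```python
def average_precision(scores, truth):
    """Average precision of a ranked edge list (step-wise area under the PR curve).

    scores: dict mapping (regulator, target) -> confidence score
    truth:  set of (regulator, target) edges in the reference network
    """
    # Only reference edges that appear among the candidates can be recovered.
    positives = truth & scores.keys()
    if not positives:
        raise ValueError("no reference edges among the candidate edges")

    # Rank candidate edges from most to least confident.
    ranked = sorted(scores, key=scores.get, reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, edge in enumerate(ranked, start=1):
        if edge in positives:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / len(positives)


def auprc_ratio(scores, truth):
    """Average precision divided by the random-ranking baseline (edge density)."""
    positives = truth & scores.keys()
    baseline = len(positives) / len(scores)
    return average_precision(scores, truth) / baseline


# Hypothetical toy data: six candidate edges, two of which are true.
toy_scores = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("B", "C"): 0.7,
    ("C", "A"): 0.6,
    ("B", "A"): 0.5,
    ("C", "B"): 0.4,
}
toy_truth = {("A", "B"), ("C", "A")}

ap = average_precision(toy_scores, toy_truth)
ratio = auprc_ratio(toy_scores, toy_truth)
print(f"average precision = {ap:.2f}, ratio over random = {ratio:.2f}")
```

A ratio near 1 means the ranking is no better than chance. In this toy case the true edges sit at ranks 1 and 4, giving an average precision of 0.75 against a baseline of 2/6, so the ratio is 2.25.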