Software

Research Software

ssbc

PAC-Conformal Prediction with Small Sample Guarantees

Python package implementing Small Sample Beta Correction for conformal prediction. Provides tight PAC coverage guarantees even in small-sample regimes, enabling accountable automation for scientific applications.
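To illustrate why a beta-based correction matters when calibration data are scarce, here is a minimal generic sketch. It is not the ssbc API: the function names `pac_rank` and `pac_threshold` are hypothetical. It assumes exchangeable data and continuous (tie-free) scores, in which case the coverage obtained by thresholding at the k-th smallest of n calibration scores follows a Beta(k, n + 1 - k) distribution. For integer parameters, that tail probability equals a binomial sum, so only the standard library is needed.

```python
from math import comb


def pac_rank(n, alpha, delta):
    """Hypothetical helper, not part of ssbc.

    Return the smallest 1-based rank k such that thresholding at the k-th
    smallest of n calibration scores gives coverage >= 1 - alpha with
    probability >= 1 - delta over the calibration draw. Return None if no
    rank in 1..n achieves this.
    """
    t = 1.0 - alpha
    cdf = 0.0  # running value of P(Binomial(n, t) <= k - 1)
    for k in range(1, n + 1):
        j = k - 1
        cdf += comb(n, j) * t**j * (1.0 - t) ** (n - j)
        # P(Beta(k, n + 1 - k) >= t) equals P(Binomial(n, t) <= k - 1)
        if cdf >= 1.0 - delta:
            return k
    return None


def pac_threshold(scores, alpha, delta):
    """Score threshold for the rank chosen by pac_rank (inf if none exists)."""
    k = pac_rank(len(scores), alpha, delta)
    if k is None:
        return float("inf")  # no finite threshold certifies the target
    return sorted(scores)[k - 1]
```

Under these assumptions, a 90% coverage target held with 90% confidence cannot be certified with fewer than 22 calibration points, and at exactly 22 points the threshold must be the largest calibration score.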

Features:

GitHub Repository


dlsia

Deep Learning for Scientific Image Analysis

Comprehensive ML toolkit for scientific imaging applications including segmentation, anomaly detection, and tensor processing across multiple modalities.

Applications:

Key capabilities:

GitHub Repository


qlty

Out-of-Core Tensor Management

Framework for efficient ML training and inference on memory-constrained hardware. Enables processing of large-scale imaging datasets that exceed available RAM.
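One common way to work with arrays larger than memory is to visit them as overlapping fixed-size patches, so that only one patch needs to be resident at a time. The sketch below shows only the index bookkeeping for that idea. It is an illustration under that assumption, not the qlty API: `patch_starts` and `iter_tiles` are hypothetical names.

```python
from itertools import product


def patch_starts(length, window, step):
    """Start offsets of windows of size `window` that cover [0, length).

    Windows advance by `step`; a final window flush with the end is added
    when the regular grid would leave a tail uncovered.
    """
    if window > length:
        raise ValueError("window must not exceed the axis length")
    starts = list(range(0, length - window + 1, step))
    if starts[-1] + window < length:
        starts.append(length - window)
    return starts


def iter_tiles(shape, window, step):
    """Yield one tuple of slices per patch of an N-dimensional array."""
    per_axis = [patch_starts(n, w, s) for n, w, s in zip(shape, window, step)]
    for corner in product(*per_axis):
        yield tuple(slice(c, c + w) for c, w in zip(corner, window))
```

Each yielded tuple can index a disk-backed array (for example a memory-mapped file) so that a single patch is loaded, processed, and released before the next one is read.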

Features:

GitHub Repository


Major Infrastructure Contributions

PHENIX

Macromolecular Structure Determination Suite

Core developer and contributor to PHENIX, a comprehensive system for automated determination of macromolecular structures using X-ray crystallography and other methods.

My contributions:

Impact: Used by thousands of structural biologists worldwide; cited >10,000 times

PHENIX Website
Adams et al. (2010) Acta Cryst. D


CCTBX

Computational Crystallography Toolbox

Foundational crystallographic infrastructure providing core algorithms and data structures for crystallographic computing.

Contributions:

Impact: Forms the computational backbone for PHENIX and numerous other crystallographic software packages

GitHub Repository


Xtriage

Crystallographic Data Quality Control

Automated quality control system for crystallographic data, detecting common pathologies and data collection problems.

Features:

Adoption: Integrated into the Protein Data Bank validation pipeline, checking all deposited structures worldwide


Domain-Specific Tools

SAXS Analysis Pipeline

Small-Angle X-ray Scattering

Developed an automated analysis pipeline that achieves a 10× speedup in SAXS data processing at the Advanced Light Source.

Capabilities:


FEL Workflows

Free-Electron Laser Data Processing

Exascale computational workflows for serial femtosecond crystallography and other FEL techniques.

Features:


Development Practices

All research software follows:

Collaborative Development

I actively collaborate on software development with:


Technical Stack

Languages: Python, C++
ML and Scientific Libraries: PyTorch, NumPy, SciPy, scikit-learn, scikit-image
Computing: HPC/Exascale systems, CUDA, distributed computing
Tools: Git, GitHub, CI/CD, Docker, Jupyter
Domains: Scientific imaging, structural biology, FEL science, spectroscopy


For publications describing these tools and methods, see the Publications page.