
Hi! I am a tenure-track Assistant Professor in the Department of Biomedical Informatics & Data Science (BIDS) at Yale University School of Medicine. Our research lies at the interface of machine learning, genomics, and precision medicine. Our long-term goal is to build machine learning systems that assist scientific discovery, clinical decision making, and personal health management. Our ongoing research focuses on developing machine learning algorithms (e.g., deep learning and probabilistic graphical models) that exploit massive genetic, multiomic, and clinical data to uncover the genomic basis of complex human diseases. Specifically, our work follows a variant-gene-pathway principle: we start from deep learning models of biological sequences (e.g., DNA and RNA) to predict the functional effects of variants in different cellular processes (i.e., in silico mutagenesis; NAR 2016, Cell Systems 2017, Bioinformatics 2017). We then move on to global modeling of the genotype-phenotype mapping, where we identify candidate risk genes (Neuron 2022, Cell Systems 2022) and predict phenotypes from personal genomes (Cell 2018). By leveraging cutting-edge techniques (e.g., deep learning and single-cell genomics), we are particularly interested in modeling the complexity (e.g., nonlinearity and cell-type specificity) of the underlying biological system (Cell 2019).
Sai serves on the program committee of RECOMB 2026.
10/18/2025 Sai serves as an Area Chair for ISMB 2026.
10/01/2025 Zhang Lab is officially launched at Yale BIDS!
09/30/2025 Sai serves as a Guest Editor of PLOS Computational Biology.
06/04/2025 Sai serves as an Area Chair for ACM BCB 2025.
05/28/2025 scPRS is accepted by Nature Biotechnology.
04/18/2025 We received MIRA from NIGMS to support our research in single-cell genetics over the next five years!
04/16/2025 Check out our latest study on the genetics of ME/CFS, in which we designed HEAL2 – a new deep learning model that predicts disease risk from rare coding variants.
02/21/2025 Check out Prophet – a deep learning model for disease diagnosis and prediction based on personal plasma proteomes.