Research

The Center for Predictive Computational Phenotyping (CPCP) aims to advance the state of the art in computational methods for transforming large, heterogeneous, high-dimensional data sources into predictive models for biomedicine. Specifically, we focus on a broad range of problems that can be cast as computational phenotyping. CPCP is organized into projects, which address model problems for computational phenotyping; labs, which develop innovative methodological approaches; and two key cores.

EHR-based Phenotyping Project

Neuroimage-based Phenotyping Project

Epigenome-based Phenotyping Project

Transcriptome-based Phenotyping Project

Phenotype Models for Breast Cancer Screening Project

Stochastic Modeling Lab

Low-dimensional Representations Lab

Data Management Lab

Value of Information Lab

Software Engineering and High-Throughput Computing Core

Bioethics Core

Recent CPCP Publications

Pharmacovigilance via baseline regularization with large-scale longitudinal observational data. Kuang Z, Peissig P, Santos Costa V, Maclin R, Page D. Proceedings of the 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2017

A review of active learning approaches to experimental design for uncovering biological networks. Sverchkov Y, Craven M. PLoS Computational Biology, 2017

MetaSRA: normalized human sample-specific metadata for the Sequence Read Archive. Bernstein M, Doan A, Dewey C. Bioinformatics, 2017

Falcon: Scaling up hands-off crowdsourced entity matching to build cloud services. Das S, Suganthan P, Doan A, Naughton J, Krishnan G, Deep R, Arcaute E, Raghavendra V, Park Y. Proceedings of the ACM International Conference on Management of Data (SIGMOD), 2017

Chromatin module inference on cellular trajectories identifies key transition points and poised epigenetic states in diverse developmental processes. Roy S, Sridharan R. Genome Research, 2017

SCnorm: robust normalization of single-cell RNA-seq data. Bacher R, Chu LF, Leng N, Gasch A, Thomson J, Stewart R, Newton M, Kendziorski C. Nature Methods, 2017

Screening breast MRI outcomes in routine clinical practice: comparison to BI-RADS benchmarks. Strigel RM, Rollenhagen J, Burnside ES, Elezaby M, Fowler AM, Kelcz F, Salkowski L, DeMartini WB. Academic Radiology 24(4):411-417, 2017

Towards interactive debugging of rule-based entity matching. Panahi F, Wu W, Doan A, Naughton J. Proceedings of the International Conference on Extending Database Technology (EDBT), 2017

Ava: From data to insights through conversation. John RJL, Potti N, Patel J. Proceedings of the Conference on Innovative Data Systems Research (CIDR), 2017

Structure-leveraged methods in breast cancer risk prediction. Fan J, Wu Y, Yuan M, Page D, Liu J, Ong IM, Peissig P, Burnside E. Journal of Machine Learning Research 17:1-15, 2016