Recent & Upcoming Events

Oct 18, 2016

Big Privacy: Policy Meets Data Science Symposium


Jun 30, 2016

CPCP Second Annual Retreat

A day-long retreat highlighting recent research in the Center, featuring talks, posters, and lunch.

Apr 21, 2016

CPCP Seminar: Mining Structures from Massive Bio-Text Data: A Data-Driven Approach by Dr. Jiawei Han

Jiawei Han from the BD2K KnowEng Center-UIUC discussed mining structures from massive bio-text data.

Nov 10, 2015

CPCP Seminar: Transforming Your Research with High-Throughput Computing by Lauren Michael

Lauren Michael from the CHTC discussed high-throughput computing approaches to Big Data.

Oct 15, 2015

Big Privacy: Policy Meets Data Science Symposium

A symposium on the legal, policy, and technical issues at the intersection of privacy and data science.

Training Resources

CPCP Seminar: Transforming Your Research Through High Throughput Computing Seminar Video

Presented by Lauren Michael

Big Privacy Symposium: Introductory and Welcoming Remarks Symposium Video

Presented by David Page, PhD

Big Privacy Symposium: Big Data, Big Headaches: Cultivating Public Trust in an Age of Unconsented Access to Identifiable Data Symposium Video

Presented by Barbara J. Evans PhD, JD, LLM

Big Privacy Symposium: Does Publishing a Predictive Model for Precision Medicine Put Patient Privacy at Risk? Symposium Video

Presented by Matt Fredrikson, PhD

Big Privacy Symposium: Panel Discussion Symposium Video

Panel Members: Barbara Evans, Matt Fredrikson, Arvind Narayanan, Pilar Ossorio, Vitaly Shmatikov

Recent Publications

Computational drug repositioning using continuous self-controlled case series. Kuang Z, Thomson J, Caldwell M, Peissig P, Stewart R, Page D. Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2016

Latent variable graphical model selection using harmonic analysis: applications to the Human Connectome Project (HCP). Kim WH, Kim HJ, Adluru N, Singh V. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

Experimental design on a budget for sparse linear models and applications. Ravi SN, Ithapu VK, Johnson SC, Singh V. Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016

Coupled harmonic bases for longitudinal characterization of brain networks. Hwang SJ, Adluru N, Collins MD, Ravi SN, Bendlin BB, Johnson SC, Singh V. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

A hierarchical framework for state space matrix inference and clustering. Zuo C, Chen K, Hewitt K, Bresnick EH, Keles S. Annals of Applied Statistics, 2016

Recent Resources

EBSeqHMM software

EBSeqHMM is an R package that implements an auto-regressive hidden Markov model for identifying genes and isoforms that have expression changes in ordered RNA-seq experiments, and clustering the identified genes into paths showing similar changes. EBSeqHMM is suitable for any ordered RNA-seq experiment including time courses and spatially ordered experiments.

Oscope software

Oscope is a statistical pipeline for identifying oscillatory genes and characterizing one cycle of their oscillation, referred to as a base-cycle, in unsynchronized snapshot single-cell RNA-seq experiments. The Oscope pipeline includes three modules: a paired-sine model module to identify candidate oscillator pairs; a clustering module to cluster candidate oscillators into groups; and an extended nearest insertion module to estimate the base-cycle oscillation within each group.

OEFinder software

OEFinder is an R package that allows an investigator to identify genes having the so-called ordering effect in single-cell RNA-seq data generated by the Fluidigm C1 platform. This effect (Leng et al., Nature Methods, 2015) refers to significantly increased gene expression in cells captured from sites with small or large plate output IDs.

Rolemodel software

The role model is a probability model used in the context of gene set analysis to describe the functional content of a user-supplied gene list, such as one derived from a genome-wide experiment. It integrates the list with gene sets from a knowledge base (e.g. Gene Ontology) and aims to summarize gene functions that are represented at an unusually high rate on the list. Compared to other gene-set enrichment analysis schemes, role model calculations contend better with the complexity of the knowledge base, including redundancies caused by overlapping sets and the effects of set-size variation.

GADGET software

GADGET is a web tool for finding and ranking genes and metabolites that are associated with a given query in the biomedical literature. It's like a version of PubMed that returns genes and metabolites instead of articles.