Recent & Upcoming Events

Oct 18, 2016

Big Privacy: Policy Meets Data Science Symposium

SAVE THE DATE!

Jun 30, 2016

CPCP Second Annual Retreat

Apr 21, 2016

CPCP Seminar: Mining Structures from Massive Bio-Text Data: A Data-Driven Approach by Dr. Jiawei Han

Jiawei Han from the BD2K KnowEng Center-UIUC discussed mining structures from massive bio-text data.

Nov 10, 2015

CPCP Seminar: Transforming Your Research with High-Throughput Computing by Lauren Michael

Lauren Michael from the CHTC discussed high-throughput computing approaches to Big Data.

Oct 15, 2015

Big Privacy: Policy Meets Data Science Symposium

A symposium on the legal, policy, & technical issues at the intersection of privacy and data science

Training Resources

CPCP Seminar: Transforming Your Research Through High Throughput Computing Seminar Video

Presented by Lauren Michael

Big Privacy Symposium: Introductory and Welcoming Remarks Symposium Video

Presented by David Page, PhD

Big Privacy Symposium: Big Data, Big Headaches: Cultivating Public Trust in an Age of Unconsented Access to Identifiable Data Symposium Video

Presented by Barbara J. Evans, PhD, JD, LLM

Big Privacy Symposium: Does Publishing a Predictive Model for Precision Medicine Put Patient Privacy at Risk? Symposium Video

Presented by Matt Fredrikson, PhD

Big Privacy Symposium: Panel Discussion Symposium Video

Panel Members: Barbara Evans, Matt Fredrikson, Arvind Narayanan, Pilar Ossorio, Vitaly Shmatikov

Recent Publications

A MAD-Bayes algorithm for state-space inference and clustering with application to querying large collections of ChIP-seq data sets. Zuo C, Chen K, Keles S. Proceedings of the 20th Annual International Conference on Research in Computational Molecular Biology (RECOMB), 2016

Distance shrinkage and Euclidean embedding via regularized kernel estimation. Zhang L, Wahba G, Yuan M. Journal of the Royal Statistical Society B, doi:10.1111/rssb.12138, 2016

OEFinder: a user interface to identify and visualize ordering effects in single-cell RNA-seq data. Leng N, Choi J, Chu LF, Thomson JA, Kendziorski C, Stewart R. Bioinformatics, 2016

Comparing mammography abnormality features to genetic variants in the prediction of breast cancer in women recommended for breast biopsy. Burnside ES, Liu J, Wu Y, Onitilo AA, McCarty CA, Page CD, Peissig PL, Trentham-Dietz A, Kitchner T, Fan J, Yuan M. Academic Radiology 23(1):62–69, 2016

On statistical analysis of neuroimages with imperfect registration. Kim WH, Ravi SN, Okonkwo OC, Johnson SC, Singh V. Proceedings of International Conference on Computer Vision, 2015

Recent Resources

GADGET software

GADGET is a web tool for finding and ranking genes and metabolites that are associated with a given query in the biomedical literature. It's like a version of PubMed that returns genes and metabolites instead of articles.

rvalues software

rvalues is an R package for computing "r-values" from various kinds of user input such as a list of effect size estimates and associated standard errors. Given a large collection of measurement units, the r-value, r, of a particular unit is a reported percentile that may be interpreted as the smallest percentile at which the unit should be placed in the top r-fraction of units.

atSNP software

atSNP (Affinity Test for regulatory SNP detection) is an R package for computing and testing large-scale motif-SNP interactions. It provides three main functions: (1) Computing the binding affinity scores for both the reference and the SNP alleles based on position weight matrices; (2) Computing the p-values of the affinity scores for each allele; (3) Computing the p-values of the affinity score changes between the reference and the SNP alleles.
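To make function (1) concrete, here is a minimal Python sketch of scoring a reference and a SNP allele against a position weight matrix. It is an illustration only and not atSNP's code or API: the function names, the toy matrix, and the choice to take the best log-likelihood over all windows that overlap the SNP are assumptions made for this example. The p-value steps (2) and (3) are not shown.

```python
import math

BASES = "ACGT"  # column order assumed for the PWM rows below

# Hypothetical 3-position PWM: one row per motif position,
# columns are probabilities of A, C, G, T (each row sums to 1).
TOY_PWM = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
]


def log_affinity(seq, pwm):
    """Best log-likelihood of any motif-width window of seq under the PWM."""
    width = len(pwm)
    best = -math.inf
    for start in range(len(seq) - width + 1):
        score = sum(
            math.log(pwm[i][BASES.index(seq[start + i])]) for i in range(width)
        )
        best = max(best, score)
    return best


def allele_scores(left_flank, ref, snp, right_flank, pwm):
    """Return (reference score, SNP score, score change) for one SNP.

    Flanks are trimmed to motif width - 1 bases so that every scored
    window overlaps the SNP position.
    """
    k = len(pwm) - 1
    left = left_flank[-k:] if k else ""
    right = right_flank[:k]
    ref_score = log_affinity(left + ref + right, pwm)
    snp_score = log_affinity(left + snp + right, pwm)
    return ref_score, snp_score, snp_score - ref_score


ref_score, snp_score, change = allele_scores("TA", "C", "T", "GT", TOY_PWM)
print(ref_score, snp_score, change)
```

In this toy case the reference allele C completes the motif ACG and the SNP allele T breaks it, so the score change is negative. A real analysis would use the package itself, which also supplies the significance tests.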