Recent & Upcoming Events

Sep 26, 2017

CPCP Privacy/Fairness Seminar: The Bounty of the Commons by Dr. Casey Greene

Casey Greene on technical approaches to challenges in data sharing

Jun 1, 2017

CPCP Third Annual Retreat

The day-long program will feature presentations about using Big Data to improve human health.

Apr 1, 2017

CPCP at UW-Madison Science Expeditions

Come explore how the CPCP is using Big Data to improve human health.

Nov 22, 2016

CPCP Seminar: Big Data in Behavioral Medicine by Dr. James Rehg

James Rehg on understanding and developing interventions for adverse health-related behaviors

Oct 18, 2016

Big Privacy: Policy Meets Data Science Symposium

This symposium examined the legal, policy, and technical issues at the intersection of data privacy

Training Resources

CPCP Seminar: Big Data in Behavioral Medicine Seminar Video

Complex chronic diseases are creating a growing burden on society. This burden includes both reduced quality of life for many individuals and the financial cost of treatment. Every year a large percentage of deaths in the United States are caused by poor diet, physical inactivity or substance abuse (primarily tobacco). These problems are fundamentally behavioral in nature. In addition, developmental disorders such as autism are diagnosed by qualification of behaviors. Dr. James Rehg talks about the role of Big Data in these types of behavioral health disorders. Dr. Rehg and his colleagues work with the new types of sensors that are becoming increasingly available to measure behavioral patterns. They have developed a number of computational models to improve the analysis of these measurements. These models allow them to make quantitative statements about what types of therapies have the greatest effects on behavior.

CPCP Privacy Symposium 2016: Privacy Preserving Federated Biomedical Data Analysis Symposium Video

Learn from Dr. Xiaoquian Jiang about the challenges associated with technical approaches for using data from multiple sources to build more accurate machine learning algorithms. We know that having more types of data, and data from distributed sources, provides a stronger platform for research and discovery with machine learning. To address privacy in this context, Dr. Jiang proposes a privacy-preserving distributed data framework and describes various models implemented to solve such problems. Dr. Jiang's research group has produced versions of this framework in R and Java as well as an online web service. All versions of this framework are available for other researchers to use in their own analyses.

CPCP Privacy Symposium 2016: Privacy is an Essentially Contested Concept Symposium Video

What does privacy mean in the context of Big Data? Dr. Deirdre Mulligan discusses various definitions of privacy in law, philosophy and computer science. Traditional approaches to privacy in data place most of the responsibility for controlling the flow of private information on the individual, with mechanisms such as consent. This idea, known as informational actualization, has limitations that have been exposed by machine learning on big data. These limitations lead to violations of privacy, such as uncovering the identity of individuals where it has been withheld, or unexpected inferences made from data that have been intentionally disclosed. Dr. Mulligan suggests new ways of viewing privacy that evolve as social life and technology change.

CPCP Privacy Symposium 2016: Proving that Programs Do Not Discriminate Symposium Video

As the field of Artificial Intelligence (AI) continues to advance, an increasing number of predictions are made by computer programs about humans. These predictions affect decisions made about humans in a wide variety of areas, including who should get the job, the bank loan, or early release from prison. As we increasingly rely on AI programs to help make decisions about people's lives, it becomes vitally important that we are able to ensure the programs we depend on are not unfairly biased against certain groups of people. Dr. Aws Albarghouthi of the University of Wisconsin-Madison Computer Sciences department uses his expertise in programming languages to address this issue of fairness.

CPCP Privacy Symposium 2016: Panel Discussion Symposium Video

Dr. Pilar Ossorio from the Morgridge Institute for Research at the University of Wisconsin-Madison and Dr. Peggy Peissig from the Biomedical Informatics Research Foundation join Dr. Aws Albarghouthi, Dr. Deirdre Mulligan, and Dr. Xiaoquian Jiang to answer questions from the audience about privacy and fairness in the context of computational analysis.

Recent Publications

Anxiety-related experience-dependent white matter structural differences in adolescence: A monozygotic twin difference approach. Adluru N, Luo Z, VanHulle CA, Schoen AJ, Davidson RJ, Alexander AL, Goldsmith HH. Scientific Reports, 7(1): 8749, 2017

Machine learning consensus scoring improves performance across targets in structure-based virtual screening. Ericksen S, Wu H, Zhang H, Michael L, Newton M, Hoffmann FM, Wildman S. Journal of Chemical Information and Modeling 57(7):1579–1590, 2017

Pharmacovigilance via baseline regularization with large-scale longitudinal observational data. Kuang Z, Peissig P, Santos Costa V, Maclin R, Page D. Proceedings of the 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2017

A review of active learning approaches to experimental design for uncovering biological networks. Sverchkov Y, Craven M. PLoS Computational Biology, 2017

MetaSRA: normalized human sample-specific metadata for the Sequence Read Archive. Bernstein M, Doan A, Dewey C. Bioinformatics, 2017

Recent Resources

CMINT software

Chromatin Module INference on Trees (CMINT) is an algorithm for learning chromatin modules, defined as groups of genomic loci that have similar chromatin states. Chromatin states in turn are defined by a combination of chromatin mark profiles.

scDD software

scDD is an R package to identify genes with distributional changes across conditions in a single-cell RNA-seq experiment.

scPattern software

scPattern is an R package to identify and classify gene expression changes in ordered single-cell RNA-seq experiments.

RIPPLE software

Regulatory interaction prediction for promoters and long-range enhancers

MetaSRA: normalized metadata for the Sequence Read Archive data

MetaSRA is an annotation/re-coding of sample-specific metadata in the Sequence Read Archive using biomedical ontologies. Currently, MetaSRA maps biological samples to biologically relevant terms in the Disease Ontology, Experimental Factor Ontology, Cell Ontology, Uberon, and Cellosaurus.