Saturday, September 10, 2016

Population Health Data Science with R

Transforming data into actionable knowledge

I am writing this book to introduce R, a language and environment for statistical computing and graphics, to health data analysts conducting population health studies. In my experience in public health practice, even formally trained epidemiologists sometimes lack the breadth of analytic skills required at health departments where resources are very limited. Recent graduates come prepared with a solid foundation in epidemiological and statistical concepts and principles, and they are ready to run a multivariable analysis (which is not a bad thing; we are grateful for highly trained staff). However, what is sometimes lacking is the practical knowledge, skills, and abilities to collect and process data from multiple sources (e.g., Census data; reportable diseases, death and birth registries) and to adequately implement new methods they did not learn in school. One approach to implementing new methods is to look for the “commands” in their favorite statistical packages (or to buy a new software program). If the commands do not exist, then the method may not be implemented. In a sense, they are looking for a custom-made solution that makes their work quick and easy.

Sunday, September 4, 2016

Applied Epidemiology Using R, 2016

Public Health 215D, 2016, fall

UC Berkeley School of Public Health
Division of Epidemiology
Mondays 4pm-6pm, Valley Life Sciences 2030
Berkeley Academic Calendar:

Course description

This is an intensive one-semester introduction to the R programming language for applied epidemiology. This year we will be experimenting with a population health data science perspective. Population health is a systems framework for studying and improving the health of a population through collective action and learning. Data science is the art and science of transforming data into actionable knowledge. Population health data science is the art and science of transforming public health data into actionable knowledge to improve population health. The key words are “actionable knowledge.” Traditionally, epidemiology has focused primarily on descriptive and explanatory (causal) methods. Data science extends this to include exploratory, predictive, and prescriptive methods.

The core of population health data science is the timely analysis and synthesis of data using programming and computing power. Fortunately for us, we have R! R is a freely available, multi-platform (Linux, Mac OS, Windows, etc.), versatile, and powerful program for statistical computing and graphics. This course will focus on the core basics of organizing, managing, and manipulating population health data; basic population health applications; an introduction to R programming; and basic R graphics. Students will complete and present a project in their field of interest.

Course book

Population Health Data Science

This book is early in its development and feedback is welcome and appreciated.