UC Berkeley Data Science Modules

Data Science for the Social Sciences

It is no secret that data now touches nearly every facet of human life. Researchers across fields are adopting new techniques that apply data science and statistics to their own disciplines. At the Data Science Education Program (DSEP) at UC Berkeley, we believe in giving students opportunities to take part in this paradigm shift.

Feel free to browse the curriculum resources listed on the left of the screen.

Data Science at Berkeley

In the Fall of 2018, the UC Berkeley Division of Data Sciences and Information launched the official Bachelor of Arts in Data Science major. Even before then, many years of planning went into increasing undergraduates’ exposure to the data sciences at Cal. Since its inception in 2017, Data 8: The Foundations of Data Science has become the fastest-growing class in Berkeley’s history. That rising popularity led to the creation of many other data science classes, such as the supplemental 2-unit Connector Courses, as well as the liberal arts-focused Modules program.

Data Science Education Program (DSEP)

The DSEP team at UC Berkeley is an autonomous working arm of the Division of Data Sciences and Information. The team is made up of professors and undergraduates motivated to improve the way data science is taught across college and K-12 education systems. Our mission is to democratize data science pedagogy by integrating statistical analysis lectures and curriculum materials into the social sciences.


This compilation of modules in textbook format was curated by Alex Nakagawa, Shalini Kunapuli, and Christopher Pyles. Contributors to individual chapters are recognized on each page of this textbook.

Jupyter Books

This website was created using Jupyter Books. Jupyter Books lets you build an online book from a collection of Jupyter Notebooks and Markdown files. Its output is similar to that of the excellent Bookdown tool, with extra functionality for people running a Jupyter stack.

For an example of a book built with Jupyter Books, see the textbook for Data 100 at UC Berkeley.

Here are a few features of Jupyter Books:

  • All course content is written in Markdown and Jupyter Notebooks, stored in notebooks/.
  • The Jupyter Book repo comes packaged with helper scripts (in scripts/) that convert these files into Jekyll pages, which can be hosted for free on GitHub.
  • Pages can have Binder or JupyterHub links added automatically for interactivity.
  • The website itself is based on Jekyll, so it is highly extensible and can be hosted for free on GitHub.
  • There are lots of nifty HTML features under the hood, such as Turbolinks fast navigation and click-to-copy in code cells.
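
To give a rough sense of what converting a notebook into a Jekyll page involves, here is a minimal sketch in Python. It is not the code in scripts/, and the names notebook_to_jekyll_markdown and example are hypothetical, introduced only for illustration. It assumes the standard nbformat 4 layout, in which a notebook is a JSON file whose "cells" list holds entries with a "cell_type" and a "source", and it ignores cell outputs entirely.

```python
import json

FENCE = "`" * 3  # Markdown code-fence delimiter, built here to keep this listing simple


def notebook_to_jekyll_markdown(notebook_json: str, title: str) -> str:
    """Render nbformat-4 notebook JSON as a Markdown page with Jekyll front matter.

    Illustrative only: the real Jupyter Book helper scripts may work differently.
    """
    notebook = json.loads(notebook_json)
    # Jekyll reads the YAML block between the two '---' lines as page metadata.
    parts = ["---", f"title: {json.dumps(title)}", "---", ""]
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # nbformat allows source to be a single string or a list of line strings.
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") == "markdown":
            parts.append(source)
        elif cell.get("cell_type") == "code":
            parts.append(f"{FENCE}python\n{source}\n{FENCE}")
        else:
            continue  # skip raw cells and anything unrecognized
        parts.append("")  # blank line between rendered cells
    return "\n".join(parts)


# A tiny in-memory notebook (hypothetical content) to exercise the function.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Intro\n", "Some text."]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": "print(1 + 1)"},
    ],
})

page = notebook_to_jekyll_markdown(example, "Intro")
print(page)
```

The printed page starts with the front matter block that Jekyll uses for the page title, followed by the Markdown cell text and the code cell wrapped in a fenced block. A real pipeline would also need to handle cell outputs, images, and the Binder or JupyterHub links mentioned above.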