Welcome to CSCI 9
These course notes were compiled for CSCI 9: Practical Data Science at El Camino College. The course prepares you for advanced work in data management, machine learning, and statistics, and gets you working with real-world data using industry-standard tools. You will use pandas for DataFrames and tabular data; matplotlib, seaborn, and plotly for visualization; scikit-learn for modeling (e.g., linear regression, decision trees); and SQL for relational databases. We move from working with and visualizing tabular data to sampling, linear regression, and SQL. The course builds on CSCI 8 (where you may have used the datascience library, tables, and basic modeling) and emphasizes the full data science lifecycle. The goal is not to memorize syntax but to learn to look up documentation, think critically about data, and communicate findings.
How to use this textbook
The live textbook is available at https://
Launch and run code
Run in Binder or your own hub
Other actions
Acknowledgments
CSCI 9 draws on materials from two courses: Data 100 at UC Berkeley (Principles and Techniques of Data Science) and DSC 80 at UC San Diego (Practice and Application of Data Science). Our conceptual framework and treatment of the data science lifecycle follow those courses, so you can use ds100.org and dsc80.com as references alongside these notes.