This is the website for the Introduction to Data Science (MATH08077) course offered at the University of Edinburgh for the academic year 2024/5.
Learn to explore, visualize, and analyse data to understand natural phenomena, investigate patterns, model outcomes, and make predictions, and do so in a reproducible and shareable manner. Gain experience in data collection, wrangling, and visualization; exploratory data analysis; predictive modelling; and effective communication of results, while working on problems and case studies based on real-world questions. The course will focus on the R statistical computing language. No statistical or computing background is necessary. Additional official course information can be found here.
Week 1: Get acquainted with the course, the technology, the workflow, and the skills you will acquire throughout the semester.
Week 2: Data wrangling, joining, and tidying.
Week 3: Importing data, data types and classes, recoding.
Week 4: Uncertainty quantification and hypothesis testing with the bootstrap.
Week 5: Tips for effective communication of results and for collaboration. Introduction to data visualization.
Week 6: Data visualization (part 2), interpretation of graphical information, and tips for effective data visualization.
Week 7: Misrepresentation of findings, data privacy, and algorithmic bias.
Week 8: Linear models for predicting numerical data from single and multiple variables.
Week 9: Logistic regression for predicting categorical data and model building.
Week 10: Evaluating models with cross-validation and further topics in modelling.
Week 11: Additional topics beyond IDS.
Information on the various components of the course.
Information on the assessments for the course.
This online work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International licence (visit here for more information). These materials have been adapted from Data Science in a Box by Mine Çetinkaya-Rundel, which is under the same licence.