Date: | Topics: | Lab and Handouts: | Reading: | Classwork & Quiz Topics: | |
#1
Tues 27 August |
Syllabus; What is Data Science; Introduction to Python (math, variables, and printing); line plots with Pandas | Syllabus Citi Bike data example Data Science Process Lab 1 - Introduction to Python and Pandas (Jupyter notebook) nycHistPop.csv |
Academic Integrity Policy, 3.1,3.2,3.3 Variables Line graphs |
Online quiz: Academic Integrity | |
#2
Thurs 29 August |
Statistical variables; proportions; column operations | Lab 2 - Plotting NYC's shelter population | Statistical variables | Classwork: Statistical variables | |
Mon 2 September | CUNY: No classes (Labor Day) | ||||
#3
Tues 3 Sept |
Bar charts | Lab 3 - Bar charts Sept3_2018_Green_Taxi_Trip_Data.csv |
7.1 | Online quiz: variables and functions (all of Lab 1 except the plotting section) | |
Thurs 5 Sept | Classes follow a Monday schedule | ||||
#4
Tues 10 Sept |
Histograms | Lab 4 - Histograms | 7.2, Histograms | Online quiz: Lab 1, statistical variables | |
#5
Thurs 12 Sept |
Mean, median, and mode; filtering | Lab 5 - Mean, Median, and Mode |
Online Stats: Median and mean Non-technical overview: mean, median, mode |
Online quiz: Lab 2 | |
#6
Tues 17 Sept |
Measures of Spread: range, variance, and standard deviation | Lab 6 - Measures of spread (Range, Variance, and Standard Deviation) | Measures of Variability Subway trip variability |
Classwork: mean, median, and variance | |
#7
Thurs 19 Sept |
Behavior of sample vs. population; boxplots | Lab 7 - Samples and boxplots | 10.2 Sampling from a Population, percentiles, boxplots | Online quiz: Labs 3 and 4 | |
#8
Tues 24 Sept |
Introduction to probability, computing probabilities, filtering | Lab 8 - Computing probabilities | 9.5 Finding Probabilities,
Introduction to Probability, Computing probabilities Filtering |
Paper quiz: Labs 1 and 2 (assignments 1-4), statistical variables; 1 sheet of paper (8 1/2" x 11") with handwritten notes on both sides is allowed Sample quiz |
|
#9
Thurs 26 Sept |
Filtering | Lab 9 - Filtering imdb_1000.csv |
Filtering with Pandas | Classwork | |
30 September - 1 October | CUNY: No classes | ||||
#10
Thurs 3 October |
Computing probabilities with and/or, subsets of dataframes |
Lab 10 - Computing probabilities 2- Sept17_2019_311_Service_Requests.csv |
9.5 Finding Probabilities | Online quiz: review, Labs 5 and 6 | |
8-9 October | CUNY: No classes | ||||
#11
Thurs 10 October |
Iteration, Sampling and Empirical Distributions | Lab 11 - Iteration and Sampling Distributions | 10.3 Empirical Distribution of a Statistic 9.3 Iteration Iteration with turtles |
Paper quiz: Labs 3 and 4 (assignments 5-8); 1 sheet of paper (8 1/2" x 11") with handwritten notes on both sides is allowed Sample quiz |
|
#12
Tues 15 October |
Comparing distributions visually, date and time in pandas | Lab 12 - Comparing distributions visually | Online quiz: review, Labs 7, 8, and 9 | ||
#13
Thurs 17 October |
Simulations and hypotheses | Lab 13 - Simulations and hypotheses | 11.1 Assessing Models Introduction to Hypothesis Test |
Classwork: Introduction to hypotheses | |
#14
Tues 22 October |
Hypothesis testing of proportions | Lab 14 - Hypothesis testing of proportions |
11.1 Assessing Models Introduction to Hypothesis Test |
Paper quiz: Labs 5, 6, and 7 (assignments 9-14); 1 sheet of paper (8 1/2" x 11") with handwritten notes on both sides is allowed Sample quiz |
|
#15
Thurs 24 October |
Hypothesis testing of proportions continued | Lab 15 - Hypothesis testing of proportions continued | Online quiz: Review, Labs 10, 11, 12 | ||
#16
Tues 29 October |
Bootstrap and confidence intervals | Lab 16 - Bootstrap and confidence intervals | 13.1 Percentiles 13.2 The Bootstrap 13.3 Confidence Intervals Much more detail about the dataset |
Online quiz: Review and Labs 13 and 14 | |
#17
Thurs 31 October |
Normal distributions | Lab 17 - Normal distributions | 14.3 The SD and the Normal Distribution Online stats book: normal distributions Visualizing the normal distribution |
Paper quiz: Labs 8, 9, 10 (assignments 15-20); 1 sheet of paper (8 1/2" x 11") with handwritten notes on both sides is allowed Sample quiz |
|
5 November | Last day to withdraw from class with a grade of W | ||||
#18
Tues 5 November |
Central Limit Theorem | Lab 18 - Central Limit Theorem Data: starbucks-menu-nutrition-drinks.csv |
14.4 The Central Limit Theorem 14.5 The Variability of the Sample Mean Visualization of the Central Limit Theorem |
Classwork: Probabilities | |
#19
Thurs 7 November |
Functions and conditional statements | Lab 19 - Functions and conditional statements | 8.1 Applying a function to a column 9.1 Conditional statements |
Online quiz: Review, Labs 15 and 16 | |
#20
Tues 12 November |
Correlation, Causation, and Heat maps | Lab 20 - Correlation, causation, and heat maps Data: Feb2019_labor_market_majors.csv |
Spurious correlations Correlation guessing game 15.1 Correlation Online stats book: intro to correlation |
Paper quiz: Labs 11 and 12 (assignments 21 - 24); 1 sheet of paper (8 1/2" x 11") with handwritten notes on both sides is allowed Sample quiz (note that labs 10 and 11 in the sample quiz are labs 11 and 12 this term) |
|
#21
Thurs 14 November |
Simple linear regression, checking residuals for normality | Lab 21 - Simple linear regression Data: Feb2019_labor_market_majors.csv |
15.2 The Regression Line 15.5 Visual Diagnostics Introduction to linear regression Visual explanation of linear regression |
Classwork: linear regression | |
#22
Tues 19 November |
Multi-linear regression, R-squared, and prediction | Lab 22 - Multi-linear regression, R-squared, and prediction | 15.4 Least Squares Regression 17.6 Multiple Regression |
Online quiz: Labs 17 and 18 | |
#23
Thurs 21 November |
Confidence intervals for the slope of linear regression | Lab 23 - Confidence intervals for regression | 16.1 A regression model, 16.2 Inference for the true slope | Paper quiz: Labs 13, 14, and 15 (assignments 25 - 30); 1 sheet of paper (8 1/2" x 11") with handwritten notes on both sides is allowed Sample quiz |
|
#24
Tues 26 November |
Intro to Machine Learning: understanding the data | Lab 24 - Understanding the Titanic data Kaggle: Titanic: Machine Learning from Disaster train.csv test.csv |
Classwork: Understanding the Titanic data | ||
28 November - 1 December | Thanksgiving Recess: College Closed | ||||
#25
Tues 3 December |
k-nearest neighbors (machine learning) | Lab 25 - k-Nearest Neighbors classifier 1 | 17 Classification 17.1 Nearest Neighbors 17.2 Training and Testing KNN classification using Scikit-Learn (Datacamp) |
Paper quiz: Labs 16 (bootstrap and confidence interval), 17 (normal distribution), 18 (Central Limit Theorem), 19 (functions), 20 (correlation and heatmaps) Sample quiz |
|
#26
Thurs 5 December |
k-nearest neighbors continued (machine learning) | Lab 26 - k-Nearest Neighbors classifier 2 | |||
#27
Tues 10 December |
The data science process revisited | Lab 27 - The data science process revisited | Paper quiz: Labs 21, 22, 23 (linear regression) Sample quiz |
||
#28
Thurs 12 December |
Review | Sample final Spring 2019 (answers) Final Spring 2019 (answers) |
|||
Thurs 19 December | Final exam 1:30pm - 3:30pm, Gillet 231 |