Instructor: Prof. Megan Owen (she/her)
E-mail: megan.owen@lehman.cuny.edu
Phone: 718-960-7423
Office hours: Tuesdays 3:30 - 4:15pm and 6 - 6:30pm, Thursdays 3 - 4:15pm, Gillet 137E or Gillet 231
Course time: Tuesday and Thursday, 4:15 - 5:55pm, Gillet 231

Syllabus

Textbooks:

R and RStudio:

We will use the R statistical computing language, run in RStudio, to analyze data.

Project

MAT 782 Presentation

Outline:

Date: Topics: Reading: Code and data from class, and visualizations: Homework and project deadlines:
#1
Tues 28 January
Syllabus, academic integrity code;
types of data, sample spaces, discrete random variables;
intro to RStudio: importing CSV files, math, line plots
Syllabus
Academic integrity policy

Intro to Probability Theory: Lesson 1, sections 1-4
Intro to Probability Theory: Lesson 7, section 1

Math in R
Importing data into RStudio
Line plot: add type = "l" as a parameter to code for a scatter plot
nycHistPop.csv
code from class
#2
Thurs 30 January
Histograms, barplots, probability mass functions, empirical distribution
Intro to Probability Theory: Lesson 1, section 5
Intro to Probability Theory: Lesson 7, sections 1-2

R: histograms
R: barplots
Yellow taxi data: jan30_2019_yellow_taxi.csv (from NYC OpenData)
code from class
#3
Tues 4 February
Measures of center, expectation of a random variable
Intro to Probability Theory: Lesson 8, sections 1, 3, 5
Online Stat Book: Chapter III, section 4 (median and mode)
Online Stat Book: Chapter III, section 10, part 5 (trimmed mean)

R: mean, median
Washington DC bikeshare data: hour.csv (originally downloaded from the UCI Machine Learning Repository and modified)
code from class
Homework 1
#4
Thurs 6 February
Measures of spread, order statistics, variance of a random variable
Intro to Probability Theory: Lesson 8, sections 4, 5
Intro to Probability Theory: Lesson 13, section 3
Online Stat Book: Chapter III, section 13 (range, inter-quartile range)
R: Variance, standard deviation, IQR, min, max, range
NYC subway variability
code from class
Homework 2
#5
Tues 11 February
Box plots, shape of distributions, continuous random variables, probability density function
Intro to Probability Theory: Lesson 13, sections 4-5
Intro to Probability Theory: Lesson 14, section 1


R: boxplot
GitHub

code from class
Homework 3
Milestone 1 (dataset) in class
#6
Thurs 13 February
Bernoulli and binomial random variables
Using GitHub
Cleaning data: renaming columns, missing values
Intro to Probability Theory: Lesson 10, sections 1-2, 4-5
GitHub basics tutorial
R: renaming columns
R: computing binomial distribution density
Visualizing the binomial distribution

Data: starbucks drinks
Data: babies.data

code from class
Homework 4
Milestone 2 (GitHub account) on Blackboard
#7
Tues 18 February
Normal distribution, Central Limit Theorem, probability in R
Intro to Probability Theory: Lesson 16, sections 1-2
Intro to Probability Theory: Lesson 27

R: distributions
R: Testing the Central Limit Theorem
Visualization (interactive): Normal distribution
Visualization (interactive): Central Limit Theorem
Visualization: Central Limit Theorem

code from class
Homework 5
#8
Thurs 20 February
Scatterplots, joint probability distribution, correlation
Intro to Probability: Lesson 17, sections 1-2
Intro to Probability: Lesson 27
Online Stat Book: Chapter IV (Describing Bivariate Data), sections A-E

R: Scatterplots
Correlation guessing game

code from class
Homework 6
Milestone 3 (webpage and data description) in class
#9
Tues 25 February
Likelihood and maximum likelihood estimation
Intro Mathematical Statistics: Lesson 29, sections 1-2
Homework 7
#10
Thurs 27 February
MLE continued and unbiased estimation
Intro to Probability: Lesson 24, section 3
Intro Mathematical Statistics: Lesson 29, sections 2-3
Homework 8
Milestone 4 (distribution plots) in class
#11
Tues 3 March
Subsets in R and Midterm 1 review
subset() function
subset() function
Examples of the subset() function
code from class
Homework 9
#12
Thurs 5 March
Midterm 1
#13
Tues 10 March
Confidence intervals for one mean
Intro Mathematical Statistics: Lesson 30, sections 1-3 and 5-6
t-distribution

code from class
#14
Thurs 12 March
Confidence intervals for two means, variances, and proportions
Intro Mathematical Statistics: Lesson 31
Intro Mathematical Statistics: Lesson 32
Intro Mathematical Statistics: Lesson 33
Homework 10
If applicable: Milestone 5 (missing data and outliers) in class
#15
Tues 17 March
Simple linear regression
Intro Mathematical Statistics: Lesson 35

R: Simple linear regression
Homework 11 (Day 13)
#16
Thurs 19 March
Regression continued
Intro Mathematical Statistics: Lesson 36

R: confidence intervals for regression
R: Multiple linear regression
Homework 12 (Day 14)
Milestone 6 (measures of center and spread) in class
#17
Tues 24 March
Introduction to hypothesis testing; type 1 and 2 errors; p-values; tests about one proportion
Intro Mathematical Statistics: Lesson 37, sections 1-3
R: One proportion test
Homework 13 (Day 15)
#18
Thurs 26 March
Hypothesis tests about two proportions, tests about one mean
Intro Mathematical Statistics: Lesson 37, section 4
Intro Mathematical Statistics: Lesson 38

R: Two proportion z-test
R: One sample t-test
Homework 14 (Day 16)
Milestone 7 (scatterplots and correlation) in class
#19
Tues 31 March
Hypothesis testing for two means
Intro Mathematical Statistics: Lesson 39, sections 1-2
R: Two sample t-test
Homework 15 (Day 17)
Wed 1 April
Last day to withdraw from class with a grade of W
#20
Thurs 2 April
Midterm 2 review
Homework 16 (Day 18)
Milestone 8 (confidence intervals)
Tues 7 April
Wednesday schedule: no MAT 327/782 class
8-16 April
Spring recess: no classes
#21
Tues 21 April
Midterm 2
#22
Thurs 23 April
Analysis of variance (ANOVA)
Intro Mathematical Statistics: Lesson 41
R: ANOVA
Visualizing ANOVA
Milestone 9 (linear regression) in class
#23
Tues 28 April
Analysis of variance (ANOVA) continued
Intro Mathematical Statistics: Lesson 41
R: ANOVA
Homework 17 (Day 19)
#24
Thurs 30 April
Chi-squared goodness-of-fit test
Intro Mathematical Statistics: Lesson 44, sections 1-4
Homework 18 (Day 22)
Milestone 10 (hypothesis test) in class
#25
Tues 5 May
Contingency tables
Intro Mathematical Statistics: Lesson 45
Homework 19 (Day 23)
#26
Thurs 7 May
Bayesian statistics
Intro Mathematical Statistics: Lesson 52
Homework 20 (Day 24)
Milestone 11 (your choice) in class
#27
Tues 12 May
Project presentations, Bayesian statistics continued
Homework 21 (Day 25)
Project slides due at 10am
#28
Thurs 14 May
Review for final exam
Tues 19 May
Final exam, 3:45pm - 5:45pm