Styling and infrastructure for this page were inspired by related syllabi produced by Ben Baumer and R. Jordan Crouser.

All readings for this course will be available in our course Perusall, which is linked in Moodle. I encourage you to complete the readings there so that you can leave comments and questions as they come up.

January 24, 2022

What is Data Science?

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

If you wish, you may fill out the Trigger Warnings Questionnaire in Moodle.
Be sure to configure Slack notifications for our course. I also encourage you to download and install the desktop version of Slack.
Class slides are here

January 26, 2022

What are datasets?

Due Today

No Readings

Fill out First Day of Class Questionnaire in Moodle
Review the grading contract and post questions in Perusall
Contact me if you will be using a Chromebook

Optional Further Reading

No Readings

Announcements

Class slides are here
Today’s worksheet is here

January 28, 2022

Lab: Introduction to R

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Trigger warning: Monday’s reading opens with a case study on gun homicide in the U.S.
The CS Liaisons will be hosting a software install session from 2:30-4 today. More information soon.
Today’s lab is posted here.
I will remain on the call for one hour past our scheduled class time to help anyone who would like to finish the lab. Please note that I won’t be able to do this every week.

January 31, 2022

Data Fundamentals

Due Today

2. R Basics , Irizarry, Rafael A. (2022). Introduction to Data Science. Data Analysis and Prediction Algorithms with R. URL: https://rafalab.github.io/dsbook/ (visited on Jan. 14, 2022).

Finish RStudio/GitHub Set-up

Optional Further Reading

No Readings

Announcements

You do not need to follow along with exercises in the course texts but may choose to do so if you wish.
Please thread responses to messages in Slack.
Class slides are here
Today’s worksheet is in Moodle. Download it and then open it in RStudio.

February 02, 2022

Summarizing Data

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Class slides are here
Today’s worksheet is in Moodle. Download it and then open it in RStudio.
Monday’s solutions are here.

February 04, 2022

Lab: Learning to Code

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Today’s and all future labs are here.
Last Wednesday’s solutions are here.

February 07, 2022

Grammar of Graphics

Due Today

2. Data Visualization , Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton (2021). Modern Data Science with R. 2nd. CRC Press. URL: https://mdsr-book.github.io/mdsr2e/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

February 09, 2022

Visualization Conventions

Due Today

11. Data Visualization Principles , Irizarry, Rafael A. (2022). Introduction to Data Science. Data Analysis and Prediction Algorithms with R. URL: https://rafalab.github.io/dsbook/ (visited on Jan. 14, 2022).

Optional Further Reading

Tufte, Edward R. (2001). The Visual Display of Quantitative Information. 2nd edition. Cheshire, Conn: Graphics Press. ISBN: 978-1-930824-13-3.
D’Ignazio, Catherine and Lauren Klein (2020). “3. On Rational, Scientific, Objective Viewpoints from Mythical, Imaginary, Impossible Standpoints”. In: Data Feminism. MIT Press. URL: https://data-feminism.mitpress.mit.edu/pub/5evfe9yd/release/1 (visited on Aug. 24, 2021).

Announcements

Class slides are here.
Quiz 1 has been posted and is due next Wednesday.
Monday’s solutions are here.

February 11, 2022

Lab: Designing Effective Data Visualizations

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Today’s and all future labs are here.
The ggplot() cheatsheet is here.
Solutions to Wednesday’s exercises are here.
Remember our data challenge!

February 14, 2022

Frequency Plots

Due Today

2. Data Visualization , Ismay, Chester and Albert Y. Kim (2021). Modern Dive: Statistical Inference via Data Science. CRC Press. URL: https://moderndive.com/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

Class slides are here
Friday’s solutions are here

February 16, 2022

Boxplots

Due Today

No Readings

Quiz 1

Optional Further Reading

No Readings

Announcements

Class slides are here
Quiz 2 posted today.
Mini-project 1 will be posted on Friday.

February 18, 2022

Lab: Designing Multivariate Plots

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Day 10 solutions are here.
Day 11 solutions are here.

February 21, 2022

GitHub Essentials

Due Today

Bryan, Jennifer (2018). “Excuse Me, Do You Have a Moment to Talk About Version Control?” In: The American Statistician 72.1, pp. 20-27. DOI: 10.1080/00031305.2017.1399928. URL: https://doi.org/10.1080/00031305.2017.1399928 (visited on Jan. 14, 2022).

Optional Further Reading

Brennan, Stephen (2022). GitHub for Non-Coders. URL: https://brennan.io/2015/08/07/github-noncoders/ (visited on Jan. 14, 2022).

Announcements

Class slides are here

February 23, 2022

Authoring Documents in R Markdown

Due Today

D Reproducible analysis and workflow , Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton (2021). Modern Data Science with R. 2nd. CRC Press. URL: https://mdsr-book.github.io/mdsr2e/ (visited on Jan. 14, 2022).

Quiz 2

Optional Further Reading

27. R Markdown , Wickham, Hadley and Garrett Grolemund (2017). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 1st edition. Sebastopol, CA: O’Reilly Media. ISBN: 978-1-4919-1039-9.

Announcements

Class slides are here
Day 12 solutions are here

February 25, 2022

Recap

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Snow Day!

February 28, 2022

Subsetting Data

Due Today

3. Data Wrangling (3.1-3.3) , Ismay, Chester and Albert Y. Kim (2021). Modern Dive: Statistical Inference via Data Science. CRC Press. URL: https://moderndive.com/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

Day 14 solutions are here
Class slides are here
MP1 due Friday!
If you struggled on Quiz 2, be sure to study the y-axis labels on all of our frequency plots and pay attention to units of observation!
Trigger warning: This week’s lab will review data that demonstrates racial profiling in policing. We will be reproducing the NYCLU’s data analysis of stop and frisk in NYC in 2011.

March 02, 2022

Aggregating and Summarizing Data

Due Today

3. Data Wrangling (3.4-3.6) , Ismay, Chester and Albert Y. Kim (2021). Modern Dive: Statistical Inference via Data Science. CRC Press. URL: https://moderndive.com/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

Class slides are here

March 04, 2022

Lab: Exploratory Data Analysis

Due Today

No Readings

Mini-Project 1: Profile a dataset

Optional Further Reading

No Readings

Announcements

March 07, 2022

Importing and Cleaning Datasets

Due Today

5. Importing Data , Irizarry, Rafael A. (2022). Introduction to Data Science. Data Analysis and Prediction Algorithms with R. URL: https://rafalab.github.io/dsbook/ (visited on Jan. 14, 2022).
26. Parsing dates and times , Irizarry, Rafael A. (2022). Introduction to Data Science. Data Analysis and Prediction Algorithms with R. URL: https://rafalab.github.io/dsbook/ (visited on Jan. 14, 2022).

Optional Further Reading

25. String processing , Irizarry, Rafael A. (2022). Introduction to Data Science. Data Analysis and Prediction Algorithms with R. URL: https://rafalab.github.io/dsbook/ (visited on Jan. 14, 2022).

Announcements

Class slides are here
Topics and due dates have been updated since Friday’s discussion
Today’s office hours are in-person
The first 15 minutes of office hours on Wednesday will be devoted to Quiz 2 review.
Solutions for lab 18 will be posted by end of day.

March 09, 2022

Data Wrangling Practice

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

March 11, 2022

Lab: Tidying Datasets

Due Today

No Readings

Quiz 3

Optional Further Reading

Wickham, Hadley (2014). “Tidy Data”. In: Journal of Statistical Software 59, pp. 1-23. DOI: 10.18637/jss.v059.i10. URL: https://doi.org/10.18637/jss.v059.i10 (visited on Jan. 14, 2022).

Announcements

March 21, 2022

Pivoting Datasets

Due Today

6. Tidy Data , Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton (2021). Modern Data Science with R. 2nd. CRC Press. URL: https://mdsr-book.github.io/mdsr2e/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

Class slides are here

March 23, 2022

Joining Datasets

Due Today

5. Data wrangling on multiple tables , Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton (2021). Modern Data Science with R. 2nd. CRC Press. URL: https://mdsr-book.github.io/mdsr2e/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

Class slides are here

March 25, 2022

Practice Pivoting and Wrangling

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Class slides are here

March 28, 2022

Working with APIs

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Class slides are here
MP2 due Wednesday; last day to request an extension (by 5PM)
Quiz 4 due Wednesday (5PM)

March 30, 2022

Lab: Joining Data Extracted from the Web

Due Today

No Readings

Mini-Project 2: Wrangle a dataset

Quiz 4

Optional Further Reading

No Readings

Announcements

Class slides are here

April 01, 2022

Recap

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

April 04, 2022

Geographic Data and Spatial Projections

Due Today

17. Working with geospatial data (17.1-17.3) , Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton (2021). Modern Data Science with R. 2nd. CRC Press. URL: https://mdsr-book.github.io/mdsr2e/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

April 06, 2022

Mapping Point Data in Leaflet

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Class slides are here

April 08, 2022

Lab: Interpreting Maps

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

April 11, 2022

Mapping Polygon Data

Due Today

17. Working with geospatial data (17.4-17.8) , Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton (2021). Modern Data Science with R. 2nd. CRC Press. URL: https://mdsr-book.github.io/mdsr2e/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

April 13, 2022

Adding Layers to Maps

Due Today

No Readings

Quiz 5

Optional Further Reading

No Readings

Announcements

Class slides are here
Quiz 5 due at 5PM.
MP3 soft deadline this Friday.
Class will be in-person on Friday.

April 15, 2022

Lab: Choropleth Maps

Due Today

No Readings

Mini-Project 3: Acquire

Optional Further Reading

No Readings

Announcements

Class slides are here

April 18, 2022

Functions and Iteration in R

Due Today

7. Iteration , Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton (2021). Modern Data Science with R. 2nd. CRC Press. URL: https://mdsr-book.github.io/mdsr2e/ (visited on Jan. 14, 2022).

Optional Further Reading

No Readings

Announcements

Class slides are here

April 20, 2022

Lab: Designing Functions

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Class slides are here

April 22, 2022

Recap

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

Class slides are here

April 25, 2022

Final Projects

Due Today

No Readings

Optional Further Reading

No Readings

Announcements

April 27, 2022

Final Projects

Due Today

No Readings

Quiz 6

Optional Further Reading

No Readings

Announcements

April 29, 2022

Final Projects

Due Today

No Readings

Mini-Project 4: Mapping Census Data

Quiz 7 due by the last day of finals

Optional Further Reading

No Readings

Announcements