Introductions

SDS 192: Introduction to Data Science

Professor Lindsay Poirier

What is data science?: Common view

  • interdisciplinary field combining computer science, mathematics/statistics, and domain expertise to extract meaningful information from unstructured data points

Hckum, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

What is data science?: My view

  • interdisciplinary field combining computer science, mathematics/statistics, and domain expertise to extract meaningful information from unstructured data points
  • also involves art, design, hermeneutics, communication, and ability to grapple with ethical dilemmas

Case Study 1: ACLU Fights Discriminatory Housing

  • American Civil Liberties Union employs data scientists to produce insights regarding discriminatory laws and practices
  • Findings are presented in courts, legislatures, and public reports
  • In this study, they use public data to show that eviction screening policies can deny fair housing to Black renters, and particularly Black women.

Case Study 2: Cal EPA Tracks Environmental Injustice

  • The California Environmental Protection Agency hires data scientists to produce insights regarding environmental health risks
  • Findings have implications for environmental policies, funding allocations, and legal actions against states and industries
  • This tool visualizes environmental and demographic indicators to highlight communities experiencing environmental injustices in California.

Case Study 3: Geena Davis Institute Studies Gender Biases in Films

  • The Geena Davis Institute collaborated with the University of Southern California’s Signal Analysis and Interpretation Laboratory (SAIL)
  • Together they developed a machine learning tool that measures the representation of diverse groups in films by studying screen time and speaking time

Topics covered in this course

  • data visualization
  • data wrangling
  • programming with data
  • mapping
  • data retrieval
  • data science infrastructures and workflows
  • data science ethics

Who is the professor? Why is an anthropologist teaching data science?

  • Please call me Lindsay (preferred), Professor Poirier, or Dr. Poirier
  • Assistant Professor of SDS and cultural anthropologist
  • Previously Assistant Professor of Science and Technology Studies at UC Davis
  • Lab Manager at BetaNYC
  • M.S./Ph.D. in Science and Technology Studies from Rensselaer Polytechnic Institute
  • B.S. in Information Technology and Web Science from Rensselaer Polytechnic Institute
  • Dancing, crafting, cooking, reading, learning French
  • I have a very spunky dog, Madison.

Exercise

Demonstration: Find the class year whose students drink the most cups of coffee per day on average. Repeat for the remaining class years. Arrange the class years from highest to lowest average cups of coffee per day. Plot the results. How could we perform the same task for every class on campus?
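
The steps in this demonstration (group by class year, average, arrange, plot) can be sketched in code. This is only an illustrative sketch under stated assumptions: the course itself works in R, while this example uses Python's standard library; the survey responses are invented for illustration, and the names `responses` and `average_cups_by_group` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Invented, illustrative responses: (class year, cups of coffee per day).
# These are NOT real survey results.
responses = [
    ("First-year", 1), ("First-year", 0),
    ("Sophomore", 2), ("Sophomore", 1),
    ("Junior", 3), ("Junior", 2),
    ("Senior", 2), ("Senior", 4),
]


def average_cups_by_group(rows):
    """Return (group, mean cups) pairs, sorted from highest to lowest mean."""
    cups = defaultdict(list)
    for group, n_cups in rows:
        cups[group].append(n_cups)  # collect every response for the group
    averages = [(group, mean(values)) for group, values in cups.items()]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)


ranked = average_cups_by_group(responses)

# A rough text "plot": one bar per class year, four '#' per cup.
for group, avg in ranked:
    print(f"{group:<12}{'#' * round(avg * 4)} {avg:.2f}")
```

Wrapping the steps in a function is one possible answer to the last question: instead of repeating the work by hand, the same function could be called once per class on campus, each time with that class's responses.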

Ethics Framework

  • What assumptions and commitments informed the design of this dataset?
  • Who has had a say in data collection and analysis regarding this dataset? Who has been excluded?
  • What are the benefits and harms of this dataset, and how are they distributed amongst diverse social groups?

Coding can be intimidating!

  • Coding is like learning a new language. When you are first learning it, it all feels completely unfamiliar. I will work to support you in building the vocabulary and syntax to code in R.
  • Coding can be frustrating. I regularly lose hours of my day trying to find bugs in my code. I will work to give you resources and skills to navigate coding frustrations.
  • Coding social environments have historically been exclusionary. I will work to reduce barriers to coding in whatever ways I can.

Syllabus Highlights

  • Attendance/Extensions
  • Academic Integrity and Generative AI
  • Accommodations
  • Mental Health and Wellness

Prepping for this Class

  • Navigating Course Website
  • Grading
  • Perusall
  • Slack

For Monday

  • Install Slack Desktop and set notifications
  • Fill out first day of class questionnaire
  • Create a GitHub account if you don’t have one
  • Accept SDS 192 Labs Assignment in Moodle
  • Let me know ASAP if you will be using a Chromebook