Data science involves applying a set of strategies to transform a recorded set of values into something from which we can glean knowledge and insight. This course will introduce you to concepts and methods from the field of data science, along with how to apply them in R. You will learn how to acquire, clean, wrangle, and visualize data. You will also learn best practices in data science workflows, such as code documentation and version control. Issues in data ethics will be addressed throughout the course.
Classes will be held on Mondays, Wednesdays, and Fridays from 10:50 AM to 12:05 PM. The first week of classes will be held virtually, and you can connect to our course via a link available on Moodle. Once in-person, we will meet in Ford 240.
There are no prerequisites, but a willingness to write code is necessary. Coding for the first time can be intimidating, but I intend to do everything in my power to support you through the learning curve and to make things both fun and relevant in the process. I personally picked up most of my data science skills through a lot of trial-and-error, practice, and curiosity. My hope is that, in this course, you will learn through experimentation, along with independent and collaborative problem-solving. Honing these competencies will serve you as you move on to other courses in the SDS program and/or at Smith.
I am a cultural anthropologist who studies how civic data gets produced, how communities think about and interface with data, and how data infrastructure can be designed more equitably. My Ph.D. is in Science and Technology Studies, an interdisciplinary field that studies the intricate ways science, technology, culture, and politics all co-constitute each other. I work on a number of collaborative research projects that leverage public data to deepen understanding of social and environmental inequities in the US, while also qualitatively studying the politics behind data gaps and inconsistencies. As an instructor, I prioritize active learning and often structure courses as flipped classrooms. You can expect in-class time to predominantly involve group activities and live problem-solving exercises.
The best way to get in touch with me is via Slack. I can't guarantee that I will respond to messages sent via email. If you have course-related questions, I encourage you to ask them in the #sds-192-questions channel. When discretion is needed, feel free to DM me. Please reserve more formal concerns like grades or accommodation requests for an in-person (or live virtual) conversation.
During the week, I will try my best to answer all Slack messages within 24 hours of receiving them. Please note that to maintain my own work-life balance, I often don't answer Slack messages on the weekends. It's important that you plan when you start your assignments accordingly.
Student consultation hours are a great opportunity for us to chat about what you're learning in the course, clarify expectations on assignments, and review work in progress. I also love when students drop in to consultation hours to request book recommendations, discuss career or research paths, or just to say hi! I encourage each student in the course to join consultation hours at least once this semester. If you're unable to attend my consultation hours at the regularly scheduled time, there is a link on Moodle to book a meeting with me.
A number of excellent textbooks introducing data science concepts and methods have been written in the past few years, including a few from faculty in the Smith SDS department. To accompany the topics we will cover each week, I will be selecting my favorite chapters from these books and posting them to Perusall. However, all three books we will engage with in this course cover almost every topic we will address, so feel free to supplement your reading with corresponding chapters in the other books, especially if you find yourself drawn to the teaching and writing style in a certain book. All books are available for free online.
Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton. 2021. Modern Data Science with R. 2nd ed. CRC Press. https://mdsr-book.github.io/mdsr2e/.
Irizarry, Rafael A. 2022. Introduction to Data Science: Data Analysis and Prediction Algorithms with R. https://rafalab.github.io/dsbook/.
Ismay, Chester, and Albert Y. Kim. 2021. Modern Dive: Statistical Inference via Data Science. CRC Press. https://moderndive.com/.
Each week I will also list optional reading and resources in our course schedule that you may reference if you are struggling with a topic or if you wish to explore that topic further. I will update this list often throughout the semester.
This course will be graded via a standards-based assessment system.
This is a 4-credit course with 3 hours per week of in-classroom instruction. Smith expects students to devote 9 out-of-class hours per week to 4-credit classes. I have designed the course assignments and selected the course readings with this target in mind.
I will not be taking attendance in this course, and you do not need to inform me when you will be absent. If you are sick, please stay home. That said, I expect students to be present when possible, and consistent absences may impact your ability to earn full participation credit. If you miss a class, you should contact a peer to discuss what was missed. I won't have the capacity this semester to re-deliver missed material in office hours.
I understand that you will sometimes need to prioritize other things over meeting assignment deadlines (e.g. your health, wellness, families, communities, jobs, other coursework). My late policy attempts to balance flexibility with accountability. There is a 24-hour grace period on all mini-project assignments. There will be no penalties for submitting the mini-project within this 24-hour period, and you do not need to inform me that you intend to take the extra time. You can also request up to a 72-hour extension on any mini-project assignment, as long as you make that request at least 48 hours before the original assignment due date. You can request an extension by filling out the Extension Request form on Moodle, and I will confirm your extension on Slack. Beyond this, late assignments will not be accepted.
However, except under extenuating circumstances, there are no extensions for take-home quizzes. I encourage you to start quizzes early and submit what you have by the deadline.
Smith College expects all students to be honest and committed to the principles of academic and intellectual integrity in their preparation and submission of course work and examinations. Students and faculty at Smith are part of an academic community defined by its commitment to scholarship, which depends on scrupulous and attentive acknowledgement of all sources of information, and honest and respectful use of college resources.
Any cases of dishonesty or plagiarism will be reported to the Academic Honor Board. Examples of dishonesty or plagiarism include:
As the instructor for this course, I am committed to making participation in this course a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion. Examples of unacceptable behavior by participants in this course include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.
As the instructor, I have the right and responsibility to point out and stop behavior that is not aligned with this Code of Conduct. Participants who do not follow the Code of Conduct may be reprimanded for such behavior. Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the instructor.
All students and the instructor are expected to adhere to this Code of Conduct in all settings for this course: seminars, office hours, and over Slack.
This Code of Conduct is adapted from the Contributor Covenant, version 1.0.0, available here.
I hope that we can foster a collaborative and caring environment in this classroom: one that celebrates successes, respects individual strengths and weaknesses, demonstrates compassion for each other's struggles, and affirms diverse identities. Here are some ideas that I have for creating this environment in our course:
Acknowledge colleagues that have been helpful in the #sds-192-appreciation Slack channel.
Ask questions in the #sds-192-questions channel. Help each other out by answering questions when you can.
Using the proper pronouns for our students is foundational to a safe, respectful classroom environment that creates a culture of trust. For information on pronouns and usage, please see the Office of Equity and Inclusion link here: Pronouns
It is my goal for everyone to succeed in this course. If you have personal circumstances that may impact your experience of our classroom, I encourage you to contact the Office of Disability Services in College Hall 104 or at ods@smith.edu. The Office will generate a letter that indicates to me what kind of support you need and how I can make your classroom experience more accommodating. Once you have this letter, you are welcome to visit my office hours or email me to discuss ideas about how we can tailor the course accordingly. While you can request accommodations at any time, the sooner we start this conversation, the better. If you have concerns about the course that are not addressed through ODS, please contact me. At no point will I ask you to divulge details about your personal circumstances to me.
College life is stressful, and life outside of college can be overwhelming. It is my position that attending to your physical and mental health and well-being should be a top priority. I will remind you of this often throughout the semester. I encourage you to schedule a time to talk with me if you are struggling with this course. If you, or anyone you know, is experiencing distress, there are numerous campus resources that can provide support via the Schacht Center. I can point you to these resources at any time throughout the semester.
A trigger is a topic or image that can precipitate an intense emotional response. When common triggering topics are to be covered in this course, I will do my best to provide a trigger warning in advance of the discussion. However, I can't always anticipate triggers. With this in mind, I've set up an anonymous form, available on Moodle, where you can indicate topics for which you would like me to provide a warning.
#general: Course announcements (only I can post)
#sds-192-discussions: Share news articles and relevant opportunities
#sds-192-questions: Ask and answer questions about our course
#sds-192-project-support: Ask the class how to address a problem you are running into as you work on your mini-project
#sds-192-appreciation: Acknowledge colleagues that have been helpful
Smith's Spinelli Center offers SDS drop-in tutoring hours in Sabin-Reed 301 (or via Zoom) on Monday through Thursday from 7 to 9 PM.