
Day Thirty-Six: Algorithmic Bias

SDS 192: Introduction to Data Science

Lindsay Poirier
Statistical & Data Sciences, Smith College

Spring 2022


What's in a name?


The amazing people working on algorithmic bias

...and many more!


AI Harms

Image from Algorithmic Justice League; Credit: Megan Smith (former Chief Technology Officer of the USA)


Fairness and Disparate Error
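Disparate error means a model's mistakes are not spread evenly across groups: overall performance can look acceptable while, for example, the false positive rate is much higher for one group than another. A minimal sketch with made-up predictions (the groups, labels, and numbers below are illustrative assumptions, not from any real system) shows how to compare false positive rates by group:

    import pandas as pd

    # Made-up outcomes and predictions for two groups; "actual" is the true
    # outcome, "predicted" is what a hypothetical classifier returned.
    df = pd.DataFrame({
        "group":     ["A", "A", "A", "A", "B", "B", "B", "B"],
        "actual":    [0, 0, 1, 1, 0, 0, 1, 1],
        "predicted": [0, 0, 1, 1, 1, 1, 1, 1],
    })

    # False positive rate per group: share predicted 1 among those actually 0.
    fpr = df[df["actual"] == 0].groupby("group")["predicted"].mean()
    print(fpr)  # group A: 0.0, group B: 1.0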


How does this happen?

  • Garbage in, garbage out
  • Proxy discrimination
  • The data bias diversion

Garbage In, Garbage Out...

...refers to instances in which we build algorithms and other automated technologies on unrepresentative data.
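A minimal sketch of the idea, using simulated data (the group sizes and relationships below are assumptions for illustration, not from any real dataset): a model fit on data that barely includes one group can perform badly for exactly that group.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Group A dominates the training data (950 rows); outcome is 1 when the feature is positive.
    x_a = rng.normal(size=(950, 1))
    y_a = (x_a[:, 0] > 0).astype(int)

    # Group B is barely represented (50 rows), and the relationship is reversed.
    x_b = rng.normal(size=(50, 1))
    y_b = (x_b[:, 0] < 0).astype(int)

    # Fit one model on the pooled, unrepresentative data.
    model = LogisticRegression().fit(np.vstack([x_a, x_b]), np.concatenate([y_a, y_b]))

    print("accuracy for group A:", model.score(x_a, y_a))  # high
    print("accuracy for group B:", model.score(x_b, y_b))  # far worse than chance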


Proxy Discrimination

  • Healthcare algorithm designed to determine which patients are in need of extra care
  • Researchers determined that at a given risk score, black patients tended to be much sicker than white patients
  • Algorithm used the amount of money patients had spent on healthcare as an indicator of health risk

What's the problem?

Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. 2019. "Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations." Science 366 (6464): 447–53. https://doi.org/10.1126/science.aax2342.
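A minimal sketch of why spending is a poor proxy for need, with simulated numbers (the access gap, coefficients, and threshold below are assumptions for illustration, not the Obermeyer et al. data): if the same level of sickness translates into less spending for Black patients, then flagging the highest spenders selects Black patients only when they are sicker.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(1)
    n = 1000
    sickness = rng.uniform(0, 10, size=n)            # true (unobserved) health need
    group = rng.choice(["white", "Black"], size=n)

    # Assumed access gap: the same sickness produces less spending for Black patients.
    spending = sickness * np.where(group == "white", 1.0, 0.7) + rng.normal(0, 0.5, size=n)

    df = pd.DataFrame({"group": group, "sickness": sickness, "spending": spending})

    # The proxy "risk score" is spending; flag the top 20% for extra care.
    df["flagged"] = df["spending"] >= df["spending"].quantile(0.80)

    # Among flagged patients, Black patients are sicker on average,
    # and a smaller share of Black patients get flagged at all.
    print(df[df["flagged"]].groupby("group")["sickness"].mean())
    print(df.groupby("group")["flagged"].mean())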



Predicting Child Neglect

  • Allegheny County, PA, Office of Children, Youth and Families designed an algorithm to predict when children are at higher risk of experiencing neglect
  • Aims to address bias in neglect determinations
  • Upon a report, the Allegheny Family Screening Tool (AFST) scores the likelihood (1-20) of neglect
  • 131 predictive indicators based on regression analysis of a data warehouse with over a billion records on past victims of neglect, including:
    • receiving county health or mental health treatment;
    • being reported for drug or alcohol abuse;
    • accessing supplemental nutrition assistance program benefits, cash welfare assistance, or Supplemental Security Income;
    • living in a poor neighborhood;
    • interacting with the juvenile probation system

Proxy Discrimination

  • Assumes that bias happens at the screening stage, when studies show bias at the referral stage
  • There is no actual data on neglect, so the algorithm relies on proxies
  • A quarter of the indicators are also indicators of poverty
    • Data about the use of public services is more widely accessible, so it is included more than data about private services
    • No indicators for private rehab or private mental health counseling
  • The algorithm "oversamples" the poor
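A minimal sketch of how poverty-correlated proxies inflate scores, using toy indicators and a toy additive score (these are illustrative assumptions, not the actual 131 AFST indicators or their weights): a family that relies on public services leaves records that count toward its score, while a family using private equivalents leaves none.

    import pandas as pd

    # Two hypothetical families in otherwise similar circumstances.
    families = pd.DataFrame(
        {
            "snap_benefits":        [1, 0],
            "cash_assistance":      [1, 0],
            "county_mental_health": [1, 0],  # private counseling would leave no record
            "juvenile_probation":   [0, 0],
        },
        index=["family_using_public_services", "family_using_private_services"],
    )

    # Toy score: each recorded indicator adds one point of "risk".
    families["score"] = families.sum(axis=1)
    print(families["score"])  # 3 vs. 0, despite similar underlying situations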

The Data Bias Diversion

  • Belief that it is possible and desirable to do data science in a "neutral" or impartial way
  • Ignores:
    • We can't have datasets without making decisions regarding what counts and how.
    • Data landscapes are already inequitable.
    • Data science tools and methods have racist legacies.
    • "Models are opinions reflected in mathematics." - Cathy O'Neil
