Simon Couch

Fair machine learning

Regular talk, 10:25-10:40

In recent years, high-profile analyses have called attention to many contexts where the use of machine learning deepened inequities in our communities. A machine learning model led to wealthy homeowners being taxed at a significantly lower rate than poorer homeowners; a model used in criminal sentencing predicted that black defendants would commit a crime in the future at a disproportionately higher rate than white defendants; a recruiting and hiring model penalized feminine-coded words, like the names of historically women's colleges, when evaluating résumés. In late 2022, a group of Posit employees across teams, roles, and technical backgrounds formed a reading group to engage with the literature on machine learning fairness, a research field that aims to define what it means for a statistical model to act unfairly and to take measures to address that unfairness. We then designed functionality and resources to help data scientists measure and critique the ways in which the machine learning models they've built might disparately impact the people affected by those models. This talk will introduce the research field of machine learning fairness and demonstrate a fairness-oriented analysis of a model with tidymodels, a framework for machine learning in R.

Pronouns: he/him

Chicago, IL

Simon Couch is a software engineer at Posit PBC (formerly RStudio), where he works on open-source statistical software. With an academic background in statistics and sociology, Simon believes that principled tooling has a profound impact on our ability to think rigorously about data. He authors and maintains a number of R packages and blogs about the process at simonpcouch.com.