So there's a whole sub-field of machine learning that studies how to correctly perform data wrangling, sampling techniques, and sample weighting in order to remove spurious underlying trends and equitably sample the data.

So the machine-learning algorithms that replicate racism and classism in social media, computer vision, criminal justice, and the financial industry could be fixed by any competent data scientist. The people who made these algorithms are just bad at machine learning.

@ash Soon a "competent data scientist" will become as useful as a lawyer, helping less knowledgeable folks through the maze of algorithms.

Some of the checks and balances IRL on profiling were implementation-based, as in nobody could collect enough data on a large enough population. ML has been a force multiplier, removing some of these checks. I do not think it is just a question of competence.

@alephnull There are a lot of techniques in both machine learning and classical statistics to upsample sparsely represented subpopulations in datasets, characterize latent variables that shouldn't influence the output of the algorithm, and minimize the negative effect of inadequate numbers of observations. There's good evidence that these techniques have been neglected in machine-learning algorithms that materially affect the well-being of millions of people.
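
To make the upsampling idea concrete, here is a minimal sketch of random oversampling, one of the simplest techniques of this kind. It is an illustration only: the function name `upsample_minority`, the `group` field, and the toy rows are all hypothetical, and real pipelines usually need more care (for example, resampling only the training split).

```python
import random
from collections import Counter, defaultdict


def upsample_minority(rows, group_key, seed=0):
    """Randomly duplicate rows from smaller groups until every group
    has as many rows as the largest one (sampling with replacement)."""
    rng = random.Random(seed)

    # Bucket the rows by the value of the grouping attribute.
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)

    # Every group is brought up to the size of the largest group.
    target = max(len(members) for members in by_group.values())

    balanced = []
    for members in by_group.values():
        balanced.extend(members)
        # Draw the missing rows at random from the group's own members.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical toy data: group "a" has 8 rows, group "b" only 2.
rows = [{"group": "a", "x": i} for i in range(8)]
rows += [{"group": "b", "x": i} for i in range(2)]

balanced = upsample_minority(rows, "group")
counts = Counter(row["group"] for row in balanced)
```

After resampling, both groups contribute the same number of rows. Duplicating rows adds no new information about the smaller group, which is part of why choosing how far to compensate remains a judgment call.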


Agreed that there are techniques to "compensate" for any inadequacies that one can think of. The devil is in the details: where to draw the line. The tables of contents of peer-reviewed social science journals do not suggest that there is any sort of consensus on this. Rebutting the behavioural supremacists is hard enough.
