“Damn racist sinks!” giggled friends TJ Fitzpatrick and Larry at the end of a YouTube video that went viral back in 2017. The sink in question belonged to Atlanta’s Marriott Hotel, which was hosting a sci-fi event when, on a toilet break, TJ and Larry realised the soap dispenser’s infrared sensor refused to dispense soap to black hands. “I tried all the soap dispensers in the restroom, there were maybe 10, and none of them worked,” TJ recalls. “Any time I went into that restroom, I had to have my friend get the soap for me…”
The two friends didn’t see this as anything too serious, just one of those odd kinks in the otherwise taut rope of technological progress. And, for the millions of people who shared the video on meme pages and message boards, TJ’s “racist sink” was an absurd joke. But without knowing it, TJ and Larry had exemplified a pernicious question at the heart of machine learning and tomorrow’s society: the question of the dataset. It is a question that may turn out to have life-and-death implications, and it haunts everyone from programmers to artists to executives.
The dataset is a vital component in “machine learning”: the process of teaching a programme to recognise, isolate or even invent features of our world. This encompasses a huge range of actual and potential technologies – from self-driving cars to electronic passport gates – and will become ubiquitous as technology grows ever more integrated into the minutiae of our daily lives. One machine-learning technique is an ingenious process called a GAN (generative adversarial network), which you’ll have to understand, however dimly.