Researchers blur faces in seminal ImageNet dataset; object recognition algorithms maintain accuracy
Others claim they conducted similar research in the past, but their work was ignored
March 17, 2021
A version of one of the most important image datasets of all time now blurs every new face in its catalog.
ImageNet was first launched in 2009, featuring thousands of images of objects, people, and scenes, all carefully labeled by hand. Now comprising more than 14 million hand-annotated images scraped from the web, ImageNet is still used by countless AI projects.
Researchers involved in maintaining ImageNet have updated the dataset to blur faces as new images are added, and are calling for the main ImageNet archive to follow suit.
The group decided that, given the growth of AI and mounting concerns over surveillance and bias, it was time to start obscuring faces.
Hundreds of thousands of human faces were blurred, first by using Amazon’s AI service Rekognition to find the areas to be processed, and then by paying Mechanical Turk contract workers to double-check the effect.
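To make the two-step process concrete, here is a minimal sketch of the second half: blurring only the regions where faces were found. It is not the researchers' code. The bounding boxes are hard-coded stand-ins for whatever a face detector such as Rekognition would return, the image is a small grid of grayscale values, and the simple box blur, along with the names `blur_regions`, `Box`, and `radius`, are assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[float]]        # grayscale pixels, stored row by row
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def blur_regions(image: Image, boxes: List[Box], radius: int = 1) -> Image:
    """Return a copy of `image` with a box blur applied inside each box.

    Hypothetical helper: the boxes stand in for face-detector output.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the input untouched
    for top, left, bottom, right in boxes:
        # Clamp the box so a detection that spills past the edge is safe.
        top, left = max(top, 0), max(left, 0)
        bottom, right = min(bottom, height), min(right, width)
        for y in range(top, bottom):
            for x in range(left, right):
                # Average the neighbourhood in the ORIGINAL image,
                # clipped at the image border.
                ys = range(max(y - radius, 0), min(y + radius + 1, height))
                xs = range(max(x - radius, 0), min(x + radius + 1, width))
                window = [image[j][i] for j in ys for i in xs]
                out[y][x] = sum(window) / len(window)
    return out


# A 4x4 toy image with one bright "face" pixel and one bright pixel elsewhere.
toy = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 9.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 5.0],
]
blurred = blur_regions(toy, boxes=[(0, 0, 3, 3)])
```

Pixels outside the boxes are left as they were, which is why objects that sit far from any face should look the same to a classifier after blurring.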
A large number of images featured on ImageNet were not primarily about people, but included faces. For example, an image annotated as people playing volleyball would obviously include faces. The researchers said that AI systems trained on the data would ideally not need to see the faces to be able to learn what volleyball is.
"However, to the best of our knowledge, this has not been thoroughly analyzed," the group said in a newly published research paper. "By benchmarking various deep neural networks on original images and face-blurred images, we report insights about the effects of face blurring."
The results showed that systems could be trained on the "blurred" dataset with a drop in accuracy of less than one percent overall, although the effect was uneven: "Some categories incur significantly larger accuracy drop, including categories with a large fraction of blurred area, and categories whose objects are often close to faces, e.g., mask and harmonica."
The researchers added: “Through extensive experiments, we demonstrate that training on face-blurred does not significantly compromise accuracy on both image classification and downstream tasks, while providing some privacy protection. Therefore, we advocate for face obfuscation to be included in ImageNet and to become a standard step in future dataset creation efforts.”
But the paper itself has been accused of a different kind of erasure. It "has left us disappointed and flummoxed," AI researcher Abeba Birhane said, noting numerous similarities between the research and her own work, published last year in collaboration with UnifyID chief scientist Vinay Prabhu.
"This work signifies a pattern of erasure of Black women's work and lack of responsibility from the computer vision community," she said, adding: "You might ask, why not just contact them in lieu of public grandstanding? Have you bothered to even contact the curators of the ImageNet dataset? We have. Numerous times."