Portrait of Isak Engdahl. Photo: Jakob Roséen.

Isak Engdahl

Doctoral student

Agreements ‘in the wild’: Standards and alignment in machine learning benchmark dataset construction

Author

  • Isak Engdahl

Summary, in English

This article presents an ethnographic case study of a corporate-academic group constructing a benchmark dataset of daily activities for a variety of machine learning and computer vision tasks. Using a socio-technical perspective, the article conceptualizes the dataset as a knowledge object that is stabilized by both practical standards (for daily activities, datafication, annotation and benchmarks) and alignment work – that is, efforts including forging agreements to make these standards effective in practice. By attending to alignment work, the article highlights the informal, communicative and supportive efforts that underlie the success of standards and the smoothing of tensions between actors and factors. Emphasizing these efforts constitutes a contribution in several ways. This article's ethnographic mode of analysis challenges and supplements quantitative metrics on datasets. It advances the field of dataset analysis by offering a detailed empirical examination of the development of a new benchmark dataset as a collective accomplishment. By showing the importance of alignment efforts and their close ties to standards and their limitations, it adds to our understanding of how machine learning datasets are built. And, most importantly, it calls into question a key characterization of the dataset: that it captures unscripted activities occurring naturally ‘in the wild’, as alignment work bleeds into moments of data capture.

Department/s

  • Sociology

Publication date

2024-04-01

Language

English

Publication/Journal/Series

Big Data and Society

Volume

11

Issue

2

Document type

Journal article

Publisher

SAGE Publications

Subject

  • Information Systems, Social aspects

Keywords

  • alignment work
  • benchmark
  • dataset analysis
  • Ethnography of machine learning
  • in-the-wild
  • standards

Status

Published

ISBN/ISSN/Other

  • ISSN: 2053-9517