AAAI ICWSM 2024 | Buffalo, NY, USA | June 3rd

Tutorial: Collectivist and Perspectivist Approaches to Studying Online Toxicity

 

Summary: In 1903, a debate took place between two sociologists, Gabriel Tarde and Emile Durkheim, that is said to have meaningfully shaped the development of the discipline. While Tarde argued for what can be called a perspectivist approach to sociology, focused on centering the individual and illuminating their (dynamic) contextual artifacts, Durkheim rejected such ‘psychologism’ and insisted that sociology should focus on higher (system)-level structures and what he called social facts. Tarde lost the debate, and sociology proceeded in a Durkheimian fashion for the century that followed. Sociologist Bruno Latour is often credited, along with others like Gilles Deleuze, with reintroducing Tardean sociology in the early twenty-first century. In 2012, Latour expanded on how digital data, in particular, are well-suited for a Tardean approach to understanding social phenomena (what Latour might call socio-technical phenomena). Despite Tarde’s recent revival in some corners of sociology and anthropology, few have followed up on Latour’s call for a Tardean approach to digital research. This tutorial will discuss the specifics of the debate and will guide participants through hands-on exercises that demonstrate how the two approaches can be applied to analyze the use and spread of toxic language on YouTube, with connections made to the literature on social norms and networks.
 
Duration: 4 hours total (70 min lecture and discussion, two 70 min hands-on exercises, two 15 min breaks)
 
Schedule: We’ll spend the first 70-minute session discussing the debate between Tarde and Durkheim, including the role that the development of statistics and mass media marketing played in the dominance of collectivist approaches to data in the twentieth century. We’ll discuss how digital data differ from the data that social scientists have traditionally collected, and how snowball sampling, in particular, can be used to motivate a Tardean approach to digital data collection, with examples from prior studies that rely on such an approach without recognizing its Tardean roots. We’ll discuss the implications that the two approaches have for the study of social norms, as well as how they suggest different approaches to data collection and analysis.
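
As a rough illustration of the snowball sampling idea mentioned above, the sketch below starts from a seed account and follows links outward for a fixed number of waves. It is not taken from the tutorial materials: the function `snowball_sample`, the `max_waves` parameter, and the toy reply network `TOY_REPLIES` are all hypothetical names chosen for this example, and a real study would replace the toy lookup with calls to an actual data source.

```python
from collections import deque

def snowball_sample(seeds, get_neighbors, max_waves=2):
    """Breadth-first snowball sample.

    Starts from the seed nodes and follows links outward, recording the
    wave (number of steps from a seed) at which each node was first reached.
    """
    sampled = {}
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        node, wave = queue.popleft()
        if node in sampled:
            continue  # already reached in an earlier or equal wave
        sampled[node] = wave
        if wave < max_waves:
            for neighbor in get_neighbors(node):
                if neighbor not in sampled:
                    queue.append((neighbor, wave + 1))
    return sampled

# Hypothetical toy network: who each account has replied to.
TOY_REPLIES = {
    "ana": ["ben", "cho"],
    "ben": ["dev"],
    "cho": ["ana"],
    "dev": ["eli"],
    "eli": [],
}

sample = snowball_sample(["ana"], lambda user: TOY_REPLIES.get(user, []), max_waves=2)
print(sample)
```

The sample is built outward from an individual's own connections rather than drawn from a predefined population, which is one way to read the link between snowball sampling and a perspectivist starting point.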
 
In two 70-minute hands-on sessions, we’ll then apply these ideas to study the spread of toxicity in YouTube comments. The sessions will walk all participants through the process of obtaining a YouTube developer account (and authentication codes) and collecting comment data from YouTube. Python Notebooks will be provided to facilitate data collection and analysis through both collectivist and perspectivist lenses.
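
To make the contrast between the two lenses concrete, here is a minimal sketch under stated assumptions; it is not the code from the provided Notebooks. It assumes comments have already been collected and scored for toxicity, and it uses made-up records in `COMMENTS`. The functions `collectivist_view` and `perspectivist_view` are hypothetical names, and each is only one simple way to operationalize its lens: an aggregate per video versus an ordered trajectory per commenter.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (video_id, author, posting order, toxicity score in [0, 1]).
# The scores are invented for illustration, not produced by any real classifier.
COMMENTS = [
    ("v1", "ana", 1, 0.10),
    ("v1", "ben", 2, 0.80),
    ("v1", "ana", 3, 0.60),
    ("v2", "ben", 4, 0.90),
    ("v2", "cho", 5, 0.20),
]

def collectivist_view(comments):
    """Aggregate lens: mean toxicity per video, ignoring who wrote what."""
    by_video = defaultdict(list)
    for video, _author, _order, toxicity in comments:
        by_video[video].append(toxicity)
    return {video: mean(scores) for video, scores in by_video.items()}

def perspectivist_view(comments):
    """Individual lens: each author's comments in posting order."""
    by_author = defaultdict(list)
    for video, author, _order, toxicity in sorted(comments, key=lambda c: c[2]):
        by_author[author].append((video, toxicity))
    return dict(by_author)

video_means = collectivist_view(COMMENTS)
trajectories = perspectivist_view(COMMENTS)
print(video_means)
print(trajectories)
```

In this toy data, the per-video averages hide the fact that one commenter's score rises after a highly toxic comment appears in the same thread, which is the kind of pattern the per-author view keeps visible.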
 
Target audience, prerequisites, and outcomes: The tutorial will be accessible to any ambitious participant who is interested in new theoretical approaches to digital data collection and analysis. Both technically minded and socially minded scholars are encouraged to participate. There are no required prerequisites, and all material will be taught from the ground up. However, some familiarity with Python will be useful for following the Notebooks that will be provided. By the end of the tutorial, participants will understand the differences between perspectivist and collectivist approaches to data; will have a YouTube developer account set up and know how to use authentication to collect YouTube data; and will understand how to apply collectivist and perspectivist approaches to study the use and spread of toxicity in YouTube comments.
 
Materials: The materials provided will include a set of slides that will be used to discuss the Tarde-Durkheim debate and differences between the perspectivist and collectivist approaches; a document outlining how to sign up for a YouTube developer account and obtain authentication codes; and Python Notebooks that collect and analyze YouTube comment data, including the labeling of comments for toxicity and other sentiment measures.