Large language models for enabling constructive online conversations
If you have a question about this talk, please contact Panagiotis Fytas.

NLP systems promise to disrupt society through applications in high-stakes social domains. However, current evaluation and development focus on tasks that are not grounded in specific societal implications, which can lead to societal harm. There is a need to evaluate and mitigate these harms and, in doing so, bridge the gap between the realities of application and how models are currently developed. In this talk, I will present recent work addressing these issues in the domain of online content moderation.

In the first part, I will discuss online content moderation to enable constructive conversations about race. Content moderation practices on social media risk silencing the voices of historically marginalized groups. We find that both the most recent models and humans disproportionately flag posts in which users share personal experiences of racism. Not only does this censorship hinder the potential of social media to give voice to marginalized communities, but we also find that witnessing such censorship exacerbates feelings of isolation. We offer a path to reduce censorship through a psychologically informed reframing of moderation guidelines. These findings reveal how automated content moderation practices can help or hinder the effort to give marginalized communities a voice in an increasingly diverse nation where online interactions are commonplace.

In the second part, I will discuss how identified biases in models can be traced to the use-mention distinction: the difference between the use of words to convey a speaker’s intent and the mention of words to quote what someone said or point out properties of a word. Computationally modeling the use-mention distinction is crucial for enabling counterspeech to hate and misinformation, because counterspeech that refutes problematic content mentions harmful language but is not harmful itself. We show that even recent language models fail at distinguishing use from mention and that this failure propagates to downstream tasks. We introduce prompting mitigations that teach the use-mention distinction and show that they reduce these errors.

Finally, I will discuss the big picture and other recent efforts to address these issues in domains beyond content moderation, including education, emotional support, and public discourse about AI. I will reflect on how, by doing so, we can minimize harms and develop and apply NLP systems for social good.
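As a rough illustration of the kind of prompting mitigation described above, the sketch below shows how a moderation prompt might teach a model the use-mention distinction before it labels a post. The instruction wording, the few-shot examples, and the function names are assumptions for illustration only, not the speaker's actual prompts or code.

```python
# Minimal sketch (assumed, not the speaker's method) of a prompting mitigation
# that teaches the use-mention distinction before a moderation decision.

# Hypothetical instruction: ask the model to decide whether harmful language
# is USED (directed by the author) or merely MENTIONED (quoted, reported,
# or refuted) before assigning a "harmful" label.
USE_MENTION_INSTRUCTION = (
    "Before labelling the post, decide whether any harmful language is USED "
    "(the author directs it at someone) or MENTIONED (the author quotes, "
    "reports, or refutes it). Only language that is USED should count toward "
    "a 'harmful' label."
)

# Hypothetical few-shot examples pairing posts with a use/mention analysis.
FEW_SHOT_EXAMPLES = [
    ("She told me 'go back to your country' at work today.",
     "mention -> not harmful"),
    ("Go back to your country.",
     "use -> harmful"),
]

def build_moderation_prompt(post: str) -> str:
    """Assemble the instruction and few-shot examples into a single prompt
    that could be sent to any instruction-following language model."""
    examples = "\n\n".join(f"Post: {p}\nAnalysis: {a}" for p, a in FEW_SHOT_EXAMPLES)
    return f"{USE_MENTION_INSTRUCTION}\n\n{examples}\n\nPost: {post}\nAnalysis:"

if __name__ == "__main__":
    # Example: counterspeech that mentions a slur-like insult while reporting it.
    print(build_moderation_prompt(
        "He said 'you people don't belong here' and I reported the comment."
    ))
```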
This talk is part of the Language Technology Lab Seminars series.