Replicating and auditing black-box language models
If you have a question about this talk, please contact Panagiotis Fytas.
Advances in large language models have brought exciting new capabilities, but the commercialization of this technology has led to a growing loss of transparency. State-of-the-art language models effectively operate as black boxes, with little known about their training algorithms, data annotators, or pretraining data. I will cover a trio of recent works from my group that help us understand each of these components: replicating the RLHF training process (AlpacaFarm), probing LMs to identify whose opinions are reflected in pretraining and RLHF data (OpinionQA), and providing provable guarantees for detecting test set contamination in black-box language models.
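As a rough illustration of the contamination-testing idea, the sketch below frames it as a permutation test: if a model never saw a test set during training, it should be indifferent to the order of that set's examples, so the canonical ordering's likelihood should not stand out among random shuffles. This is a minimal sketch only; the `logprob_fn` helper is a hypothetical stand-in for scoring a concatenated sequence of examples under the model, and the actual work uses more refined statistics than this simple version.

```python
import random

def contamination_p_value(logprob_fn, examples, num_permutations=99, seed=0):
    """Permutation test for test-set contamination.

    If `examples` were never seen in training, the model should be
    indifferent to their ordering (exchangeability), so the canonical
    order's log-likelihood should not be unusually high.

    logprob_fn: hypothetical helper returning the model's log-likelihood
        of the examples concatenated in the given order.
    """
    rng = random.Random(seed)
    canonical = logprob_fn(examples)

    # Count random orderings whose likelihood matches or beats the
    # canonical (published) ordering.
    exceed = 0
    for _ in range(num_permutations):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        if logprob_fn(shuffled) >= canonical:
            exceed += 1

    # Standard permutation-test p-value: small values mean the model
    # "prefers" the published ordering, i.e., evidence of contamination.
    return (exceed + 1) / (num_permutations + 1)
```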
This talk is part of the Language Technology Lab Seminars series.