Discussion on Causal Representation Learning with Generative Artificial Intelligence: Application to Texts as Treatments
If you have a question about this talk, please contact Martina Scauda.

See the preprint by Kosuke Imai and Kentaro Nakamura at https://arxiv.org/abs/2410.00903

In this paper, we demonstrate how to enhance the validity of causal inference with unstructured, high-dimensional treatments such as texts by leveraging the power of generative artificial intelligence. Specifically, we propose to use a deep generative model, such as a large language model (LLM), to efficiently generate treatments and use its internal representation for subsequent causal effect estimation. We show that knowledge of this true internal representation helps disentangle the treatment features of interest, such as specific sentiments or certain topics, from other possibly unknown confounding features. Unlike existing methods, the proposed approach eliminates the need to learn a causal representation from the data and hence produces more accurate and efficient estimates. We formally establish the conditions required for nonparametric identification of the average treatment effect, propose an estimation strategy that avoids violation of the overlap assumption, and derive the asymptotic properties of the proposed estimator through the application of double machine learning. Finally, using an instrumental variables approach, we extend the proposed methodology to settings in which the treatment feature is based on human perception rather than being assumed fixed given the treatment object. The proposed methodology is also applicable to text reuse, where an LLM is used to regenerate existing texts. We conduct simulation and empirical studies, using text data generated by an open-source LLM, Llama 3, to illustrate the advantages of our estimator over state-of-the-art causal representation learning algorithms.

This talk is part of the Causal Inference Reading Group series.
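The abstract describes extracting an LLM's internal representation of each generated text and then estimating the average treatment effect of a treatment feature (e.g., a sentiment) with double machine learning, while guarding against overlap violations. The snippet below is a minimal, illustrative sketch of that general estimation idea, not the authors' implementation: it assumes the hidden representations `R`, a binary treatment feature `T`, and outcomes `Y` are already available as arrays (placeholders are simulated here), and uses a generic cross-fitted doubly robust (AIPW) estimator with propensity clipping.

```python
# Illustrative sketch only -- not the estimator from Imai & Nakamura (2024).
# Assumes: R (n x d) internal LLM representations of the generated texts,
#          T (n,)   binary treatment feature (e.g., positive sentiment present),
#          Y (n,)   observed outcomes.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, d = 500, 16
R = rng.normal(size=(n, d))                      # placeholder for LLM hidden states
T = rng.binomial(1, 1 / (1 + np.exp(-R[:, 0])))  # placeholder treatment feature
Y = 2.0 * T + R[:, 1] + rng.normal(size=n)       # placeholder outcome (true ATE = 2)

def cross_fitted_aipw(R, T, Y, n_splits=5, clip=0.05):
    """Cross-fitted doubly robust (AIPW) estimate of the ATE,
    using the representation R as the confounding-adjustment covariates."""
    psi = np.zeros(len(Y))
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(R):
        # Nuisance models are fit on the training fold only (cross-fitting).
        prop = LogisticRegression(max_iter=1000).fit(R[train], T[train])
        mu1 = GradientBoostingRegressor().fit(R[train][T[train] == 1], Y[train][T[train] == 1])
        mu0 = GradientBoostingRegressor().fit(R[train][T[train] == 0], Y[train][T[train] == 0])
        # Clipping the estimated propensity score is a simple overlap safeguard.
        e = np.clip(prop.predict_proba(R[test])[:, 1], clip, 1 - clip)
        m1, m0 = mu1.predict(R[test]), mu0.predict(R[test])
        psi[test] = (m1 - m0
                     + T[test] * (Y[test] - m1) / e
                     - (1 - T[test]) * (Y[test] - m0) / (1 - e))
    ate = psi.mean()
    se = psi.std(ddof=1) / np.sqrt(len(psi))
    return ate, se

ate, se = cross_fitted_aipw(R, T, Y)
print(f"ATE estimate: {ate:.2f} (SE {se:.2f})")
```

In the paper's setting, `R` would come from the generative model that produced the texts rather than being learned from the data, which is what removes the causal representation learning step.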