Implicit Chain-of-Thought: Internalizing Reasoning in Language Models
If you have a question about this talk, please contact Lucas Resck.

Abstract: When leveraging language models for reasoning tasks, generating explicit chain-of-thought (CoT) steps is often crucial for high accuracy. In this work, drawing inspiration from how the human brain transitions from explicit, conscious, deliberate reasoning (System 2) to implicit, automatic, intuitive thinking (System 1), we seek to internalize explicit CoT reasoning within a model that directly produces the final answer, which we define as the implicit CoT paradigm. To realize implicit CoT, we found a simple yet effective method: starting with a model trained for explicit CoT reasoning, we gradually remove the intermediate steps and finetune the model. This approach enables a finetuned GPT-2 Small model to solve 20-by-20 multiplication with up to 99.5% accuracy, whereas standard training cannot solve beyond 4-by-4 multiplication. You can try our demo at https://huggingface.co/spaces/yuntian-deng/gpt2-multiplication

Bio: Yuntian Deng is an assistant professor at the University of Waterloo and a visiting professor at NVIDIA under Prof. Yejin Choi. He was previously a postdoc at AI2, also advised by Prof. Choi. He received his PhD from Harvard University under Prof. Alexander Rush and Prof. Stuart Shieber. His recent works include NeuralOS, Interactive Training, WildChat, and Implicit Chain-of-Thought.

This talk is part of the Language Technology Lab Seminars series.
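As an illustration of the "gradually remove the intermediate steps" idea in the abstract, here is a minimal Python sketch of how training sequences for successive finetuning stages might be constructed. It is not the speaker's implementation. The function names (`build_example`, `removal_schedule`), the choice to drop CoT tokens from the front, the fixed `tokens_per_stage` step, and the toy token format are all assumptions made for this example; the abstract does not specify any of them.

```python
from typing import List


def build_example(question: List[str], cot: List[str],
                  answer: List[str], n_removed: int) -> List[str]:
    """Return one training sequence with the first n_removed CoT tokens dropped.

    Assumption: tokens are removed from the start of the chain of thought.
    """
    n_removed = max(0, min(n_removed, len(cot)))  # clamp to a valid range
    return question + cot[n_removed:] + answer


def removal_schedule(cot_len: int, tokens_per_stage: int) -> List[int]:
    """Number of CoT tokens removed at each stage, ending with all removed.

    Assumption: a fixed number of additional tokens is removed per stage.
    """
    schedule = list(range(0, cot_len, tokens_per_stage))
    schedule.append(cot_len)  # final stage: no explicit CoT remains
    return schedule


# Toy example: 12 * 34 = 48 + 360 = 408 (hypothetical token format).
question = ["12", "*", "34", "="]
cot = ["48", "+", "360", "="]
answer = ["408"]

for n_removed in removal_schedule(len(cot), tokens_per_stage=2):
    # In a real setup, the model would be finetuned on sequences built
    # this way before moving on to the next stage.
    print(n_removed, build_example(question, cot, answer, n_removed))
```

In the last stage the sequence contains only the question and the answer, which corresponds to the implicit CoT setting described in the abstract, where the model produces the final answer directly.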