BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Neural Code Comprehension: A Learnable Representation of Code Sema
 ntics - Tal Ben-Nun\, ETH Zurich
DTSTART:20190228T130000Z
DTEND:20190228T140000Z
UID:TALK120598@talks.cam.ac.uk
CONTACT:Microsoft Research Cambridge Talks Admins
DESCRIPTION:In the era of “Big Code”\, research is being conducted int
 o automating the understanding of computer programs. Most current work
  builds on techniques from Natural Language Processing and Deep Learning
 \, which have been successful recently\, attempting to process the code di
 rectly or using syntactic representations (e.g.\, ASTs and AST paths). How
 ever\, to comprehend program semantics robustly\, structural features of c
 ode have to be taken into account as well\, including function calls\, bra
 nching\, and interchangeable order of statements. In this talk\, I will pr
 esent a novel processing technique to use Machine Learning for code semant
 ics\, and show how it applies to a variety of program analysis tasks. In p
 articular\, we stipulate that a robust distributional hypothesis of code a
 pplies to both human- and machine-generated programs. Following this hypot
 hesis\, we define an embedding space\, inst2vec\, based on an Intermediate
  Representation (IR) of the code that is independent of the source program
 ming language. We provide a novel definition of contextual flow for this I
 R\, leveraging both the underlying data- and control-flow of the program. 
 We then analyze the embeddings quantitatively using analogies and clusteri
 ng\, and evaluate the learned representation on three different high-level
  tasks. We show that even without fine-tuning\, a single Recurrent Neural 
 Network (RNN) architecture and fixed inst2vec embeddings outperform specia
 lized approaches for performance prediction (compute device mapping\, opti
 mal thread coarsening)\; and algorithm classification from raw code (104 c
 lasses)\, where we set a new state-of-the-art.
LOCATION:Auditorium\, Microsoft Research Ltd\, 21 Station Road\, Cambridge
 \, CB1 2FB
END:VEVENT
END:VCALENDAR
