Eric Chu

About

I'm currently a senior research scientist at Google, working on foundational models, Bard, and more.

I did my PhD at MIT, where I was advised by Deb Roy in the Media Lab and Jacob Andreas in CSAIL. During my PhD, I spent time at Facebook AI Research with Jason Weston and Stephen Roller, and Google Brain with Peter J. Liu. Before that, I spent one year as a data scientist at Facebook, working on moderation and human-AI interaction. I graduated from UC Berkeley, where I did some research on topological data analysis, bioinformatics, and biomedical imaging.

I work on both capabilities and safety/alignment research in AI. I often use real-world applications to motivate advances in methods and in our understanding of how neural networks work. My ultimate goal is to aid human-AI collaboration at the individual and societal level. My work has sometimes intersected with computational social science, human-AI interaction, and cognitive science.

Latest News

May 2023: PaLM 2 Technical Report is out.
May 2023: Paper "Language models trained on media diets can predict public opinion" is cited in a US Congressional hearing on AI. Timestamped video and Preprint.
Spring 2023 and Google I/O: Bard, Bard coding, DuetAI, AI-powered Colab, Search CodeTips are launched.
May 2022: I joined (Google) X, the Moonshot Factory.
Oct 2021: Paper on "Evolving Evocative 2D Views of Generated 3D Objects" accepted to NeurIPS creativity and design workshop. Link.
Sept 2021: Co-leading session on "Designing New Datasets for Social Reasoning & Theory of Mind" in MIT Meaning Representation workshop. Joint program across computer science, linguistics, brain and cognitive science departments. Pre-workshop notes and slides.
July 2021: Spatial and logical reasoning task accepted to BIG-bench, a benchmark for probing capabilities of large language models. Task and discussion. The task has been repeatedly highlighted in later scaling papers as a difficult one.

Research Interests

Most recently, I've been focused on alignment, learning from human feedback, code LLMs, and planning (tool use, adaptive computation, memory).

*** The following was written in 2021. ***

With the goal of making machine learning more capable, flexible, and deployable, I have experience and interest in:

Generative sequential models: summarization, dialogue generation, language-guided sketch generation, text style transfer, text simplification, speech synthesis
Adaptability: compositional models, continual learning
Social reasoning and theory of mind: personas, beliefs, pragmatics
Robustness and human-AI interaction: uncertainty estimation, quality of explanations, experimental design

My work is sometimes motivated by the way humans communicate, such as the use of stories and pragmatic implicature, the nature of mass media, social influences on belief formation, and cognitive biases broadly. I started my undergrad as a bioengineering major, and my first stints in research were related to biomedical imaging and protein structure prediction — I remain greatly interested in the potential for machine learning to advance these fields. Finally, I sometimes explore the use of computational tools for art and creative purposes.

Projects

As of Fall 2021, some of my active projects were centered around:

reasoning steps in large language models
the effect of model explanations in human collaboration and certification

I believe PhD students and the research community can benefit from more transparency into the non-linear path of research (both in individual projects and in career trajectories), and from more discussion of negative results. I'm a fan of efforts such as I Can't Believe It's Not Better, negative results in NLP, and ML retrospectives. In that spirit, some smaller, exploratory, incomplete, or negative-results projects I worked on during my PhD include: (1) hierarchical natural language plans to improve generative models, applied to SketchRNN, (2) disparate impact on minority communities of classifiers that distinguish between human-written and machine-generated text, (3) detecting the provenance of toxic generations in language models, (4) incorporating symbolic rules in neural text simplification, (5) controllable speech synthesis to toggle between natural and tutoring speaking modes, and (6) semi-supervised satellite image segmentation for resource allocation. Several of these projects are described in more detail below.

Language models trained on media diets can predict public opinion
Media is important in shaping people’s beliefs and behaviors, but (a) traditional surveys for measuring public opinion are expensive, and (b) the tools for measuring media effects are limited. We develop a new approach to simultaneously solve these issues, based on probing language models. We validate our approach against ground-truth surveys in COVID-19 and other settings. More results and paper to come.

MeanSum, a neural model for unsupervised, multi-document abstractive summarization
Abstractive summarization models have typically required large, paired datasets. However, collecting these datasets is expensive, and existing datasets are typically news-related, which limits the transferability of trained models to different domains. We consider the setting where there are only documents, and introduce the first end-to-end model for unsupervised, multi-document, abstractive summarization. During training, the summary is a discrete, latent variable that we optimize using the Gumbel-softmax trick. Our model performs at least as well as extractive baselines when tested on several review datasets.
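
As a rough illustration of the Gumbel-softmax trick mentioned above (not the MeanSum code itself), the sketch below draws a relaxed one-hot sample over a small set of categories in plain Python. The function name, the temperature value, and the example logits are assumptions made for illustration only.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample (a list of probabilities summing to 1)."""
    noisy = []
    for logit in logits:
        # Clamp u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    exps = [math.exp(x - top) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical logits over a three-word vocabulary.
    print(gumbel_softmax_sample([2.0, 0.0, 0.0], temperature=0.5))
```

Lower temperatures push the sample closer to a one-hot vector, while higher temperatures give smoother outputs; in a differentiable framework the same computation lets gradients flow through the sampling step.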

External memory networks for learning personas and domain adaptation
The ability to infer personas is useful in tasks ranging from personalized dialogue generation to computational narrative analysis. In our first work, "Learning Personas from Dialogue with Attentive Memory Networks", we introduce a persona prediction task where character tropes are paired with dialogues. We find that the use of an external "knowledge store" memory module, initialized with descriptions of character tropes, outperforms existing attention and read-write memory models. In a second, follow-up work -- DAPPER -- we extend the use of a knowledge-store memory and find that it improves performance across multiple persona domains, and is useful in downstream tasks like hate-speech detection.

Emotional arcs, outlined by audio-visual sentiment models, are significant predictors of engagement
Storytellers and narrative theories have long discussed the importance of the "emotional arc" in determining whether a story resonates with its audience. Concurrent work looked at extracting emotional arcs from text-based stories, but we instead create arcs for movies. We develop neural network-based visual and audio models, trained on both new and existing large-scale datasets, to create separate audio/visual arcs. Crowdsourced annotations of 30-second clips, as well as uncertainty measures of the audio-visual predictions, are used to create a combined arc. We then cluster the arcs by introducing a new method based on k-medoids and dynamic time warping, which fixes several shortcomings of previous work. Finally, we show on a corpus of 1400 short web videos that certain clusters of arcs are significant predictors of likes and comments.
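
To make the two ingredients named above concrete, here is a minimal sketch of a textbook dynamic time warping (DTW) distance between two 1-D arcs, plus the medoid-selection step used inside k-medoids. It is a generic illustration, not the clustering method from the paper; the function names and the toy arcs are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also aligned with an earlier element of b
                cost[i][j - 1],      # b[j-1] also aligned with an earlier element of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def medoid(arcs):
    """Index of the arc with the smallest total DTW distance to all arcs."""
    totals = [sum(dtw_distance(a, b) for b in arcs) for a in arcs]
    return totals.index(min(totals))


if __name__ == "__main__":
    # Toy arcs of different lengths; DTW can compare them directly.
    arcs = [[0, 1, 2], [0, 1, 1, 2], [5, 5, 5]]
    print(dtw_distance(arcs[0], arcs[1]), medoid(arcs))
```

Because a medoid is always an actual member of the cluster, k-medoids only needs pairwise distances, which is what makes it a natural fit for a non-Euclidean measure such as DTW.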

Pollster

Probing neural language models of media consumption to predict public opinion.

MeanSum

A neural model for unsupervised, multi-document abstractive summarization.

External Memory Networks for Learning Personas

Introduced a character trope prediction task based on dialogue snippets.

Emotional arcs in movies

Audio-visual sentiment models are used to produce emotional arcs, which we find can be significant predictors of engagement.

Games for Fairness and Interpretability

A games-with-a-purpose (GWAP) framework for generating adversarial training data and probing models for bias and faulty behavior.

DAPPER

Domain-adapted persona representations with external memory

School Reviews

Millions of parents post school reviews on GreatSchools.org, which in turn influence housing and school choice. We use gradient-based interpretability methods to identify features of schools that correlate with different measures of quality.

MAS.S10 - AI & Equality course

Co-organized a course on the ethical implications of AI, specifically how AI can both promote and impede equality in various domains (e.g. healthcare, law enforcement, labor markets).

Improving generative models with hierarchical plans

Given a dataset of (sub)-instruction, trajectory pairs, we parse trajectories into a tree of subtasks by automatically assigning a natural language description to each subtask. Beyond improving the accuracy of instruction-following models, the proposed approach can be adapted for natural language generation, few-shot imitation learning, and structured exploration.

Adaptive Speech Synthesis for Children

Tutors modulate their voice when teaching and reading to children. We collected a . A LSTM-based grapheme-to-phoneme model, a

Satellite Image Segmentation

Motivated by the scarcity of labeled satellite image data, we test a semi-supervised approach by first predicting labels at the superpixel level, and then performing label propagation to individual pixels. The ultimate goal was to enable targeted distribution of technologies, goods, and medical aid in India and Sub-Saharan Africa.

Protein Substrate Classification

A-domains determine amino acid sequences and hence protein structure. Given a sequence of A-domains, our SVM-based classifier uses a variety of sequence-, UniProt-, and alignment-based features to predict which substrate the resulting protein will bind to.

Evolved Views of Generated Surfaces

Deep generative models of 3D objects are relatively underexplored in the AI for art community. Motivated by anamorphic art, which changes or only becomes recognizable at certain viewing angles, we present a genetic-algorithm-based method for jointly generating 3D models of objects and 2D renders at different viewing angles, with the process guided by ImageNet- and CLIP-based models.

Artistic Influence GAN

Uses WikiArt data on the artistic influence graph to generate novel paintings with a GAN conditioned on selected artists.

Pablo West

Tool for automatically juxtaposing lyrics over paintings based on semantic and stylistic content, inspired by this.

Human Atlas

Tool for crowdsourcing the publicly knowable graph of connections, for use in complex organizations.

Topological Sculptures

GUI tool for creating interactive 3D-printable surfaces.

Publications

Most recent publications on Google Scholar.

Neural Language Models of Media Consumption can Predict Public Opinion

Eric Chu, Jacob Andreas, Steve Ansolabehere, Deb Roy

In submission to Nature Human Behaviour.

Paper

PaLM 2: Technical Report

Paper

Are Visual Explanations Useful? A Case Study in Model-in-the-loop Prediction

Eric Chu, Deb Roy, Jacob Andreas

Preprint.

Paper

Games for Fairness and Interpretability

Eric Chu^‡, Nabeel Gillani^‡, Sneha Priscilla Makini

WWW'20: Proceedings of The Web Conference, Workshop on Data Science for Social Good

ICLR'20: International Conference on Learning Representations, Towards Trustworthy ML Workshop

Paper

MeanSum: A Neural Model for Unsupervised Multi-Document Abstractive Summarization

Eric Chu^‡, Peter J. Liu^‡

ICML'19: International Conference on Machine Learning

Paper Slides Poster Code Data

Learning Personas from Dialogue with Attentive Memory Networks

Eric Chu^‡, Prashanth Vijayaraghavan^‡, Deb Roy

EMNLP'18: Empirical Methods in Natural Language Processing

Paper Data

Audio-visual Sentiment Analysis for Learning Emotional Arcs in Movies

Eric Chu, Deb Roy

ICDM'17: International Conference on Data Mining

ICCV'17: International Conference on Computer Vision, Large Scale Movie Description Challenge

Paper Slides Data

Evolving Evocative 2D Views of Generated 3D Objects

Eric Chu

NeurIPS'21: Neural Information Processing Systems, ML for Creativity and Design Workshop

Paper

Parents’ Online School Reviews Reflect Several Racial and Socioeconomic Disparities in K–12 Education

Nabeel Gillani, Eric Chu, Doug Beeferman, Rebecca Eynon, Deb Roy

AERA Open '21: American Educational Research Association Journal.

Paper

DAPPER: Learning Domain-Adapted Persona Representation Using Pretrained BERT and External Memory

Prashanth Vijayaraghavan, Eric Chu, Deb Roy

AACL'20: Asia-Pacific Chapter of the Association for Computational Linguistics

Paper Slides Data

Artistic Influence GAN

Eric Chu

NeurIPS'18: Neural Information Processing Systems, ML for Creativity and Design Workshop

Paper Poster

Human Atlas: A Tool for Mapping Social Networks

Martin Saveski, Eric Chu, Soroush Vosoughi, Deb Roy

WWW'16: International Conference on the World Wide Web. 2016. (Demo)

Paper Video

CV

Full Resume in PDF.

Google, 2022 -
Senior Research Scientist

MIT, 2015 - 2021
Ph.D. Student
Media Lab, Laboratory for Social Machines

Facebook AI Research (FAIR), Summer 2019
Research Intern

Google Brain, Summer and Fall 2018
Research Intern

Facebook, 2015
Data Scientist
Ads Integrity Team

UC Berkeley, 2010 - 2014
Undergraduate student
Electrical Engineering & Computer Science

Facebook, Summer 2014
Data Scientist, Intern
Groups Product

Knewton, Summer 2013
Software Engineer, Intern
Ed-tech Platform team

Oxford University, Fall 2013
Visiting Student
Mathematics and Bioinformatics

This Jekyll template comes from Martin Saveski.