Blue Hours, Seattle. 2022
Google Scholar / Semantic Scholar / GitHub / Twitter / LinkedIn / Instagram / CV / Blog
Yao Fu 符尧. yao.fu@ed.ac.uk
I am a Ph.D. student at the University of Edinburgh (2020-), advised by Professor Mirella Lapata, and currently a research intern at the Allen Institute for AI.
I finished my M.S. at Columbia University (2018-2020) with Professor John Cunningham and my B.S. at Peking University (2013-2018) with Professor Yansong Feng.
Before my Ph.D., I had a great time visiting Professor Alexander Rush at Cornell Tech (2019-2020).
I study large-scale probabilistic generative models for human language.
In the era of large language models, my research focuses on specialized language models, complex reasoning, emergent abilities, and how to inject strong abilities into language models from first principles. My article tracing emergent abilities to their sources serves as a roadmap for the evolution of large language models.
Before the LLM era, I studied latent variable models for language generation and structured prediction.
Selected Work
- [Blog Post 2022] How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources
- Yao Fu, Hao Peng and Tushar Khot
- Analysing the sources of emergent abilities of large language models from first principles.
- Reached top 3 trending on Hacker News.
- [arXiv 2023] Specializing Smaller Language Models towards Multi-Step Reasoning. [paper]
- Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, and Tushar Khot
- Trading a language model’s generic abilities for specialized math chain-of-thought ability.
- [ICLR 2023] Complexity-Based Prompting for Multi-Step Reasoning. [paper][code]
- Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark and Tushar Khot
- State-of-the-art reasoning performance on math word problems by prompting GPT-3 with instances of complex reasoning chains (a minimal sketch follows this list).
- [ICML 2022] Scaling Structured Inference with Randomization. [paper][code]
- Yao Fu, John P. Cunningham and Mirella Lapata
- A family of randomized dynamic programming algorithms for scaling up classical structured prediction, covering different inference problems (partition function, marginals, entropy, reparameterization) over different structures (chains, trees, and general sum-product).
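Below is a minimal sketch of the idea behind complexity-based prompting, in plain Python. The exemplar pool, the step-counting heuristic, and the helper names are hypothetical illustrations, not the paper’s actual code (see the [code] link above for that).

```python
# Complexity-based prompting, minimal sketch (hypothetical helpers and data).
# Idea: choose the few-shot exemplars whose chains of thought contain the MOST
# reasoning steps, then prepend them to the test question.

def count_steps(chain_of_thought: str) -> int:
    # Heuristic: one newline-separated sentence = one reasoning step.
    return sum(1 for line in chain_of_thought.split("\n") if line.strip())

def build_prompt(pool, question, k=8):
    # pool: list of (question, chain_of_thought, answer) candidate exemplars.
    # Keep the k most complex exemplars, i.e. those with the most steps.
    exemplars = sorted(pool, key=lambda ex: count_steps(ex[1]), reverse=True)[:k]
    shots = [f"Question: {q}\n{cot}\nThe answer is {a}." for q, cot, a in exemplars]
    return "\n\n".join(shots + [f"Question: {question}"])
```

The paper pairs this with a complexity-based criterion at decoding time as well: sample multiple reasoning chains and take the majority answer among the most complex ones.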
Preprints and Conference Publications
- [ICLR 2023] Decomposed Prompting: A Modular Approach for Solving Complex Tasks. [paper]
- Tushar Khot, Harsh Trivedi, Matthew Finlayson, Yao Fu, Kyle Richardson, Peter Clark and Ashish Sabharwal
- Decomposing complex tasks into simpler sub-tasks, then solving each by prompting language models.
- [arXiv 2022] Latent Topology Induction for Understanding Contextualized Representations. [paper]
- Yao Fu and Mirella Lapata
- Discovering hidden geometric structures of pretrained language models by unsupervised induction of a latent network.
- [TACL 2022] Data-to-text Generation with Variational Sequential Planning. [paper][code]
- Ratish Puduppully, Yao Fu and Mirella Lapata
- A latent planning model for generating very long documents.
- [NAACL 2021] Noisy-Labeled NER with Confidence Estimation. [paper][code]
- Kun Liu*, Yao Fu*, Chuanqi Tan, Mosha Chen, Ningyu Zhang, Songfang Huang and Sheng Gao. *Equal contribution.
- A confidence estimation method for measuring label noise in NER annotations, together with a training method based on partial marginalization according to the estimated noise.
- [ICLR 2021] Probing BERT in Hyperbolic Spaces. [paper][code]
- Boli Chen*, Yao Fu*, Guangwei Xu, Pengjun Xie, Chuanqi Tan, Mosha Chen, Liping Jing. *Equal contribution.
- A Poincaré probe for recovering hierarchical structures from contextualized representations. Applied to probing syntax and sentiment in BERT (see the sketch after this list).
- [ICLR 2021] Prototypical Representation Learning for Relation Extraction. [paper][code]
- Ning Ding, Xiaobin Wang, Yao Fu, Guangwei Xu, Rui Wang, Pengjun Xie, Ying Shen, Fei Huang, Hai-Tao Zheng, Rui Zhang
- A representation learning method for embedding relation prototypes on hyperspheres. Applied to supervised, semi-supervised, and few-shot relational learning.
- [AAAI 2021] Nested Named Entity Recognition with Partially Observed TreeCRFs. [paper][code]
- Yao Fu*, Chuanqi Tan*, Mosha Chen, Songfang Huang, Fei Huang. *Equal contribution.
- A Masked Inside algorithm for efficient partial marginalization of TreeCRFs. Applied to nested NER.
- [NeurIPS 2020] Latent Template Induction with Gumbel-CRFs. [paper][code]
- Yao Fu, Chuanqi Tan, Mosha Chen, Bin Bi, Yansong Feng and Alexander Rush.
- A Gumbel-FFBS algorithm for reparameterizing and relaxing CRFs. Applied to controllable text generation with latent templates.
- [NeurIPS 2019] Paraphrase Generation with Latent Bag of Words. [paper][code]
- Yao Fu, Yansong Feng and John Cunningham.
- A differentiable planning-and-realization model with a latent bag of words via Gumbel-top-k reparameterization. Applied to paraphrase generation (see the sketch after this list).
- [INLG 2019] Rethinking Text Attribute Transfer: A Lexical Analysis. [paper][code]
- Yao Fu, Hao Zhou, Jiaze Chen and Lei Li.
- A series of text mining algorithms for discovering words with strong influence on classification. Applied to analysing text attribute transfer models.
- [NAACL 2018] Natural Answer Generation with Heterogeneous Memory. [paper]
- Yao Fu and Yansong Feng.
- An attention mechanism fusing information from different sources of knowledge. Applied to answer sentence generation.
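As a companion to the Poincaré probe entry above, here is a minimal sketch of the Poincaré distance such a probe measures. It is illustrative only: the paper additionally learns a projection from BERT representations into the Poincaré ball, which this snippet omits.

```python
# Poincaré distance on the open unit ball, minimal sketch (the probe in the
# paper also learns a projection from BERT representations into the ball).
import torch

def poincare_distance(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # d(u, v) = arcosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))
    sq_dist = (u - v).pow(2).sum(dim=-1)
    denom = (1 - u.pow(2).sum(dim=-1)) * (1 - v.pow(2).sum(dim=-1))
    return torch.acosh(1 + 2 * sq_dist / denom)
```

Points near the boundary of the ball are exponentially far apart, which is what makes this geometry a natural fit for embedding trees and hierarchies.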
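And as referenced in the Latent Bag of Words entry above, here is a minimal sketch of Gumbel-top-k at sampling time. All names are hypothetical, and the trained model uses a relaxed, differentiable variant rather than this hard version.

```python
# Gumbel-top-k sampling, minimal sketch (illustrative; the paper uses a
# relaxed, differentiable variant for end-to-end training).
import torch

def gumbel_topk(logits: torch.Tensor, k: int) -> torch.Tensor:
    # Perturb each logit with i.i.d. Gumbel(0, 1) noise; the indices of the
    # k largest perturbed logits are a sample of k items without replacement.
    gumbel = -torch.log(-torch.log(torch.rand_like(logits)))
    return torch.topk(logits + gumbel, k).indices

# Example: draw a 5-word latent bag of words from vocabulary-level scores.
bag = gumbel_topk(torch.randn(10_000), k=5)
```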
Workshop Publications
- [EMNLP FigLang 2022] Just DREAM about it: Figurative Language Understanding with DREAM-FLUTE. [paper][code]
- The Third Workshop on Figurative Language Processing, in conjunction with EMNLP 2022.
- Yuling Gu, Yao Fu, Valentina Pyatkin, Ian Magnusson, Bhavana Dalvi Mishra and Peter Clark
- Ranked first on the task leaderboard. A mental model utilizing scene elaboration for understanding figurative language.
Blog and Open Source
Teaching
- Peking University. Empirical Methods for Natural Language Processing. 2022 Spring.
- Guest lecture on Text Generation. Taught by Yansong Feng.
- University of Edinburgh. Natural Language Understanding. 2022 Spring.
- Teaching Assistant. Taught by Alexandra Birch, Frank Keller, and Laura Perez.
- University of Edinburgh. Probabilistic Modeling and Reasoning. 2022 Spring.
- Teaching Assistant. Taught by Michael Gutmann.
- Peking University. Empirical Methods for Natural Language Processing. 2021 Spring.
- Guest lecture on Text Generation. Taught by Yansong Feng.
- Alibaba DAMO Academy. Advanced Probabilistic Machine Learning Seminar. 2020 Spring.
- Columbia University. COMS 4995 Applied Machine Learning. 2019 Spring.
- Course Assistant. Taught by Andreas Müller.
Internships
- Jul 22 - present. Allen Institute for Artificial Intelligence. Research Intern. Seattle.
- Jan 20 - Oct 20. Alibaba DAMO Academy. Research Intern. Beijing and Hangzhou.
- May 19 - Aug 19. Tencent AI Lab. Research Intern. Seattle.
- Dec 17 - Aug 18. ByteDance AI Lab. Research Intern. Beijing.