Research Interests
Building a bridge between languages and other modalities.
Publications [Google Scholar]
(#: students I mentored at Microsoft Research)
-
Multimodal Latent Language Modeling with Next-Token Diffusion
Yutao Sun#*, Hangbo Bao*, Wenhui Wang*, Zhiliang Peng*, Li Dong*, Shaohan Huang, Jianyong Wang, Furu Wei.
arXiv:2412.08635, 2024.
pdf bib code -
Differential Transformer
Tianzhu Ye#*, Li Dong*, Yuqing Xia*, Yutao Sun#*, Yi Zhu, Gao Huang, Furu Wei.
arXiv:2410.05258, 2024.
pdf bib code -
RedStone: Curating General, Code, Math, and QA Data for Large Language Models
RedStone v-team.
arXiv:2412.03398, 2024.
pdf bib code -
Data Selection via Optimal Control for Language Models
Yuxian Gu#, Li Dong, Hongning Wang, Yaru Hao, Qingxiu Dong#, Furu Wei, Minlie Huang.
arXiv:2410.07064, 2024.
pdf bib code -
Self-Boosting Large Language Models with Synthetic Preference Data
Qingxiu Dong#, Li Dong, Xingxing Zhang, Zhifang Sui, Furu Wei.
arXiv:2410.06961, 2024.
pdf bib -
You Only Cache Once: Decoder-Decoder Architectures for Language Models
Yutao Sun#*, Li Dong*, Yi Zhu, Shaohan Huang, Wenhui Wang, Shuming Ma, Quanlu Zhang, Jianyong Wang, Furu Wei.
Neural Information Processing Systems (NeurIPS), Oral, 2024.
pdf bib code -
Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models
Wenshan Wu, Shaoguang Mao, Yadong Zhang, Yan Xia, Li Dong, Lei Cui, Furu Wei.
Neural Information Processing Systems (NeurIPS), 2024.
pdf bib -
Multi-Head Mixture-of-Experts
Xun Wu, Shaohan Huang, Wenhui Wang, Shuming Ma, Li Dong, Furu Wei.
Neural Information Processing Systems (NeurIPS), 2024.
pdf bib -
Direct Preference Knowledge Distillation for Large Language Models
Yixing Li#, Yuxian Gu#, Li Dong, Dequan Wang, Yu Cheng, Furu Wei.
arXiv:2406.19774, 2024.
pdf bib code -
Towards Optimal Learning of Language Models
Yuxian Gu#, Li Dong, Yaru Hao, Qingxiu Dong#, Minlie Huang, Furu Wei.
arXiv:2402.17759, 2024.
pdf bib code -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Shuming Ma*, Hongyu Wang*, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei.
arXiv:2402.17764, 2024.
pdf bib -
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Haoran Li*, Qingxiu Dong#*, Zhengyang Tang*, Chaojun Wang*, Xingxing Zhang*, Haoyang Huang*, Shaohan Huang, Xiaolong Huang, Zeqiang Huang, Dongdong Zhang, Yuxian Gu#, Xin Cheng, Xun Wang, Si-Qing Chen, Li Dong, Wei Lu, Zhifang Sui, Benyou Wang, Wai Lam, Furu Wei.
arXiv:2402.13064, 2024.
pdf bib -
Kosmos-E: Learning to Follow Instruction for Robotic Grasping
Zhi Wang, Xun Wu, Shaohan Huang, Li Dong, Wenhui Wang, Shuming Ma, Furu Wei.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024.
pdf bib code -
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
Xichen Pan#, Li Dong, Shaohan Huang, Zhiliang Peng#, Wenhu Chen, Furu Wei.
International Conference on Learning Representations (ICLR), 2024.
pdf bib code -
Kosmos-2: Grounding Multimodal Large Language Models to the World
Zhiliang Peng#*, Wenhui Wang*, Li Dong*, Yaru Hao, Shaohan Huang, Shuming Ma, Furu Wei.
International Conference on Learning Representations (ICLR), 2024.
pdf bib code demo -
Knowledge Distillation of Large Language Models
Yuxian Gu#, Li Dong, Furu Wei, Minlie Huang.
International Conference on Learning Representations (ICLR), 2024.
pdf bib code -
BioCLIP: A Vision Foundation Model for the Tree of Life
Samuel Stevens, Jiaman Wu, Matthew J Thompson, Elizabeth G Campolongo, Chan Hee Song, David Edward Carlyn, Li Dong, Wasila M Dahdul, Charles Stewart, Tanya Berger-Wolf, Wei-Lun Chao, Yu Su.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
pdf bib code model demo -
BitNet: Scaling 1-bit Transformers for Large Language Models
Hongyu Wang*, Shuming Ma*, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Ruiping Wang, Yi Wu, Furu Wei.
arXiv:2310.11453, 2023.
pdf bib -
Kosmos-2.5: A Multimodal Literate Model
Tengchao Lv, Yupan Huang, Jingye Chen, Lei Cui, Shuming Ma, Yaoyao Chang, Shaohan Huang, Wenhui Wang, Li Dong, Weiyao Luo, Shaoxiang Wu, Guoxin Wang, Cha Zhang, Furu Wei.
arXiv:2309.11419, 2023.
pdf bib -
Large Language Model for Science: A Study on P vs. NP
Qingxiu Dong#*, Li Dong*, Ke Xu*, Guangyan Zhou, Yaru Hao, Zhifang Sui, Furu Wei.
arXiv:2309.05689, 2023.
pdf bib code -
Retentive Network: A Successor to Transformer for Large Language Models
Yutao Sun#*, Li Dong*, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei.
arXiv:2307.08621, 2023.
pdf bib code -
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding*, Shuming Ma*, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Furu Wei.
arXiv:2307.02486, 2023.
pdf bib code -
Language Is Not All You Need: Aligning Perception with Language Models
Shaohan Huang*, Li Dong*, Wenhui Wang*, Yaru Hao*, Saksham Singhal*, Shuming Ma*, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi#, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei.
Neural Information Processing Systems (NeurIPS), 2023.
pdf bib MetaLM -
Augmenting Language Models with Long-Term Memory
Weizhi Wang#, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei.
Neural Information Processing Systems (NeurIPS), 2023.
pdf bib code -
Optimizing Prompts for Text-to-Image Generation
Yaru Hao*, Zewen Chi#*, Li Dong, Furu Wei.
Neural Information Processing Systems (NeurIPS), Spotlight, 2023.
pdf bib code demo -
Extensible Prompts for Language Models
Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei.
Neural Information Processing Systems (NeurIPS), 2023.
pdf bib -
Pre-Training to Learn in Context
Yuxian Gu#, Li Dong, Furu Wei, Minlie Huang.
Association for Computational Linguistics (ACL), 2023.
pdf bib code -
A Length-Extrapolatable Transformer
Yutao Sun#, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Xia Song, Furu Wei.
Association for Computational Linguistics (ACL), 2023.
pdf bib code -
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
Damai Dai#, Yutao Sun#, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei.
Findings of Association for Computational Linguistics (Findings of ACL), 2023.
pdf bib -
Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning
Barun Patra*, Saksham Singhal*, Shaohan Huang*, Zewen Chi#, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song.
Association for Computational Linguistics (ACL), 2023.
pdf bib -
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li.
Association for Computational Linguistics (ACL), 2023.
pdf bib -
Magneto: A Foundation Transformer
Hongyu Wang*, Shuming Ma*, Shaohan Huang, Li Dong, Wenhui Wang, Zhiliang Peng#, Yu Wu, Payal Bajaj, Saksham Singhal, Alon Benhaim, Barun Patra, Zhun Liu, Vishrav Chaudhary, Xia Song, Furu Wei.
International Conference on Machine Learning (ICML), 2023.
pdf bib code -
Semi-Offline Reinforcement Learning for Optimized Text Generation
Changyu Chen, Xiting Wang, Yiqiao Jin, Victor Ye Dong, Li Dong, Rui Yan, Jim Cao, Yi Liu.
International Conference on Machine Learning (ICML), 2023.
pdf bib -
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
Wenhui Wang*, Hangbo Bao#*, Li Dong*, Johan Bjorck, Zhiliang Peng#, Qiang Liu, Kriti Aggarwal, Owais Khan Mohammed, Saksham Singhal, Subhojit Som, Furu Wei.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
pdf bib code VL-BEiT -
Non-Contrastive Learning Meets Language-Image Pre-Training
Jinghao Zhou#, Li Dong, Zhe Gan, Lijuan Wang, Furu Wei.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
pdf bib -
Generic-to-Specific Distillation of Masked Autoencoders
Wei Huang*, Zhiliang Peng#*, Li Dong, Furu Wei, Jianbin Jiao, Qixiang Ye.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
pdf bib code -
A Unified View of Masked Image Modeling
Zhiliang Peng#, Li Dong, Hangbo Bao#, Furu Wei, Qixiang Ye.
Transactions on Machine Learning Research, 2023.
pdf bib code -
Visually-Augmented Language Modeling
Weizhi Wang#, Li Dong, Hao Cheng, Haoyu Song#, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei.
International Conference on Learning Representations (ICLR), 2023.
pdf bib code -
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
Yuxin Fang#, Li Dong, Hangbo Bao#, Xinggang Wang, Furu Wei.
International Conference on Learning Representations (ICLR), Spotlight, 2023.
pdf bib code -
Prototypical Calibration for Few-shot Learning of Language Models
Zhixiong Han#, Yaru Hao, Li Dong, Yutao Sun#, Furu Wei.
International Conference on Learning Representations (ICLR), 2023.
pdf bib -
Structured Prompting: Scaling In-Context Learning to 1,000 Examples
Yaru Hao*, Yutao Sun#*, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei.
arXiv:2212.06713, 2022.
pdf bib code -
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers
Zhiliang Peng#, Li Dong, Hangbo Bao#, Qixiang Ye, Furu Wei.
arXiv:2208.06366, 2022.
pdf bib code -
Language Models are General-Purpose Interfaces
Yaru Hao*, Haoyu Song#*, Li Dong*, Shaohan Huang, Zewen Chi#, Wenhui Wang, Shuming Ma, Furu Wei.
arXiv:2206.06336, 2022.
pdf bib -
On the Representation Collapse of Sparse Mixture of Experts
Zewen Chi#, Li Dong, Shaohan Huang, Damai Dai#, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Furu Wei.
Neural Information Processing Systems (NeurIPS), 2022.
pdf bib code -
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
Hangbo Bao#*, Wenhui Wang*, Li Dong, Qiang Liu, Owais Khan Mohammed, Kriti Aggarwal, Subhojit Som, Furu Wei.
Neural Information Processing Systems (NeurIPS), 2022.
pdf bib code -
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao#, Li Dong, Songhao Piao, Furu Wei.
International Conference on Learning Representations (ICLR), Oral, 2022.
pdf bib code -
AdaPrompt: Adaptive Model Training for Prompt-based NLP
Yulong Chen, Yang Liu, Li Dong, Shuohang Wang, Chenguang Zhu, Michael Zeng, Yue Zhang.
Findings of Empirical Methods in Natural Language Processing (Findings of EMNLP), 2022.
pdf bib -
CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation
Jian Yang, Shaohan Huang, Shuming Ma, Yuwei Yin, Li Dong, Dongdong Zhang, Hongcheng Guo, Zhoujun Li, Furu Wei.
Findings of Empirical Methods in Natural Language Processing (Findings of EMNLP), 2022.
pdf bib -
Knowledge Neurons in Pretrained Transformers
Damai Dai#, Li Dong, Yaru Hao#, Zhifang Sui, Furu Wei.
Association for Computational Linguistics (ACL), 2022.
pdf bib -
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
Zewen Chi#*, Shaohan Huang*, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei.
Association for Computational Linguistics (ACL), 2022.
pdf bib -
StableMoE: Stable Routing Strategy for Mixture of Experts
Damai Dai#, Li Dong, Shuming Ma, Bo Zheng#, Zhifang Sui, Baobao Chang, Furu Wei.
Association for Computational Linguistics (ACL), 2022.
pdf bib code -
Controllable Natural Language Generation with Contrastive Prefixes
Jing Qian#, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen.
Findings of Association for Computational Linguistics (Findings of ACL), 2022.
pdf bib -
CLIP Models are Few-Shot Learners: Empirical Studies on VQA and Visual Entailment
Haoyu Song#, Li Dong, Weinan Zhang, Ting Liu, Furu Wei.
Association for Computational Linguistics (ACL), 2022.
pdf bib -
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption
Tianyu Chen*, Hangbo Bao#*, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, Jianxin Li, Furu Wei.
Findings of Association for Computational Linguistics (Findings of ACL), 2022.
pdf bib -
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
pdf bib code -
Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training
Bo Zheng#, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei.
Empirical Methods in Natural Language Processing (EMNLP), 2021.
pdf bib code -
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Zewen Chi#, Li Dong, Shuming Ma, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei.
Empirical Methods in Natural Language Processing (EMNLP), 2021.
pdf bib -
Zero-shot Cross-lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders
Guanhua Chen, Shuming Ma, Yun Chen, Li Dong, Dongdong Zhang, Jia Pan, Wenping Wang, Furu Wei.
Empirical Methods in Natural Language Processing (EMNLP), 2021.
pdf bib -
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment
Zewen Chi#, Li Dong, Bo Zheng#, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei.
Association for Computational Linguistics (ACL), 2021.
pdf bib code -
Consistency Regularization for Cross-Lingual Fine-Tuning
Bo Zheng#, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi#, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei.
Association for Computational Linguistics (ACL), 2021.
pdf bib code -
Learning to Sample Replacements for ELECTRA Pre-Training
Yaru Hao#, Li Dong, Hangbo Bao#, Ke Xu, Furu Wei.
Findings of Association for Computational Linguistics (Findings of ACL), 2021.
pdf bib code -
Memory-Efficient Differentiable Transformer Architecture Search
Yuekai Zhao#, Li Dong, Yelong Shen, Zhihua Zhang, Furu Wei, Weizhu Chen.
Findings of Association for Computational Linguistics (Findings of ACL), 2021.
pdf bib -
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei.
Findings of Association for Computational Linguistics (Findings of ACL), 2021.
pdf bib code -
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains
Yunzhi Yao, Shaohan Huang, Wenhui Wang, Li Dong, Furu Wei.
Findings of Association for Computational Linguistics (Findings of ACL), 2021.
pdf bib code -
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training
Zewen Chi#, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, Ming Zhou.
North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
pdf bib code -
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
Yaru Hao#, Li Dong, Furu Wei, Ke Xu.
AAAI Conference on Artificial Intelligence (AAAI), 2021.
pdf bib code -
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders
Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei.
arXiv:2106.13736, 2021.
pdf bib code -
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal, Xia Song, Arul Menezes, Furu Wei.
arXiv:2012.15547, 2020.
pdf bib code -
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao#, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon.
International Conference on Machine Learning (ICML), 2020.
pdf bib code -
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou.
Neural Information Processing Systems (NeurIPS), 2020.
pdf bib code -
Cross-Lingual Natural Language Generation via Pre-Training
Zewen Chi#, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao, Heyan Huang.
AAAI Conference on Artificial Intelligence (AAAI), 2020.
pdf code bib -
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li, Xi Yin, Chunyuan Li, Xiaowei Hu, Pengchuan Zhang, Lei Zhang, Lijuan Wang, Houdong Hu, Li Dong, Furu Wei, Yejin Choi, Jianfeng Gao.
European Conference on Computer Vision (ECCV), 2020.
pdf bib code blog -
Harvesting and Refining Question-Answer Pairs for Unsupervised QA
Zhongli Li#, Wenhui Wang, Li Dong, Furu Wei, Ke Xu.
Association for Computational Linguistics (ACL), 2020.
pdf code bib -
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong*, Nan Yang*, Wenhui Wang*, Furu Wei*, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon.
Neural Information Processing Systems (NeurIPS), 2019.
pdf code bib -
Visualizing and Understanding the Effectiveness of BERT
Yaru Hao#, Li Dong, Furu Wei, Ke Xu.
Empirical Methods in Natural Language Processing (EMNLP), 2019.
pdf bib -
Data-to-text Generation with Entity Modeling
Ratish Puduppully, Li Dong, Mirella Lapata.
Association for Computational Linguistics (ACL), 2019.
pdf code bib -
Learning to Ask Unanswerable Questions for Machine Reading Comprehension
Haichao Zhu#, Li Dong, Furu Wei, Wenhui Wang, Bing Qin, Ting Liu.
Association for Computational Linguistics (ACL), 2019.
pdf data bib -
Data-to-Text Generation with Content Selection and Planning
Ratish Puduppully, Li Dong, Mirella Lapata.
AAAI Conference on Artificial Intelligence (AAAI), 2019.
pdf code data bib -
Coarse-to-Fine Decoding for Neural Semantic Parsing
Li Dong, Mirella Lapata.
Association for Computational Linguistics (ACL), 2018.
pdf code bib -
Confidence Modeling for Neural Semantic Parsing
Li Dong, Chris Quirk, Mirella Lapata.
Association for Computational Linguistics (ACL), 2018.
pdf code bib -
Learning to Paraphrase for Question Answering
Li Dong, Jonathan Mallinson, Siva Reddy, Mirella Lapata.
Empirical Methods in Natural Language Processing (EMNLP), 2017.
pdf bib -
Learning to Generate Product Reviews from Attributes
Li Dong, Shaohan Huang, Furu Wei, Mirella Lapata, Ming Zhou, Ke Xu.
European Chapter of the Association for Computational Linguistics (EACL), 2017.
pdf code data bib -
Language to Logical Form with Neural Attention
Li Dong, Mirella Lapata.
Association for Computational Linguistics (ACL), 2016.
pdf code bib slides -
Long Short-Term Memory-Networks for Machine Reading
Jianpeng Cheng, Li Dong, Mirella Lapata.
Empirical Methods in Natural Language Processing (EMNLP), 2016.
pdf code bib -
Solving and Generating Chinese Character Riddles
Chuanqi Tan, Furu Wei, Li Dong, Weifeng Lv, Ming Zhou.
Empirical Methods in Natural Language Processing (EMNLP), 2016.
pdf bib -
Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction
Yichun Yin, Furu Wei, Li Dong, Kaimeng Xu, Ming Zhang, Ming Zhou.
International Joint Conference on Artificial Intelligence (IJCAI), 2016.
pdf bib -
Question Answering over Freebase with Multi-Column Convolutional Neural Networks
Li Dong, Furu Wei, Ming Zhou, Ke Xu.
Association for Computational Linguistics (ACL), 2015.
pdf bib slides -
A Hybrid Neural Model for Type Classification of Entity Mentions
Li Dong, Furu Wei, Hong Sun, Ming Zhou, Ke Xu.
International Joint Conference on Artificial Intelligence (IJCAI), 2015.
pdf bib slides -
Ranking with Recursive Neural Networks and Its Application to Multi-document Summarization
Ziqiang Cao, Furu Wei, Li Dong, Sujian Li, Ming Zhou.
AAAI Conference on Artificial Intelligence (AAAI), 2015.
pdf bib -
Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification
Li Dong, Furu Wei, Chuanqi Tan, Duyu Tang, Ming Zhou, Ke Xu.
Association for Computational Linguistics (ACL), Short paper, 2014.
pdf data bib -
Adaptive Multi-Compositionality for Recursive Neural Models with Applications to Sentiment Analysis
Li Dong, Furu Wei, Ming Zhou, Ke Xu.
AAAI Conference on Artificial Intelligence (AAAI), 2014.
pdf bib slides -
A Joint Segmentation and Classification Framework for Sentiment Analysis
Duyu Tang, Furu Wei, Bing Qin, Li Dong, Ting Liu, Ming Zhou.
Empirical Methods in Natural Language Processing (EMNLP), 2014.
pdf bib -
The Automated Acquisition of Suggestions from Tweets
Li Dong, Furu Wei, Yajuan Duan, Xiaohua Liu, Ming Zhou, Ke Xu.
AAAI Conference on Artificial Intelligence (AAAI), 2013.
pdf slides data bib -
MoodLens: An Emoticon-Based Sentiment Analysis System for Chinese Tweets
Jichang Zhao, Li Dong, Junjie Wu, Ke Xu.
Knowledge Discovery and Data Mining (KDD), Demo paper, 2012.
pdf demo poster video data bib
-
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task
Jian Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Li Dong, Shaohan Huang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei.
Sixth Conference on Machine Translation (WMT), 2021.
pdf bib code (The system submission is based on DeltaLM.) -
Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension
Hangbo Bao#, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Lei Cui, Songhao Piao, Ming Zhou.
Proceedings of the 2nd Workshop on Machine Reading for Question Answering (MRQA), 2019.
pdf bib code -
Splusplus: A Feature-Rich Two-stage Classifier for Sentiment Analysis of Tweets
Li Dong, Furu Wei, Yichun Yin, Ming Zhou, Ke Xu.
International Workshop on Semantic Evaluation (SemEval), 2015.
pdf bib
-
DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang*, Shuming Ma*, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei.
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024.
pdf bib code -
Generic-to-Specific Distillation of Masked Autoencoders
Wei Huang*, Zhiliang Peng#*, Li Dong, Furu Wei, Jianbin Jiao, Qixiang Ye.
IEEE Transactions on Circuits and Systems for Video Technology, 2024.
pdf bib code -
Transforming Wikipedia into Augmented Data for Query-Focused Summarization
Haichao Zhu#, Li Dong, Furu Wei, Bing Qin, Ting Liu.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2022.
pdf bib -
Adaptive Multi-Compositionality for Recursive Neural Network Models
Li Dong, Furu Wei, Ke Xu, Shixia Liu, Ming Zhou.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2015.
online -
A Statistical Parsing Framework for Sentiment Classification
Li Dong, Furu Wei, Shujie Liu, Ming Zhou, Ke Xu.
Computational Linguistics, 2015.
pdf bib slides -
A Joint Segmentation and Classification Framework for Sentence Level Sentiment Classification
Duyu Tang, Bing Qin, Furu Wei, Li Dong, Ting Liu, Ming Zhou.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2015.
online -
Unraveling the origin of exponential law in intra-urban human mobility
Xiao Liang, Jichang Zhao, Li Dong, Ke Xu.
Scientific Reports, 2013.
online -
Performance of Local Information Based Link Prediction: A Sampling Perspective
Jichang Zhao, Xu Feng, Li Dong, Xiao Liang, Ke Xu.
Journal of Physics A: Mathematical and Theoretical, 45, 345001, 2012.
online
-
Learning Natural Language Interfaces with Neural Models
The University of Edinburgh, 2019.
online AIMatters (invited)
Experience
-
Dec. 2018 - Now, Natural Language Computing Group, Microsoft Research Asia. Principal Researcher.
-
Sept. 2015 - July 2019, Institute for Language, Cognition and Computation (ILCC), University of Edinburgh. Supervisor: Mirella Lapata.
-
June 2017 - Aug. 2017, Natural Language Processing Group, Microsoft Research Redmond. Mentor: Chris Quirk.
-
May 2012 - Sept. 2015, Natural Language Computing Group, Microsoft Research Asia. Mentor: Furu Wei.
-
Oct. 2010 - Sept. 2015, State Key Laboratory of Software Development Environment, Beihang University. Supervisor: Ke Xu.
Honors and Awards
-
CVPR-24 Best Student Paper Award, 2024
-
AAAI-21 Best Paper Runner Up, 2021
-
2019 AAAI/ACM SIGAI Doctoral Dissertation Award (Runner Up), 2020
-
ACL-18 Best Paper Honourable Mention, 2018
-
Adeptmind Scholar Fellowship, 2018
-
Microsoft Research Asia Fellowship, 2015
-
Graduate Student Scholarship 1st, 2012
-
AirBus Scholarship, 2011
-
National Scholarship, 2010/2014
-
Qian-Shen Scholarship 1st, 2009
Professional Activities
-
Area Chair: EMNLP 2019 (Sentence-level Semantics), EMNLP 2020 (Semantics: Sentence-level Semantics, Textual Inference and Other areas), NAACL 2021 (Machine Learning for NLP: Language Modeling and Sequence to Sequence Models), ACL 2021 (Question Answering), EMNLP 2022 (Machine Learning for NLP; NLP Applications; Semantics: Lexical, Sentence level, Textual Inference and Other areas), ACL 2023 (Semantics: Sentence-level Semantics, Textual Inference, and Other Areas), NeurIPS 2023, COLM 2024, NeurIPS 2024, COLING 2025 (Natural Language Generation, Summarization, and Simplification), ICLR 2025, ICML 2025
-
Senior Area Chair: ACL 2025 (Machine Learning for NLP), IJCNLP-AACL 2023 (Multimodality: Speech, Vision, Robotics, and Beyond)
-
Action Editor: ACL Rolling Review (NAACL 2022 Outstanding Action Editor), Transactions on Machine Learning Research
-
Senior Program Committee: IJCAI 2021, IJCAI 2025
-
Session Chair: EMNLP 2019 (Sentence-level Semantics), AACL 2020 (Machine Learning for NLP)
-
Program Committee / Reviewer: ACL (ACL-18 Outstanding Reviewer), EMNLP, NAACL, ICML, NeurIPS, ICLR, AAAI, IJCAI, COLING, CL, TACL (Standing Reviewer: 2020-2023), ACL Rolling Review, TPAMI, CVPR, ECCV, Nature Machine Intelligence, TKDE, etc.