Xingxing Zhang (张星星)
Pronounced as Shingshing Zhang
E-mail:
Address:
Building 2, No. 5 Dan Ling Street,
Haidian District,
Beijing, 100080
China
Hi! I am a Principal Researcher in the General Artificial Intelligence Group, Microsoft Research Asia (MSRA).
Before joining MSRA, I completed my Ph.D. at the Institute for Language, Cognition, and Computation in the School of Informatics, University of Edinburgh,
where I worked on Language Modeling and Generation under the supervision of Prof. Mirella Lapata and Prof. Adam Lopez.
I am always looking for highly motivated interns to work with me on Large Language Models. Feel free to drop me an email if you are interested.
Research Interests
Large Language Models (LLMs), Post-training, Complex Reasoning, Synthetic Data and Scalable Oversight
Publications
Google Scholar
Scaling Laws of Synthetic Data for Language Models Zeyu Qin, Qingxiu Dong, Xingxing Zhang, Li Dong, Xiaolong Huang, Ziyi Yang, Mahmoud Khademi, Dongdong Zhang, Hany Hassan Awadalla, Yi R. Fung, Weizhu Chen, Minhao Cheng, Furu Wei. In COLM 2025.
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Haoran Li*, Qingxiu Dong*, Zhengyang Tang*, Chaojun Wang*, Xingxing Zhang*, Haoyang Huang*, Shaohan Huang, Xiaolong Huang, Zeqiang Huang, Dongdong Zhang, Yuxian Gu, Xin Cheng, Xun Wang, Si-Qing Chen, Li Dong, Wei Lu, Zhifang Sui, Benyou Wang, Wai Lam, Furu Wei. In TMLR 2025.
[arXiv] [bib]
BitNet b1.58 2B4T Technical Report Shuming Ma, Hongyu Wang, Shaohan Huang, Xingxing Zhang, Ying Hu, Ting Song, Yan Xia, Furu Wei. In ArXiv 2025.
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale Jiaxi Li, Xingxing Zhang, Xun Wang, Xiaolong Huang, Li Dong, Liang Wang, Si-Qing Chen, Wei Lu, Furu Wei. In ArXiv 2025.
Preference Optimization for Reasoning with Pseudo Feedback Fangkai Jiao, Geyang Guo, Xingxing Zhang, Nancy F. Chen, Shafiq Joty, Furu Wei. In ICLR 2025.
Self-Boosting Large Language Models with Synthetic Preference Data Qingxiu Dong, Li Dong, Xingxing Zhang, Zhifang Sui, Furu Wei. In ICLR 2025.
RedStone: Curating general, code, math, and QA data for large language models Yaoyao Chang, Lei Cui, Li Dong, Shaohan Huang, Yangyu Huang, Yupan Huang, Scarlett Li, Tengchao Lv, Shuming Ma, Qinzheng Sun, Wenhui Wang, Furu Wei, Ying Xin, Mao Yang, Qiufeng Yin, Xingxing Zhang. In ArXiv 2024.
xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token Xin Cheng, Xun Wang, Xingxing Zhang, Tao Ge, Si-Qing Chen, Furu Wei, Huishuai Zhang, Dongyan Zhao. In NeurIPS 2024.
MathScale: Scaling Instruction Tuning for Mathematical Reasoning Zhengyang Tang, Xingxing Zhang, Benyou Wang, Furu Wei. In ICML 2024.
[MWPBench]
LongNet: Scaling Transformers to 1,000,000,000 Tokens Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei. In ArXiv 2023.
Tuna: Instruction Tuning using Feedback from Large Language Models Haoran Li, Yiran Liu, Xingxing Zhang, Wei Lu, Furu Wei. In EMNLP 2023.
[arXiv]
Momentum Calibration for Text Generation Xingxing Zhang*, Yiran Liu♣*, Xun Wang, Pengcheng He, Yang Yu, Si-Qing Chen, Wayne Xiong, Furu Wei. In ArXiv 2022.
Latent Prompt Tuning for Text Summarization Yubo Zhang♣, Xingxing Zhang, Xun Wang, Si-Qing Chen, Furu Wei. In ArXiv 2022.
Unsupervised Multi-Granularity Summarization Ming Zhong, Yang Liu, Suyu Ge, Yuning Mao, Yizhu Jiao, Xingxing Zhang, Yichong Xu, Chenguang Zhu, Michael Zeng, Jiawei Han. In EMNLP 2022.
[arXiv]
Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization Ruipeng Jia, Xingxing Zhang, Yanan Cao, Shi Wang, Zheng Lin, Furu Wei. In ACL 2022.
[arXiv]
Attention Temperature Matters in Abstractive Summarization Distillation Shengqiang Zhang, Xingxing Zhang, Hangbo Bao, Furu Wei. In ACL 2022.
[arXiv] [code]
Sequence Level Contrastive Learning for Text Summarization Shusheng Xu, Xingxing Zhang, Yi Wu, Furu Wei. In AAAI 2022 (15% Acceptance Rate).
[arXiv] [code]
Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers Shusheng Xu, Xingxing Zhang, Yi Wu, Furu Wei and Ming Zhou. In Findings of EMNLP 2020.
[bib] [arXiv] [code]
Pre-training for Abstractive Document Summarization by Reinstating Source Text Yanyan Zou, Xingxing Zhang, Wei Lu, Furu Wei and Ming Zhou. In EMNLP 2020.
[bib] [arXiv] [code]
Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction Mengyun Chen, Tao Ge, Xingxing Zhang, Furu Wei and Ming Zhou. In EMNLP 2020.
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue Xiaoze Jiang, Jing Yu, Zengchang Qin, Yingying Zhuang, Xingxing Zhang, Yue Hu, Qi Wu. In AAAI 2020.
[arXiv]
Document-Based Question Answering Improves Query-Focused Multi-document Summarization Weikang Li, Xingxing Zhang, Yunfang Wu, Furu Wei, Ming Zhou. In NLPCC 2019.
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization Xingxing Zhang, Furu Wei and Ming Zhou. In ACL 2019.
[bib] [arXiv] [slides] [project page]
Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study Tao Ge, Xingxing Zhang, Furu Wei and Ming Zhou. In ACL 2019.
[bib]
Neural Latent Extractive Document Summarization Xingxing Zhang, Mirella Lapata, Furu Wei and Ming Zhou. In EMNLP 2018 (short paper).
[bib] [arXiv] [resource]
Natural Language Generation as Neural Sequence Learning and Beyond Xingxing Zhang. Ph.D. Thesis, 2017.
[bib] [code & data] [slides]
UParse: the Edinburgh system for the CoNLL 2017 UD shared task Clara Vania, Xingxing Zhang, Adam Lopez. In CoNLL 2017 Shared Task.
[bib] [arXiv]
Sentence Simplification with Deep Reinforcement Learning Xingxing Zhang and Mirella Lapata. In EMNLP 2017.
[bib] [arXiv] [code & data] [slides]
Dependency Parsing as Head Selection Xingxing Zhang, Jianpeng Cheng and Mirella Lapata. In EACL 2017.
[bib] [arXiv] [resource] [code] [slides]
Top-down Tree Long Short-Term Memory Networks Xingxing Zhang, Liang Lu and Mirella Lapata. In NAACL 2016.
[bib] [arXiv] [code] [slides]
On Training the Recurrent Neural Network Encoder-Decoder for Large Vocabulary End-to-End Speech Recognition Liang Lu, Xingxing Zhang and Steve Renals. In ICASSP 2016.
A Study of the Recurrent Neural Network Encoder-Decoder for Large Vocabulary Speech Recognition Liang Lu, Xingxing Zhang, Kyunghyun Cho and Steve Renals. In INTERSPEECH 2015.
Chinese Poetry Generation with Recurrent Neural Networks Xingxing Zhang and Mirella Lapata. In EMNLP 2014.
[bib] [data] [code]
Software
Dress Simplification Model: DRESS: Deep REinforcement Sentence Simplification Model (in Lua with Torch)
DeNSe Parser: A Neural Dependency Parser (in Lua with Torch)
TD-TreeLSTM: Top-down Tree Long Short-Term Memory Networks (in Lua with Torch)
RNNPG: A (Hierarchical) Recurrent Neural Network based Chinese Poetry Generator (in C++)
Professional Activities
Area Chair for ACL 2021
Area Chair for INLG 2021
Area Chair (Action Editor) for ACL/ARR 2022
Area Chair (Action Editor) for NAACL/ARR 2022
Area Chair for ACL 2023
Area Chair for EMNLP 2024
Last updated: Oct, 2024