Wenzhen Zheng

About

I am currently working on foundation model pretraining at StepFun. My research interests include large language model pretraining, scaling laws, training stability, maximal-update parametrization (muP), and the Muon optimizer.

I received my M.S. from the Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and my B.S. in Mathematics from Shandong University.

News

  • 2026 · ICLR 2026 Oral 🎉 "Can Mixture-of-Experts Surpass Dense LLMs Under Strictly Equal Resources?" accepted.
  • 2026 · ICLR 2026 🎉 "How Many Code and Test Cases Are Enough?" accepted.
  • 2026 · Release 🎉 Step-3.5 Flash released. Blog · GitHub · Tech report.
  • 2025 · NeurIPS 2025 Spotlight 🎉 "Farseer: A Refined Scaling Law in Large Language Models" accepted.
  • 2025 · EMNLP 2025 🎉 "Debate-to-Detect: Reformulating Misinformation Detection as a Real-World Debate" accepted.
  • 2025.06 · Edu 🎉 Obtained M.S. from the University of Chinese Academy of Sciences.
  • 2025 · AAAI 2025 Oral 🎉 "Beyond Detection: Exploring Evidence-based Multi-Agent Debate …" accepted.
  • 2024.12 · Work 🎉 Joined StepFun (foundation model pretraining).
  • 2024 · EMNLP 2024 🎉 "Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale" accepted.

Publications

* equal contribution

  1. Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale
    EMNLP 2024
    W Zheng*, W Pan*, X Xu*, L Qin, L Yue, M Zhou
  2. StepLaw — Optimal Hyperparameter Scaling Law in Large Language Model Pretraining
    ★ Representative · Open-source
    Houyi Li*, Wenzhen Zheng*, Qiufeng Wang*, Hanshan Zhang, Zili Wang, Shijie Xuyang, Yuantao Fan, Shuigeng Zhou, Xiangyu Zhang, Daxin Jiang
  3. Farseer: A Refined Scaling Law in Large Language Models
    NeurIPS 2025 Spotlight
    Houyi Li*, Wenzhen Zheng*, Qiufeng Wang, Zhenyu Ding, Haoying Wang, Zili Wang, Shijie Xuyang, Ning Ding, Shuigeng Zhou, Xiangyu Zhang, Daxin Jiang
  4. Scaling Laws for Code: A More Data-Hungry Regime
    Preprint
    Xianzhen Luo*, Wenzhen Zheng*, Qingfu Zhu, Rongyi Zhang, Houyi Li, Siming Huang, YuanTao Fan, Wanxiang Che
  5. Debate-to-Detect: Reformulating Misinformation Detection as a Real-World Debate with Large Language Models
    EMNLP 2025
    C Han*, W Zheng*, X Tang
  6. Beyond Detection: Exploring Evidence-based Multi-Agent Debate for Misinformation Intervention & Persuasion
    AAAI 2025 Oral
    C Han, Y Ma, J Tan, W Zheng, X Tang
  7. How Many Code and Test Cases Are Enough? Evaluating Test Cases Generation from a Binary-Matrix Perspective
    ICLR 2026
    X Luo, J Huang, W Zheng, Q Zhu, M Xu, Y Xu, Y Fan, L Qin, W Che
  8. Can Mixture-of-Experts Surpass Dense LLMs Under Strictly Equal Resources?
    ICLR 2026 Oral
    H Li, KM Lo, Z Wang, Z Wang, W Zheng, S Zhou, X Zhang, D Jiang
  9. Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding
    Model report · Open-source
    StepFun Team
  10. Step 3.5 Flash: Fast, Sharp & Reliable Agentic Intelligence
    ★ Representative · Model report · Open-source
    StepFun Team
  11. Simulating social network with LLM agents: An analysis of information propagation and echo chambers
    KSS 2024 Oral
    W Zheng, X Tang

Experience

2024.12 – Present
StepFun (阶跃星辰) · Foundation Model Pretraining
Advised by Xiangyu Zhang. Contributed to Step-3 and Step-3.5 Flash.
2024.05 – 2024.10
Meituan (美团) · LLM Pretraining Intern
2023.07 – 2024.03
Langboat Technology (澜舟科技) · Foundation Model Pretraining Intern

Education

University of Chinese Academy of Sciences · M.S.
2022.09 – 2025.06
Academy of Mathematics and Systems Science, Chinese Academy of Sciences.
Shandong University · B.S. in Mathematics
2018.09 – 2022.06

Awards & Honors

  • MCM/ICM F Award (Finalist, top 1%)
  • National College Math Competition (Math Major Category A) Second Prize
  • Hua Luogeng Scholarship, CAS Institute of Mathematics
  • Shandong Province College Physics Competition First Prize
  • National High School Math League (Anhui Province) First Prize