Ziran Yang

I am a first-year Ph.D. student in the Department of Electrical and Computer Engineering at Princeton University, advised by Prof. Chi Jin. Previously, I completed my undergraduate studies at Yuanpei College, Peking University. I am interested in the intersection of RL and LLMs, with a particular focus on certifiable reasoning.

Email  /  Google Scholar  /  GitHub

profile photo
Recent Interests

I view exploration as the core challenge in RL, and test-time search as necessary to achieve it for LLM agents. The key technical problem is how search procedures and expert decision-making systems can be internalized as reasoning ability, rather than remaining external scaffolding. I see two tightly coupled aspects: backfilling expert search behavior into the model through learning, and on-the-fly calibration that lets the model assess its uncertainty and decide when to search, explore, or trust its own prediction.

Publications

Goedel-Code-Prover: Hierarchical Proof Search for Open State-of-the-Art Code Verification
Zenan Li*, Ziran Yang*, Deyuan He, Haoyu Zhao, Andrew Zhao, Shange Tang, Kaiyu Yang, Aarti Gupta, Zhendong Su, Chi Jin
Project / Paper / Model / Code / X


Goedel-Prover-V2: The Strongest Open-Source Theorem Prover to Date
Yong Lin*, Shange Tang*, Bohan Lyu*, Ziran Yang*, Jui-Hui Chung*, Haoyu Zhao*, Lai Jiang*, Yihan Geng*, Jiawei Ge, Jingruo Sun, Jiayun Wu, Jiri Gesi, David Acuna, Kaiyu Yang, Hongzhou Lin*, Yejin Choi, Danqi Chen, Sanjeev Arora, Chi Jin*
AI4MATH Workshop @ ICML 2025 (Oral); ICLR 2026


ALGOVERI: An Aligned Benchmark for Verified Code Generation on Classical Algorithms
Haoyu Zhao*, Ziran Yang*, Jiawei Li*, Deyuan He*, Zenan Li*, Chi Jin, Venugopal V. Veeravalli, Aarti Gupta, Sanjeev Arora
ICML 2026 (Spotlight); arXiv


Understanding the Sources of Uncertainty for Large Language and Multimodal Models
Ziran Yang, Shibo Hao, Hao Sun, Lai Jiang, Qiyue Gao, Yian Ma, Zhiting Hu
ICLR 2025 Workshop: Quantify Uncertainty and Hallucination in Foundation Models


From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding
Yixiong Fang*, Ziran Yang*, Zhaorun Chen, Zhuokai Zhao, Jiawei Zhou
NeurIPS 2025


Evolving Diverse Red-team Language Models in Multi-round Multi-agent Games
Chengdong Ma*, Ziran Yang*, Hai Ci, Jun Gao, Minquan Gao, Xuehai Pan, Yaodong Yang
arXiv


Panacea: Pareto Alignment via Preference Adaptation for LLMs
Yifan Zhong*, Chengdong Ma*, Xiaoyuan Zhang*, Ziran Yang, Qingfu Zhang, Siyuan Qi, Yaodong Yang
NeurIPS 2024


SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
Josef Dai, Tianle Chen, Xuyao Wang, Ziran Yang, Taiye Chen, Jiaming Ji, Yaodong Yang
NeurIPS 2024 (Datasets and Benchmarks Track)


Offline Reinforcement Learning for LLM Multi-Step Reasoning
Huaijie Wang, Shibo Hao, Hanze Dong, Shenao Zhang, Yilin Bao, Ziran Yang, Yi Wu
ACL 2025; ICLR 2025 Workshop: Reasoning and Planning for LLMs (Oral)




Experience


ByteDance Seed
2025.05 - 2025.08
Research Intern
Worked on RL for tool-using agentic LLMs.
UC San Diego
2024.04 - 2024.11
Research Intern
Advisor: Prof. Zhiting Hu
PAIR Lab: PKU Alignment and Interaction Research Lab
2023.05 - Present
Research Intern
Advisor: Prof. Yaodong Yang
Services

  • Reviewer: NeurIPS 2024, ICLR 2025, AISTATS 2025, ICML 2025, NeurIPS 2025, AAAI 2026.

Selected Awards

  • 2024: Peking University Excellent Undergraduate Research Award
  • 2024: SenseTime Scholarship Nomination Award
  • 2024: Fifth Yuanpei Young Scholar Award
  • 2021: Peking University Freshman Scholarship
  • 2019: Ministry of Education Talent Program: Annual Outstanding Thesis

  • This template is a modification of Jon Barron's website.