I am a first-year Ph.D. student in the Department of Electrical and Computer Engineering at Princeton University, advised by Prof. Chi Jin.
Previously, I completed my undergraduate studies at Yuanpei College, Peking University.
I am interested in the intersection of RL and LLMs, especially certifiable reasoning.
I view exploration as the core challenge in RL, and test-time search as necessary to achieve it for LLM agents.
The key technical problem is how search procedures and expert decision-making systems can be internalized as reasoning ability, rather than remaining external scaffolding.
I see two tightly coupled aspects: backfilling expert search behavior into the model through learning, and on-the-fly calibration that lets the model assess uncertainty and decide when to search, explore, or trust its own prediction.
Publications
LeAct: Learning to Reason from Expert Actions
Ziran Yang, Chengshuai Shi, Raj Ghugare, Benjamin Eysenbach, Karthik Narasimhan, Chi Jin
Under review
Distilling certified expert action systems (game solvers, classical planners, theorem provers) into LLM chain-of-thought.
Goedel-Architect: Streamlining Formal Theorem Proving with Blueprint Generation and Refinement
Jui-Hui Chung, Ziyang Cai, Zihao Li, Qishuo Yin, Rohit Agarwal, Simon Park, Rodrigo Porto, Narutatsu Ri, Ziran Yang, Shange Tang, Xingyu Dang, Hongzhou Lin, Mengdi Wang, Danqi Chen, Chi Jin, Liam H Fowl, Sanjeev Arora
AI4MATH@ICML 2026; arXiv
A blueprint-generation and refinement pipeline for efficient Lean theorem proving.