Zhou Yuan - Yau Mathematical Sciences Center, Tsinghua University

Research Areas

Machine learning, operations research & management, theoretical computer science

Education

2009 – 2014 Ph.D. in Computer Science, Carnegie Mellon University

2005 – 2009 B.Eng. in Computer Science, Tsinghua University

Work Experience

2021 – present Associate Professor, Yau Mathematical Sciences Center & Department of Mathematical Sciences, Tsinghua University

2019 – 2021 Assistant Professor, Department of Industrial and Enterprise Systems Engineering, University of Illinois Urbana-Champaign

2016 – 2019 Assistant Professor, Computer Science Department, Indiana University at Bloomington

2014 – 2016 Instructor in Applied Mathematics, Mathematics Department, Massachusetts Institute of Technology

Professional Services

2022 – present Associate Editor, Operations Research Letters

2024 – present Associate Editor, Operations Research

Selected Publications

Note: in papers related to theoretical computer science, theoretical machine learning, and operations research & management, authors are usually listed in alphabetical order; ^#: equal advising.

Optimality of Sample Average Approximation for Data-Driven Newsvendor Problems: A General Optimization Perspective, Jiameng Lyu, Shilin Yuan, Bingkun Zhou, Yuan Zhou, Production and Operations Management, to appear
Learning in Lost-Sales Inventory Systems with Stochastic Lead Times and Random Supplies, Xin Chen, Jiameng Lyu, Shilin Yuan, Yuan Zhou, Management Science, to appear
Technical Note—Fairness-aware Online Price Discrimination with Nonparametric Demand Models, Xi Chen, Jiameng Lyu, Xuan Zhang, Yuan Zhou, Operations Research 74(1), pp. 118–129 (2026)
A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints, Xi Chen, Mo Liu, Yining Wang, Yuan Zhou, Production and Operations Management (2025)
A neural symbolic model for space physics, Jie Ying, Haowei Lin, Chao Yue, Yajie Chen, Chao Xiao, Quanqi Shi, Yitao Liang, Shing-Tung Yau^#, Yuan Zhou^#, Jianzhu Ma^#, Nature Machine Intelligence (2025)
Safety-Polarized and Prioritized Reinforcement Learning, Ke Fan, Jinpeng Zhang, Xuefeng Zhang, Yunze Wu, Jingyu Cao, Yuan Zhou^#, Jianzhu Ma^#, ICML 2025
Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits, Zihan Zhang, Xiangyang Ji, Yuan Zhou, ICLR 2025
A data-driven group retrosynthesis planning model inspired by neurosymbolic programming, Xuefeng Zhang, Haowei Lin, Muhan Zhang, Yuan Zhou^#, Jianzhu Ma^#, Nature Communications 16, 192 (2025)
Bayesian Mechanism Design for Blockchain Transaction Fee Allocation, Xi Chen, David Simchi-Levi, Zishuo Zhao, Yuan Zhou, Operations Research 73(4), pp. 1944–1964 (2025)
A Minibatch-SGD-based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy, Jiameng Lyu, Jinxing Xie, Shilin Yuan, Yuan Zhou, Management Science 71(7), pp. 5572–5588 (2024)
Optimal Policies for Dynamic Pricing and Inventory Control with Nonparametric Censored Demands, Boxiao Chen, Yining Wang, Yuan Zhou, Management Science 70(5), pp. 3362–3380 (2024)
Network Revenue Management with Demand Learning and Fair Resource-Consumption Balancing, Xi Chen, Jiameng Lyu, Yining Wang, Yuan Zhou, Production and Operations Management 33(2), pp. 494–511 (2024)
Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits, Yingkai Li, Yining Wang, Yuan Zhou, IEEE Transactions on Information Theory 70(1), pp. 372–388 (2024); preliminary version appeared in COLT 2019
Robust Situational Reinforcement Learning in face of Context Disturbances, Jinpeng Zhang, Yufeng Zheng, Chuheng Zhang, Li Zhao, Lei Song, Yuan Zhou, Jiang Bian, ICML 2023
Learning Sparse Group Models Through Boolean Relaxation, Yijie Wang, Yuan Zhou, Xiaoqing Huang, Kun Huang, Jie Zhang, Jianzhu Ma, ICLR 2023
Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information, Boxiao Chen, David Simchi-Levi, Yining Wang, Yuan Zhou, Management Science 68(8), pp. 5684–5703 (2022)
Near-optimal Regret Bounds for Multi-batch Reinforcement Learning, Zihan Zhang, Yuhang Jiang, Yuan Zhou, Xiangyang Ji, NeurIPS 2022
Off-policy Reinforcement Learning with Delayed Rewards, Beining Han, Zhizhou Ren, Zuofan Wu, Yuan Zhou, Jian Peng, ICML 2022
Proximal Exploration for Model-guided Protein Sequence Design, Zhizhou Ren, Jiahan Li, Fan Ding, Yuan Zhou, Jianzhu Ma, Jian Peng, ICML 2022
Learning Long-term Reward Redistribution via Randomized Return Decomposition, Zhizhou Ren, Ruihan Guo, Yuan Zhou, Jian Peng, ICLR 2022
Imitation Learning from Observations under Transition Model Disparity, Tanmay Gangwani, Yuan Zhou, Jian Peng, ICLR 2022
Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design, Yufei Ruan, Jiaqi Yang, Yuan Zhou, STOC 2021
Model-free Reinforcement Learning: from Clipped Pseudo-regret to Sample Complexity, Zihan Zhang, Yuan Zhou, Xiangyang Ji, ICML 2021
Optimal Policy for Dynamic Assortment Planning under Multinomial Logit Models, Xi Chen, Yining Wang, Yuan Zhou, Mathematics of Operations Research 46(4), pp. 1639–1657 (2021)
Dynamic Assortment Planning under Nested Logit Models, Xi Chen, Chao Shi, Yining Wang, Yuan Zhou, Production and Operations Management 30(1), pp. 85–102 (January 2021)
Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition, Zihan Zhang, Yuan Zhou, Xiangyang Ji, NeurIPS 2020
Learning Guidance Rewards with Trajectory-space Smoothing, Tanmay Gangwani, Yuan Zhou, Jian Peng, NeurIPS 2020
Dynamic Assortment Optimization with Changing Contextual Information, Xi Chen, Yining Wang, Yuan Zhou, Journal of Machine Learning Research 21(216), pp. 1–44 (2020)
Collaborative Top Distribution Identifications with Limited Interaction, Nikolai Karpov, Qin Zhang, Yuan Zhou, FOCS 2020
Multinomial Logit Bandit with Low Switching Cost, Kefan Dong, Yingkai Li, Qin Zhang, Yuan Zhou, ICML 2020
Root-n-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank, Kefan Dong, Jian Peng, Yining Wang, Yuan Zhou, COLT 2020
Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits, Chao Tao, Qin Zhang, Yuan Zhou, FOCS 2019
Exploration via Hindsight Goal Generation, Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, Jian Peng, NeurIPS 2019
Thresholding Bandit with Optimal Aggregate Regret, Chao Tao, Saúl Blanco, Jian Peng, Yuan Zhou, NeurIPS 2019
Optimal Design of Process Flexibility for General Production Systems, Xi Chen, Tengyu Ma, Jiawei Zhang, Yuan Zhou, Operations Research 67(2), pp. 516–531 (2019)
Off-policy Evaluation and Learning from Logged Bandit Feedback: Error reduction via surrogate policy, Yuan Xie, Boyi Liu, Qiang Liu, Zhaoran Wang, Yuan Zhou, Jian Peng, ICLR 2019
Near-optimal Policies for Dynamic Multinomial Logit Assortment Selection Models, Yining Wang, Xi Chen, Yuan Zhou, NeurIPS 2018
Tight Bounds for Collaborative PAC Learning via Multiplicative Weights, Jiecao Chen, Qin Zhang, Yuan Zhou, NeurIPS 2018
Best Arm Identification in Linear Bandits with Linear Dimension Dependency, Chao Tao, Saúl Blanco, Yuan Zhou, ICML 2018
Adaptive Multiple-arm Identification, Jiecao Chen, Xi Chen, Qin Zhang, Yuan Zhou, ICML 2017
Parameterized Algorithms for Constraint Satisfaction Problems Above Average with Global Cardinality Constraints, Xue Chen, Yuan Zhou, SODA 2017
Satisfiability of Ordering CSPs Above Average Is Fixed-Parameter Tractable, Konstantin Makarychev, Yury Makarychev, Yuan Zhou, FOCS 2015
Optimal Sparse Designs for Process Flexibility via Probabilistic Expanders, Xi Chen, Jiawei Zhang, Yuan Zhou, Operations Research 63(5), pp. 1159–1176 (2015)
Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing, Yuan Zhou, Xi Chen, Jian Li, ICML 2014
Constant Factor Lasserre Gaps for Graph Partitioning Problems, Venkatesan Guruswami, Ali Kemal Sinop, Yuan Zhou, SIAM Journal on Optimization 24(4), pp. 1698–1717 (2014)
Hardness of Robust Graph Isomorphism, Lasserre Gaps, and Asymmetry of Random Graphs, Ryan O’Donnell, John Wright, Chenggang Wu, Yuan Zhou, SODA 2014
Hypercontractive inequalities via SOS, with an application to Vertex-Cover, Manuel Kauers, Ryan O’Donnell, Li-Yang Tan, Yuan Zhou, SODA 2014
Approximability and proof complexity, Ryan O’Donnell, Yuan Zhou, SODA 2013
Hypercontractivity, Sum-of-Squares Proofs, and their Applications, Boaz Barak, Fernando Brandao, Aram Harrow, Jonathan Kelner, David Steurer, Yuan Zhou, STOC 2012
Polynomial integrality gaps for strong SDP relaxations of Densest k-Subgraph, Aditya Bhaskara, Moses Charikar, Venkatesan Guruswami, Aravindan Vijayaraghavan, Yuan Zhou, SODA 2012
Approximation Algorithms and Hardness of the k-Route Cut Problem, Julia Chuzhoy, Yury Makarychev, Aravindan Vijayaraghavan, Yuan Zhou, SODA 2012
Tight Inapproximability Bounds for Almost-satisfiable Horn SAT and Exact Hitting Set, Venkatesan Guruswami, Yuan Zhou, SODA 2011

Zhou Yuan, Associate Professor