Research Areas

Machine learning theory, operations research & management, theoretical computer science


Education

2009–2014          Ph.D. in Computer Science, Carnegie Mellon University

2005–2009          B.Eng. in Computer Science, Tsinghua University


Work Experience

2021–present      Associate Professor, Yau Mathematical Sciences Center & Department of Mathematical Sciences, Tsinghua University

2019–2021          Assistant Professor, Department of Industrial and Enterprise Systems Engineering, University of Illinois Urbana-Champaign

2016–2019          Assistant Professor, Computer Science Department, Indiana University at Bloomington

2014–2016          Instructor in Applied Mathematics, Mathematics Department, Massachusetts Institute of Technology


Professional Services

2022–present       Associate Editor, Operations Research Letters

2024–present       Associate Editor, Operations Research


Selected Publications

   Note: in papers related to theoretical computer science, theoretical machine learning, and operations research & management, authors are usually listed in alphabetical order; #: equal advising.

  1. A data-driven group retrosynthesis planning model inspired by neurosymbolic programming, Xuefeng Zhang, Haowei Lin, Muhan Zhang, Yuan Zhou#, Jianzhu Ma#, Nature Communications 16, 192 (2025)

  2. Bayesian Mechanism Design for Blockchain Transaction Fee Allocation, Xi Chen, David Simchi-Levi, Zishuo Zhao, Yuan Zhou, Operations Research, to appear

  3. Fairness-aware Online Price Discrimination with Nonparametric Demand Models, Xi Chen, Jiameng Lyu, Xuan Zhang, Yuan Zhou, Operations Research, to appear

  4. A Minibatch-SGD-based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy, Jiameng Lyu, Jinxing Xie, Shilin Yuan, Yuan Zhou, Management Science, 2024

  5. Optimal Policies for Dynamic Pricing and Inventory Control with Nonparametric Censored Demands, Boxiao Chen, Yining Wang, Yuan Zhou, Management Science, 70(5), pp. 3362–3380 (2024)

  6. Network Revenue Management with Demand Learning and Fair Resource-Consumption Balancing, Xi Chen, Jiameng Lyu, Yining Wang, Yuan Zhou, Production and Operations Management 33(2), pp. 494–511 (2024)

  7. Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits, Yingkai Li, Yining Wang, Yuan Zhou, IEEE Transactions on Information Theory 70(1), pp. 372–388 (2024); preliminary version appeared in COLT 2019

  8. Robust Situational Reinforcement Learning in face of Context Disturbances, Jinpeng Zhang, Yufeng Zheng, Chuheng Zhang, Li Zhao, Lei Song, Yuan Zhou, Jiang Bian, ICML 2023

  9. Learning Sparse Group Models Through Boolean Relaxation, Yijie Wang, Yuan Zhou, Xiaoqing Huang, Kun Huang, Jie Zhang, Jianzhu Ma, ICLR 2023

  10. Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information, Boxiao Chen, David Simchi-Levi, Yining Wang, Yuan Zhou, Management Science, 68(8), pp. 5684–5703 (2022)

  11. Near-optimal Regret Bounds for Multi-batch Reinforcement Learning, Zihan Zhang, Yuhang Jiang, Yuan Zhou, Xiangyang Ji, NeurIPS 2022

  12. Off-policy Reinforcement Learning with Delayed Rewards, Beining Han, Zhizhou Ren, Zuofan Wu, Yuan Zhou, Jian Peng, ICML 2022

  13. Proximal Exploration for Model-guided Protein Sequence Design, Zhizhou Ren, Jiahan Li, Fan Ding, Yuan Zhou, Jianzhu Ma, Jian Peng, ICML 2022

  14. Learning Long-term Reward Redistribution via Randomized Return Decomposition, Zhizhou Ren, Ruihan Guo, Yuan Zhou, Jian Peng, ICLR 2022

  15. Imitation Learning from Observations under Transition Model Disparity, Tanmay Gangwani, Yuan Zhou, Jian Peng, ICLR 2022

  16. Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design, Yufei Ruan, Jiaqi Yang, Yuan Zhou, STOC 2021

  17. Model-free Reinforcement Learning: from Clipped Pseudo-regret to Sample Complexity, Zihan Zhang, Yuan Zhou, Xiangyang Ji, ICML 2021

  18. Optimal Policy for Dynamic Assortment Planning under Multinomial Logit Models, Xi Chen, Yining Wang, Yuan Zhou, Mathematics of Operations Research, 46–4, pp. 1639–1657 (2021)

  19. Dynamic Assortment Planning under Nested Logit Models, Xi Chen, Chao Shi, Yining Wang, Yuan Zhou, Production and Operations Management, 30–1, pp. 85–102 (January 2021)

  20. Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition, Zihan Zhang, Yuan Zhou, Xiangyang Ji, NeurIPS 2020

  21. Learning Guidance Rewards with Trajectory-space Smoothing, Tanmay Gangwani, Yuan Zhou, Jian Peng, NeurIPS 2020

  22. Dynamic Assortment Optimization with Changing Contextual Information, Xi Chen, Yining Wang, Yuan Zhou, Journal of Machine Learning Research, 21(216), pp. 1–44 (2020)

  23. Collaborative Top Distribution Identifications with Limited Interaction, Nikolai Karpov, Qin Zhang, Yuan Zhou, FOCS 2020

  24. Multinomial Logit Bandit with Low Switching Cost, Kefan Dong, Yingkai Li, Qin Zhang, Yuan Zhou, ICML 2020

  25. Root-n-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank, Kefan Dong, Jian Peng, Yining Wang, Yuan Zhou, COLT 2020

  26. Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits, Chao Tao, Qin Zhang, Yuan Zhou, FOCS 2019

  27. Exploration via Hindsight Goal Generation, Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, Jian Peng, NeurIPS 2019

  28. Thresholding Bandit with Optimal Aggregate Regret, Chao Tao, Saúl Blanco, Jian Peng, and Yuan Zhou, NeurIPS 2019

  29. Optimal Design of Process Flexibility for General Production Systems, Xi Chen, Tengyu Ma, Jiawei Zhang, Yuan Zhou, Operations Research 67–2, pp. 516–531 (2019)

  30. Off-policy Evaluation and Learning from Logged Bandit Feedback: Error reduction via surrogate policy, Yuan Xie, Boyi Liu, Qiang Liu, Zhaoran Wang, Yuan Zhou, Jian Peng, ICLR 2019

  31. Near-optimal Policies for Dynamic Multinomial Logit Assortment Selection Models, Yining Wang, Xi Chen, Yuan Zhou, NeurIPS 2018

  32. Tight Bounds for Collaborative PAC Learning via Multiplicative Weights, Jiecao Chen, Qin Zhang, Yuan Zhou, NeurIPS 2018

  33. Best Arm Identification in Linear Bandits with Linear Dimension Dependency, Chao Tao, Saúl Blanco, and Yuan Zhou, ICML 2018

  34. Adaptive Multiple-arm Identification, Jiecao Chen, Xi Chen, Qin Zhang, Yuan Zhou, ICML 2017

  35. Parameterized Algorithms for Constraint Satisfaction Problems Above Average with Global Cardinality Constraints, Xue Chen, Yuan Zhou, SODA 2017

  36. Satisfiability of Ordering CSPs Above Average Is Fixed-Parameter Tractable, Konstantin Makarychev, Yury Makarychev, Yuan Zhou, FOCS 2015

  37. Optimal Sparse Designs for Process Flexibility via Probabilistic Expanders, Xi Chen, Jiawei Zhang, Yuan Zhou, Operations Research 63–5, pp. 1159–1176 (2015)

  38. Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing, Yuan Zhou, Xi Chen, Jian Li, ICML 2014

  39. Constant Factor Lasserre Gaps for Graph Partitioning Problems, Venkatesan Guruswami, Ali Kemal Sinop, Yuan Zhou, SIAM Journal on Optimization 24–4, pp. 1698–1717 (2014)

  40. Hardness of Robust Graph Isomorphism, Lasserre Gaps, and Asymmetry of Random Graphs, Ryan O’Donnell, John Wright, Chenggang Wu, Yuan Zhou, SODA 2014

  41. Hypercontractive inequalities via SOS, with an application to Vertex-Cover, Manuel Kauers, Ryan O’Donnell, Li-Yang Tan, Yuan Zhou, SODA 2014

  42. Approximability and proof complexity, Ryan O’Donnell, Yuan Zhou, SODA 2013

  43. Hypercontractivity, Sum-of-Squares Proofs, and their Applications, Boaz Barak, Fernando Brandao, Aram Harrow, Jonathan Kelner, David Steurer, Yuan Zhou, STOC 2012

  44. Polynomial integrality gaps for strong SDP relaxations of Densest k-Subgraph, Aditya Bhaskara, Moses Charikar, Venkatesan Guruswami, Aravindan Vijayaraghavan, Yuan Zhou, SODA 2012

  45. Approximation Algorithms and Hardness of the k-Route Cut Problem, Julia Chuzhoy, Yury Makarychev, Aravindan Vijayaraghavan, Yuan Zhou, SODA 2012

  46. Tight Inapproximability Bounds for Almost-satisfiable Horn SAT and Exact Hitting Set, Venkatesan Guruswami, Yuan Zhou, SODA 2011