Research Areas

Machine learning theory, operations research & management, theoretical computer science


Education

2009–2014          Ph.D. in Computer Science, Carnegie Mellon University

2005–2009          B.Eng. in Computer Science, Tsinghua University


Work Experience

2021–present      Associate Professor, Yau Mathematical Sciences Center & Department of Mathematical Sciences, Tsinghua University

2019–2021          Assistant Professor, Department of Industrial and Enterprise Systems Engineering, University of Illinois Urbana-Champaign

2016–2019          Assistant Professor, Computer Science Department, Indiana University at Bloomington

2014–2016          Instructor in Applied Mathematics, Mathematics Department, Massachusetts Institute of Technology


Professional Services

2022–present       Associate Editor, Operations Research Letters

2024–present       Associate Editor, Operations Research


Selected Publications

  1. Optimal Policies for Dynamic Pricing and Inventory Control with Nonparametric Censored Demands, Boxiao Chen, Yining Wang, Yuan Zhou, Management Science, 70(5), pp. 3362–3380 (2024)

  2. Network Revenue Management with Demand Learning and Fair Resource-Consumption Balancing, Xi Chen, Jiameng Lyu, Yining Wang, Yuan Zhou, Production and Operations Management (2024)

  3. Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits, Yingkai Li, Yining Wang, Yuan Zhou, IEEE Transactions on Information Theory (2023); preliminary version appeared in COLT 2019

  4. Robust Situational Reinforcement Learning in face of Context Disturbances, Jinpeng Zhang, Yufeng Zheng, Chuheng Zhang, Li Zhao, Lei Song, Yuan Zhou, Jiang Bian, ICML 2023

  5. Learning Sparse Group Models Through Boolean Relaxation, Yijie Wang, Yuan Zhou, Xiaoqing Huang, Kun Huang, Jie Zhang, Jianzhu Ma, ICLR 2023

  6. Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information, Boxiao Chen, David Simchi-Levi, Yining Wang, Yuan Zhou, Management Science, 68(8), pp. 5684–5703 (2022)

  7. Near-optimal Regret Bounds for Multi-batch Reinforcement Learning, Zihan Zhang, Yuhang Jiang, Yuan Zhou, Xiangyang Ji, NeurIPS 2022

  8. Off-policy Reinforcement Learning with Delayed Rewards, Beining Han, Zhizhou Ren, Zuofan Wu, Yuan Zhou, Jian Peng, ICML 2022

  9. Proximal Exploration for Model-guided Protein Sequence Design, Zhizhou Ren, Jiahan Li, Fan Ding, Yuan Zhou, Jianzhu Ma, Jian Peng, ICML 2022

  10. Learning Long-term Reward Redistribution via Randomized Return Decomposition, Zhizhou Ren, Ruihan Guo, Yuan Zhou, Jian Peng, ICLR 2022

  11. Imitation Learning from Observations under Transition Model Disparity, Tanmay Gangwani, Yuan Zhou, Jian Peng, ICLR 2022

  12. Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design, Yufei Ruan, Jiaqi Yang, Yuan Zhou, STOC 2021

  13. Model-free Reinforcement Learning: from Clipped Pseudo-regret to Sample Complexity, Zihan Zhang, Yuan Zhou, Xiangyang Ji, ICML 2021

  14. Optimal Policy for Dynamic Assortment Planning under Multinomial Logit Models, Xi Chen, Yining Wang, Yuan Zhou, Mathematics of Operations Research, 46(4), pp. 1639–1657 (2021)

  15. Dynamic Assortment Planning under Nested Logit Models, Xi Chen, Chao Shi, Yining Wang, Yuan Zhou, Production and Operations Management, 30(1), pp. 85–102 (2021)

  16. Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition, Zihan Zhang, Yuan Zhou, Xiangyang Ji, NeurIPS 2020

  17. Learning Guidance Rewards with Trajectory-space Smoothing, Tanmay Gangwani, Yuan Zhou, Jian Peng, NeurIPS 2020

  18. Dynamic Assortment Optimization with Changing Contextual Information, Xi Chen, Yining Wang, Yuan Zhou, Journal of Machine Learning Research, 21(216), pp. 1–44 (2020)

  19. Collaborative Top Distribution Identifications with Limited Interaction, Nikolai Karpov, Qin Zhang, Yuan Zhou, FOCS 2020

  20. Multinomial Logit Bandit with Low Switching Cost, Kefan Dong, Yingkai Li, Qin Zhang, Yuan Zhou, ICML 2020

  21. Root-n-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank, Kefan Dong, Jian Peng, Yining Wang, Yuan Zhou, COLT 2020

  22. Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits, Chao Tao, Qin Zhang, Yuan Zhou, FOCS 2019

  23. Exploration via Hindsight Goal Generation, Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, Jian Peng, NeurIPS 2019

  24. Thresholding Bandit with Optimal Aggregate Regret, Chao Tao, Saúl Blanco, Jian Peng, Yuan Zhou, NeurIPS 2019

  25. Optimal Design of Process Flexibility for General Production Systems, Xi Chen, Tengyu Ma, Jiawei Zhang, Yuan Zhou, Operations Research, 67(2), pp. 516–531 (2019)

  26. Off-policy Evaluation and Learning from Logged Bandit Feedback: Error reduction via surrogate policy, Yuan Xie, Boyi Liu, Qiang Liu, Zhaoran Wang, Yuan Zhou, Jian Peng, ICLR 2019

  27. Near-optimal Policies for Dynamic Multinomial Logit Assortment Selection Models, Yining Wang, Xi Chen, Yuan Zhou, NeurIPS 2018

  28. Tight Bounds for Collaborative PAC Learning via Multiplicative Weights, Jiecao Chen, Qin Zhang, Yuan Zhou, NeurIPS 2018

  29. Best Arm Identification in Linear Bandits with Linear Dimension Dependency, Chao Tao, Saúl Blanco, Yuan Zhou, ICML 2018

  30. Adaptive Multiple-arm Identification, Jiecao Chen, Xi Chen, Qin Zhang, Yuan Zhou, ICML 2017

  31. Parameterized Algorithms for Constraint Satisfaction Problems Above Average with Global Cardinality Constraints, Xue Chen, Yuan Zhou, SODA 2017

  32. Satisfiability of Ordering CSPs Above Average Is Fixed-Parameter Tractable, Konstantin Makarychev, Yury Makarychev, Yuan Zhou, FOCS 2015

  33. Optimal Sparse Designs for Process Flexibility via Probabilistic Expanders, Xi Chen, Jiawei Zhang, Yuan Zhou, Operations Research, 63(5), pp. 1159–1176 (2015)

  34. Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing, Yuan Zhou, Xi Chen, Jian Li, ICML 2014

  35. Constant Factor Lasserre Gaps for Graph Partitioning Problems, Venkatesan Guruswami, Ali Kemal Sinop, Yuan Zhou, SIAM Journal on Optimization, 24(4), pp. 1698–1717 (2014)

  36. Hardness of Robust Graph Isomorphism, Lasserre Gaps, and Asymmetry of Random Graphs, Ryan O’Donnell, John Wright, Chenggang Wu, Yuan Zhou, SODA 2014

  37. Hypercontractive inequalities via SOS, with an application to Vertex-Cover, Manuel Kauers, Ryan O’Donnell, Li-Yang Tan, Yuan Zhou, SODA 2014

  38. Approximability and proof complexity, Ryan O’Donnell, Yuan Zhou, SODA 2013

  39. Hypercontractivity, Sum-of-Squares Proofs, and their Applications, Boaz Barak, Fernando Brandao, Aram Harrow, Jonathan Kelner, David Steurer, Yuan Zhou, STOC 2012

  40. Polynomial integrality gaps for strong SDP relaxations of Densest k-Subgraph, Aditya Bhaskara, Moses Charikar, Venkatesan Guruswami, Aravindan Vijayaraghavan, Yuan Zhou, SODA 2012

  41. Approximation Algorithms and Hardness of the k-Route Cut Problem, Julia Chuzhoy, Yury Makarychev, Aravindan Vijayaraghavan, Yuan Zhou, SODA 2012

  42. Tight Inapproximability Bounds for Almost-satisfiable Horn SAT and Exact Hitting Set, Venkatesan Guruswami, Yuan Zhou, SODA 2011