Research Areas

Machine learning theory, operations management, theoretical computer science


Education

2009–2014          Ph.D. in Computer Science, Carnegie Mellon University

2005–2009          Bachelor's degree in Computer Engineering, Tsinghua University


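Work Experience

2021–present          Associate Professor, Yau Mathematical Sciences Center and Department of Mathematical Sciences, Tsinghua University

2019–2021          Assistant Professor, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign

2016–2019          Assistant Professor, Department of Computer Science, Indiana University

2014–2016          Instructor in Applied Mathematics, Department of Mathematics, Massachusetts Institute of Technology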


Academic Service

2022–present          Associate Editor, Operations Research Letters

2024–present          Associate Editor, Operations Research


Selected Publications

  1. Bayesian Mechanism Design for Blockchain Transaction Fee Allocation, Xi Chen, David Simchi-Levi, Zishuo Zhao, Yuan Zhou, Operations Research, to appear

  2. Fairness-aware Online Price Discrimination with Nonparametric Demand Models, Xi Chen, Jiameng Lyu, Xuan Zhang, Yuan Zhou, Operations Research, to appear

  3. A Minibatch-SGD-based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy, Jiameng Lyu, Jinxing Xie, Shilin Yuan, Yuan Zhou, Management Science, 2024

  4. Optimal Policies for Dynamic Pricing and Inventory Control with Nonparametric Censored Demands, Boxiao Chen, Yining Wang, Yuan Zhou, Management Science, 70(5), pp. 3362–3380 (2024)

  5. Network Revenue Management with Demand Learning and Fair Resource-Consumption Balancing, Xi Chen, Jiameng Lyu, Yining Wang, Yuan Zhou, Production and Operations Management 33(2), pp. 494–511 (2024)

  6. Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits, Yingkai Li, Yining Wang, Yuan Zhou, IEEE Transactions on Information Theory 70(1), pp. 372–388 (2024); preliminary version appeared in COLT 2019

  7. Robust Situational Reinforcement Learning in face of Context Disturbances, Jinpeng Zhang, Yufeng Zheng, Chuheng Zhang, Li Zhao, Lei Song, Yuan Zhou, Jiang Bian, ICML 2023

  8. Learning Sparse Group Models Through Boolean Relaxation, Yijie Wang, Yuan Zhou, Xiaoqing Huang, Kun Huang, Jie Zhang, Jianzhu Ma, ICLR 2023

  9. Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information, Boxiao Chen, David Simchi-Levi, Yining Wang, Yuan Zhou, Management Science, 68(8), pp. 5684–5703 (2022)

  10. Near-optimal Regret Bounds for Multi-batch Reinforcement Learning, Zihan Zhang, Yuhang Jiang, Yuan Zhou, Xiangyang Ji, NeurIPS 2022

  11. Off-policy Reinforcement Learning with Delayed Rewards, Beining Han, Zhizhou Ren, Zuofan Wu, Yuan Zhou, Jian Peng, ICML 2022

  12. Proximal Exploration for Model-guided Protein Sequence Design, Zhizhou Ren, Jiahan Li, Fan Ding, Yuan Zhou, Jianzhu Ma, Jian Peng, ICML 2022

  13. Learning Long-term Reward Redistribution via Randomized Return Decomposition, Zhizhou Ren, Ruihan Guo, Yuan Zhou, Jian Peng, ICLR 2022

  14. Imitation Learning from Observations under Transition Model Disparity, Tanmay Gangwani, Yuan Zhou, Jian Peng, ICLR 2022

  15. Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design, Yufei Ruan, Jiaqi Yang, Yuan Zhou, STOC 2021

  16. Model-free Reinforcement Learning: from Clipped Pseudo-regret to Sample Complexity, Zihan Zhang, Yuan Zhou, Xiangyang Ji, ICML 2021

  17. Optimal Policy for Dynamic Assortment Planning under Multinomial Logit Models, Xi Chen, Yining Wang, Yuan Zhou, Mathematics of Operations Research, 46(4), pp. 1639–1657 (2021)

  18. Dynamic Assortment Planning under Nested Logit Models, Xi Chen, Chao Shi, Yining Wang, Yuan Zhou, Production and Operations Management, 30(1), pp. 85–102 (January 2021)

  19. Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition, Zihan Zhang, Yuan Zhou, Xiangyang Ji, NeurIPS 2020

  20. Learning Guidance Rewards with Trajectory-space Smoothing, Tanmay Gangwani, Yuan Zhou, Jian Peng, NeurIPS 2020

  21. Dynamic Assortment Optimization with Changing Contextual Information, Xi Chen, Yining Wang, Yuan Zhou, Journal of Machine Learning Research, 21(216), pp. 1–44 (2020)

  22. Collaborative Top Distribution Identifications with Limited Interaction, Nikolai Karpov, Qin Zhang, Yuan Zhou, FOCS 2020

  23. Multinomial Logit Bandit with Low Switching Cost, Kefan Dong, Yingkai Li, Qin Zhang, Yuan Zhou, ICML 2020

  24. Root-n-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank, Kefan Dong, Jian Peng, Yining Wang, Yuan Zhou, COLT 2020

  25. Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits, Chao Tao, Qin Zhang, Yuan Zhou, FOCS 2019

  26. Exploration via Hindsight Goal Generation, Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, Jian Peng, NeurIPS 2019

  27. Thresholding Bandit with Optimal Aggregate Regret, Chao Tao, Saúl Blanco, Jian Peng, Yuan Zhou, NeurIPS 2019

  28. Optimal Design of Process Flexibility for General Production Systems, Xi Chen, Tengyu Ma, Jiawei Zhang, Yuan Zhou, Operations Research 67(2), pp. 516–531 (2019)

  29. Off-policy Evaluation and Learning from Logged Bandit Feedback: Error reduction via surrogate policy, Yuan Xie, Boyi Liu, Qiang Liu, Zhaoran Wang, Yuan Zhou, Jian Peng, ICLR 2019

  30. Near-optimal Policies for Dynamic Multinomial Logit Assortment Selection Models, Yining Wang, Xi Chen, Yuan Zhou, NeurIPS 2018

  31. Tight Bounds for Collaborative PAC Learning via Multiplicative Weights, Jiecao Chen, Qin Zhang, Yuan Zhou, NeurIPS 2018

  32. Best Arm Identification in Linear Bandits with Linear Dimension Dependency, Chao Tao, Saúl Blanco, Yuan Zhou, ICML 2018

  33. Adaptive Multiple-arm Identification, Jiecao Chen, Xi Chen, Qin Zhang, Yuan Zhou, ICML 2017

  34. Parameterized Algorithms for Constraint Satisfaction Problems Above Average with Global Cardinality Constraints, Xue Chen, Yuan Zhou, SODA 2017

  35. Satisfiability of Ordering CSPs Above Average Is Fixed-Parameter Tractable, Konstantin Makarychev, Yury Makarychev, Yuan Zhou, FOCS 2015

  36. Optimal Sparse Designs for Process Flexibility via Probabilistic Expanders, Xi Chen, Jiawei Zhang, Yuan Zhou, Operations Research 63(5), pp. 1159–1176 (2015)

  37. Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing, Yuan Zhou, Xi Chen, Jian Li, ICML 2014

  38. Constant Factor Lasserre Gaps for Graph Partitioning Problems, Venkatesan Guruswami, Ali Kemal Sinop, Yuan Zhou, SIAM Journal on Optimization 24(4), pp. 1698–1717 (2014)

  39. Hardness of Robust Graph Isomorphism, Lasserre Gaps, and Asymmetry of Random Graphs, Ryan O’Donnell, John Wright, Chenggang Wu, Yuan Zhou, SODA 2014

  40. Hypercontractive inequalities via SOS, with an application to Vertex-Cover, Manuel Kauers, Ryan O’Donnell, Li-Yang Tan, Yuan Zhou, SODA 2014

  41. Approximability and proof complexity, Ryan O’Donnell, Yuan Zhou, SODA 2013

  42. Hypercontractivity, Sum-of-Squares Proofs, and their Applications, Boaz Barak, Fernando Brandao, Aram Harrow, Jonathan Kelner, David Steurer, Yuan Zhou, STOC 2012

  43. Polynomial integrality gaps for strong SDP relaxations of Densest k-Subgraph, Aditya Bhaskara, Moses Charikar, Venkatesan Guruswami, Aravindan Vijayaraghavan, Yuan Zhou, SODA 2012

  44. Approximation Algorithms and Hardness of the k-Route Cut Problem, Julia Chuzhoy, Yury Makarychev, Aravindan Vijayaraghavan, Yuan Zhou, SODA 2012

  45. Tight Inapproximability Bounds for Almost-satisfiable Horn SAT and Exact Hitting Set, Venkatesan Guruswami, Yuan Zhou, SODA 2011