I am a Senior Staff Algorithm Engineer at KuaiShou Technology, where I am responsible for business optimization and technical management.
My core research interests are reinforcement learning, large language models, and recommendation.
I lead the application of reinforcement learning to recommendation and advertising (papers and codebase).
More information can be found on Google Scholar and DBLP.
I am hiring students passionate about RL and LLMs. If you are interested, please feel free to contact me. Email: cqpcurry [@] gmail [DOT] com
First Prize of the General track at the NeurIPS 2024 Competition: Auto-Bidding in Large-Scale Auctions [News Link]
First Prize of the AIGB track at the NeurIPS 2024 Competition: Auto-Bidding in Large-Scale Auctions [News Link]
First Prize (Natural Science category) of the 2024 Qian Weichang Chinese Information Processing Science and Technology Award [News Link]
Reinforcement Learning for Short Video Recommender Systems
[pdf]
The 1st Workshop on Large-Scale Video Recommender Systems @ ACM RecSys'23
Reinforcement Learning for Industrial Recommender Systems
[pdf]
DRL4IR@SIGIR2022
Agent-based Information Retrieval Workshop @SIGIR 2024, SIGIR 2025
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2025
TMLR, NeurIPS, ICLR, ICML, KDD, WWW, CIKM, IJCAI, AAAI
LLM-Powered Efficient User Simulator for Recommender System
[pdf]
Zijian Zhang, Shuchang Liu, Ziru Liu, Rui Zhong, Qingpeng Cai*, Xiangyu Zhao, Chunxu Zhang, Qidong Liu, Peng Jiang
AAAI-2025, Oral
DLCRec: A Novel Approach for Managing Diversity in LLM-Based Recommender Systems
[pdf]
Jiaju Chen, Chongming Gao, Shuai Yuan, Shuchang Liu, Qingpeng Cai, Peng Jiang
WSDM-2025
Flow Factorization for Efficient Generative Flow Networks
[pdf]
Jiashun Liu, Chunhui Li, Cheng-Hao Liu, Dianbo Liu, Qingpeng Cai, Ling Pan
AAAI-2025, Oral
State Regularized Policy Optimization on Data with Dynamics Shift
[pdf]
Zhenghai Xue, Qingpeng Cai, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An
NeurIPS-2023
Exploration in policy optimization through multiple paths
Ling Pan, Qingpeng Cai, Longbo Huang
JAAMAS-2021
Softmax Deep Double Deterministic Policy Gradients
[pdf]
Ling Pan, Qingpeng Cai, Longbo Huang
NeurIPS-2020
Reinforcement Learning with Dynamic Boltzmann Softmax Updates
[pdf]
Ling Pan, Qingpeng Cai, Qi Meng, Wei Chen, Longbo Huang
IJCAI-2020
Multi-path Policy Optimization
[pdf]
Ling Pan, Qingpeng Cai, Longbo Huang
AAMAS-2020
(Invited for fast-track publication in JAAMAS, top 5%)
Deterministic Value-Policy Gradients
[pdf]
Qingpeng Cai, Ling Pan, Pingzhong Tang
AAAI-2020
Policy optimization with model-based explorations
Feiyang Pan, Qingpeng Cai, An-Xiang Zeng, Chun-Xiang Pan, Qing Da, Hualin He, Qing He, Pingzhong Tang
AAAI-2019
A Deep Reinforcement Learning Framework for Rebalancing Dockless Bike Sharing Systems
[pdf]
Ling Pan, Qingpeng Cai, Zhixuan Fang, Pingzhong Tang, Longbo Huang
AAAI-2019
Reinforcement Learning Driven Heuristic Optimization
[pdf]
Qingpeng Cai, Will Hang, Azalia Mirhoseini, George Tucker, Jingtao Wang, Wei Wei
DRL4KDD-2019
Policy gradients for contextual recommendations
Feiyang Pan, Qingpeng Cai, Pingzhong Tang, Fuzhen Zhuang, Qing He
WWW-2019
GAS: Generative Auto-bidding with Post-training Search
[pdf]
Yewen Li, Shuai Mao, Jingtong Gao, Nan Jiang, Yunjian Xu, Qingpeng Cai*, Fei Pan, Peng Jiang, Bo An
WWW-2025, Industry Track
AURO: Reinforcement Learning for Adaptive User Retention Optimization in Recommender Systems
Zhenghai Xue, Qingpeng Cai*, Tianyou Zuo, Bin Yang, Lantao Hu, Peng Jiang, Kun Gai, Bo An
WWW-2025
Value Function Decomposition in Markov Recommendation Process
Xiaobei Wang, Shuchang Liu, Qingpeng Cai, Xiang Li, Lantao Hu, Han Li, Guangming Xie
WWW-2025
Modeling User Retention through Generative Flow Networks
[pdf]
Ziru Liu, Shuchang Liu, Bin Yang, Zhenghai Xue, Qingpeng Cai*, Xiangyu Zhao, Zijian Zhang, Lantao Hu, Han Li, Peng Jiang
KDD-2024, industry track
Future Impact Decomposition in Request-level Recommendations
[pdf]
Xiaobei Wang, Shuchang Liu, Xueliang Wang, Qingpeng Cai*, Lantao Hu, Han Li, Peng Jiang, Guangming Xie
KDD-2024, industry track
Sequential Recommendation for Optimizing Both Immediate Feedback and Long-term Retention
[pdf]
Ziru Liu, Shuchang Liu, Zijian Zhang, Qingpeng Cai*, Xiangyu Zhao, Kesen Zhao, Lantao Hu, Peng Jiang, Kun Gai
SIGIR-2024
M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework
[pdf]
Zijian Zhang, Shuchang Liu, Jiaao Yu, Qingpeng Cai, Xiangyu Zhao, Chunxu Zhang, Ziru Liu, Qidong Liu, Hongwei Zhao, Lantao Hu, Peng Jiang, Kun Gai
SIGIR-2024
KuaiSim: A Comprehensive Simulator for Recommender Systems
[pdf]
Kesen Zhao, Shuchang Liu, Qingpeng Cai*, Xiangyu Zhao*, Ziru Liu, Dong Zheng, Peng Jiang, Kun Gai
NeurIPS-2023
ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor
[pdf]
Wanqi Xue, Qingpeng Cai, Ruohan Zhan, Dong Zheng, Peng Jiang, Kun Gai, Bo An
ICLR-2023
PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement
[pdf]
Wanqi Xue, Qingpeng Cai, Zhenghai Xue, Shuo Sun, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An
KDD-2023
Generative Flow Network for Listwise Recommendation
[pdf]
Shuchang Liu, Qingpeng Cai, Zhankui He, Bowen Sun, Julian McAuley, Dong Zheng, Peng Jiang, Kun Gai
KDD-2023
Multi-Task Recommendations with Reinforcement Learning
[pdf]
Ziru Liu, Jiejie Tian, Qingpeng Cai*, Xiangyu Zhao*, Jingtong Gao, Shuchang Liu, Dayou Chen, Tonghao He, Dong Zheng, Peng Jiang and Kun Gai
WWW-2023
Exploration and Regularization of the Latent Action Space in Recommendation
[pdf]
Shuchang Liu, Qingpeng Cai, Bowen Sun, Yuhao Wang, Dong Zheng, Peng Jiang, Kun Gai, Ji Jiang, Xiangyu Zhao and Yongfeng Zhang
WWW-2023
Two-Stage Constrained Actor-Critic for Short Video Recommendation
[pdf]
Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, Peng Jiang and Kun Gai
WWW-2023
Reinforcing User Retention in a Billion Scale Short Video Recommender System
[pdf]
Qingpeng Cai, Shuchang Liu, Xueliang Wang, Tianyou Zuo, Wentao Xie, Bin Yang, Dong Zheng, Peng Jiang and Kun Gai
WWW-2023, industry track
[news link]
Reinforcement Mechanism Design for E-commerce
[pdf]
Qingpeng Cai, Aris Filos-Ratsikas, Pingzhong Tang, Yiwei Zhang
WWW-2018
Reinforcement Mechanism Design for Fraudulent Behaviour in E-commerce
[pdf]
Qingpeng Cai, Aris Filos-Ratsikas, Pingzhong Tang, Yiwei Zhang
AAAI-2018
Ranking Mechanism Design for Price-setting Agents in E-commerce
[pdf]
Qingpeng Cai, Pingzhong Tang, Yulong Zeng
AAMAS-2018
Multi-armed Bandit Mechanism With Private Histories
[pdf]
Chang Liu, Qingpeng Cai, Yukui Zhang
AAMAS-2017 (Extended abstract)
Facility location with Minimax Envy
[pdf]
Qingpeng Cai, Aris Filos-Ratsikas, Pingzhong Tang
IJCAI-2016
Mechanism Design for Personalized Recommender Systems
[pdf]
Qingpeng Cai, Aris Filos-Ratsikas, Chang Liu, Pingzhong Tang
RecSys-2016