[Summary] User Response Models to Improve a REINFORCE Recommender System (Minmin Chen, WSDM 2021)

2021. 10. 2. 16:23

Authors: Minmin Chen, Bo Chang, Can Xu, Ed H. Chi
Paper Link : https://dl.acm.org/doi/10.1145/3437963.3441764


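A follow-up paper to Google's earlier RL-based recommendation algorithm, REINFORCE Recommender System (post)
When RL is applied to recommender systems, the state and action dimensions are far larger than in the problems RL has typically dealt with, while reward signals are assigned very sparsely, which leads to a sample efficiency problem
To address this, user response modeling is added as an auxiliary task to improve learning efficiency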
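
To make the idea of an auxiliary task concrete, here is a minimal sketch, not the architecture from the paper. It assumes a single shared state vector, a linear softmax policy head, and a linear head that predicts a binary user response (for example, a click). The names `combined_loss`, `W_policy`, `W_response`, and `aux_weight` are hypothetical and chosen only for illustration. The point it shows is that when the reward is zero, the REINFORCE term contributes nothing, while the response-prediction term still provides a learning signal.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D array of logits.
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def combined_loss(state, action, reward, response,
                  W_policy, W_response, aux_weight=0.5):
    """REINFORCE loss plus an auxiliary user-response loss (illustrative only)."""
    # Policy head: probability of recommending each item given the state.
    probs = softmax(W_policy @ state)
    # REINFORCE term: -reward * log pi(action | state). Zero when reward is zero.
    reinforce = -reward * np.log(probs[action])
    # Auxiliary head: predicted probability that the user responds to the item.
    logit = W_response[action] @ state
    p = 1.0 / (1.0 + np.exp(-logit))
    # Binary cross-entropy against the observed response (0.0 or 1.0).
    aux = -(response * np.log(p) + (1.0 - response) * np.log(1.0 - p))
    return reinforce + aux_weight * aux, reinforce, aux

# Toy example: 3 items, 4-dimensional state, all weights initialised to zero.
state = np.array([0.5, -1.0, 0.2, 0.3])
W_policy = np.zeros((3, 4))
W_response = np.zeros((3, 4))

# A logged interaction with zero reward but an observed response.
total, reinforce, aux = combined_loss(
    state, action=1, reward=0.0, response=1.0,
    W_policy=W_policy, W_response=W_response,
)
print(total, reinforce, aux)
```

In this toy case the REINFORCE term is exactly zero because the reward is zero, so any gradient would come only from the auxiliary term. How the paper actually shares parameters between the two objectives and weights them is not covered in this summary.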

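A/B tests were run on a real live service (it is not named, but judging from the previous paper and the phrase `billions of users`, it is probably YouTube?)
Over a one-month experiment, performance improved by 0.12% over the existing baseline RL algorithm (0.26% for less active users)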

However, performance actually drops when the training window is lengthened, which leads to the conjecture that users' content preferences change quickly

Personal thoughts

The last conjecture can be read as pointing to the importance of sequential recommendation in recommender systems.

