[Lecture] Toward a Tractable Solution for Human-in-the-loop Reinforcement Learning: Algorithm and Benchmark (Kimin Lee, SNU AI Summer School 2021) :: AI & Medicine

2021. 9. 25. 20:27

https://www.youtube.com/watch?v=lL-nq8zhi18

To address the reward engineering and reward exploitation problems, the agent learns a reward function that reflects human preferences from sequential queries to a human.
Human-in-the-loop RL
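
As a rough illustration of learning a reward from preference queries, here is a minimal sketch using the Bradley-Terry preference model, which is commonly used in preference-based RL. The linear reward, the synthetic "teacher" that labels pairs, and all function names are assumptions made for this example; this is not the implementation from the talk or the papers.

```python
import math
import random


def segment_return(weights, segment):
    """Sum of predicted rewards r(s) = w . s over every step of a segment."""
    return sum(sum(w * x for w, x in zip(weights, step)) for step in segment)


def preference_prob(weights, seg_a, seg_b):
    """Bradley-Terry probability that seg_a is preferred over seg_b."""
    diff = segment_return(weights, seg_a) - segment_return(weights, seg_b)
    # Numerically stable sigmoid of the return difference.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def fit_reward(queries, dim, lr=0.1, epochs=200):
    """Fit linear reward weights by gradient descent on the cross-entropy loss.

    queries: list of (seg_a, seg_b, label); label is 1.0 if seg_a is preferred.
    """
    weights = [0.0] * dim
    for _ in range(epochs):
        for seg_a, seg_b, label in queries:
            p = preference_prob(weights, seg_a, seg_b)
            for i in range(dim):
                # Feature sums of each segment along dimension i.
                phi_a = sum(step[i] for step in seg_a)
                phi_b = sum(step[i] for step in seg_b)
                # Cross-entropy gradient: (p - label) * (phi_a - phi_b).
                weights[i] -= lr * (p - label) * (phi_a - phi_b)
    return weights


def make_synthetic_queries(n, seed=0, true_weights=(1.0, -1.0), length=3):
    """Hypothetical scripted teacher: prefers the segment with the higher true return."""
    rng = random.Random(seed)
    dim = len(true_weights)

    def random_segment():
        return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(length)]

    queries = []
    for _ in range(n):
        seg_a, seg_b = random_segment(), random_segment()
        ret_a = segment_return(true_weights, seg_a)
        ret_b = segment_return(true_weights, seg_b)
        queries.append((seg_a, seg_b, 1.0 if ret_a > ret_b else 0.0))
    return queries


if __name__ == "__main__":
    learned = fit_reward(make_synthetic_queries(50), dim=2)
    print("learned reward weights:", learned)
```

In an actual human-in-the-loop setup, the labels would come from a person comparing pairs of behavior segments, and the learned reward would then be used to train the RL agent.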

Related papers:

PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training (ICML 2021)
- Paper link: https://arxiv.org/abs/2106.05091
- Site: https://sites.google.com/view/icml21pebble
- Code: https://github.com/pokaxpoka/B_Pref
B-Pref: Benchmarking Preference-Based Reinforcement Learning (NeurIPS 2021, Datasets and Benchmarks Track)
- OpenReview link: https://openreview.net/forum?id=ps95-mkHF_
- Code: https://github.com/pokaxpoka/B_Pref
