[Summary] From Motor Control to Team Play in Simulated Humanoid Football (DeepMind, arXiv 2021)

2021. 5. 31. 12:09

Authors: Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess
Paper Link : https://arxiv.org/abs/2105.12196v1

Video: https://youtu.be/KHMwq9pv7mg

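In prior RL results, agents usually achieve their goal, but their behavior is unstable in most cases and covers only a narrow range of actions. These are clear limitations, and they are one reason RL is not as widely used as other DL algorithms.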
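Humans, by contrast, act while handling millisecond-scale posture control and relatively long-horizon goals spanning tens of seconds at the same time. On top of that, they interact with the people and environment around them, which makes natural, macro-level behavior possible.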
To overcome the unnatural, myopic behavior that RL produces under simple learning objectives, this paper shows a method for jointly learning behaviors at different levels.
Environment

Learning Framework

Internal Representation
