ArtificialIntelligenceArticles
For those who have a passion for:
1. #ArtificialIntelligence
2. Machine Learning
3. Deep Learning
4. #DataScience
5. #Neuroscience
6. #ResearchPapers
7. Related Courses and Ebooks
Now, this is something outstanding!😀
Paper-Title: Learning 3D Human Dynamics from Video
#UCB #CVPR_2019
Link to the paper: https://arxiv.org/pdf/1812.01601.pdf
Link to the GitHub repo: https://github.com/akanazawa/human_dynamics
Link to the Project page: https://akanazawa.github.io/human_dynamics/

TL;DR: They propose an end-to-end framework that learns a model of 3D human dynamics and can 1) produce smooth 3D predictions from video and 2) hallucinate 3D dynamics from a single image at test time.
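For intuition, here is a minimal sketch of the two-path idea in the TL;DR: a temporal encoder turns a window of per-frame image features into a "movie" representation used to regress 3D human parameters, while a hallucinator maps a single frame's feature to that same representation so dynamics can be predicted from a static image. All module names, layer sizes, and the parameter count below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

FEAT = 2048   # per-frame image feature size (e.g., from a ResNet); assumed
SMPL = 85     # 3D human parameters (pose + shape + camera); assumed

class TemporalEncoder(nn.Module):
    """1D temporal convolutions over a window of frame features."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(FEAT, 1024, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(1024, 1024, kernel_size=3, padding=1), nn.ReLU(),
        )

    def forward(self, feats):                 # feats: (B, T, FEAT)
        x = feats.transpose(1, 2)             # -> (B, FEAT, T) for Conv1d
        return self.net(x).transpose(1, 2)    # -> (B, T, 1024)

class Hallucinator(nn.Module):
    """Maps one frame's feature to the temporal representation,
    so 3D dynamics can be 'hallucinated' from a static image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FEAT, 1024), nn.ReLU(),
                                 nn.Linear(1024, 1024))

    def forward(self, feat):                  # feat: (B, FEAT)
        return self.net(feat)                 # -> (B, 1024)

# A shared regressor maps the temporal representation to 3D parameters.
regressor = nn.Linear(1024, SMPL)

# Video path: smooth per-frame 3D predictions.
video_feats = torch.randn(2, 16, FEAT)        # 2 clips, 16 frames each
theta_video = regressor(TemporalEncoder()(video_feats))   # (2, 16, SMPL)

# Single-image path: hallucinate the temporal representation.
image_feat = torch.randn(2, FEAT)
theta_image = regressor(Hallucinator()(image_feat))       # (2, SMPL)
```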
#weekend_read
Check out this mind-blowing empirical study of HRL that puts some strong hypotheses to the test. Maybe the best to date!
Paper-Title: Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?
Link to the paper: https://arxiv.org/pdf/1909.10618.pdf
#GoogleAI #UCB
Four Hypotheses: The four hypotheses may also be categorized as hierarchical training (H1 and H3) and hierarchical exploration (H2 and H4).
(H1) Temporally extended training. High-level actions correspond to multiple environment steps. To the high-level agent, episodes are effectively shorter. Thus, rewards are propagated faster and learning should improve.
(H2) Temporally extended exploration. Since high-level actions correspond to multiple environment steps, exploration in the high level maps to environment exploration that is temporally correlated across steps. This way, an HRL agent explores the environment more efficiently. As a motivating example, the distribution associated with a random (Gaussian) walk is wider when the random noise is temporally correlated (see the sketch after the four hypotheses).
(H3) Semantic training. High-level actor and critic networks are trained with respect to semantically meaningful actions. These semantic actions are more correlated with future values, and thus easier to learn, compared to training with respect to the atomic actions of the environment. For example, in a robot navigation task it is easier to learn future values with respect to deltas in x-y coordinates rather than robot joint torques.
(H4) Semantic exploration. Exploration strategies (in the simplest case, random action noise) are applied to semantically meaningful actions and are thus more meaningful than the same strategies would be if applied to the atomic actions of the environment. For example, in a robot navigation task, it intuitively makes more sense to explore at the level of x-y coordinates rather than robot joint torques.
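To make (H2)'s motivating example concrete, here is a quick simulation (my own illustration, not from the paper). Temporal correlation is modeled in the simplest possible way: each Gaussian noise sample is held for c consecutive steps, mimicking a high-level action that lasts c environment steps.

```python
import numpy as np

rng = np.random.default_rng(0)
T, c, trials = 100, 10, 100_000

# Independent noise at every step.
iid = rng.normal(size=(trials, T)).sum(axis=1)

# The same noise sample repeated for c consecutive steps.
held = np.repeat(rng.normal(size=(trials, T // c)), c, axis=1).sum(axis=1)

print(f"iid  std of final position: {iid.std():.1f}")   # ~ sqrt(T)   = 10.0
print(f"held std of final position: {held.std():.1f}")  # ~ sqrt(T*c) ≈ 31.6
```

The held-noise walk ends up about √c times more spread out (√(T·c) vs. √T), which is exactly the wider exploration that H2 attributes to temporally extended high-level actions.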
TL;DR: A number of conclusions can be drawn from the empirical analysis. Here are a few:
In terms of the benefits of training, it is clear that training with respect to semantically meaningful abstract actions (H3) has a negligible effect on the success of HRL.
Moreover, temporally extended training (H1) is only important insofar as it enables the use of multi-step rewards, as opposed to training with respect to temporally extended actions.
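To unpack that point: a high-level agent that acts once every c environment steps naturally trains its critic on a c-step return target, and the paper finds that handing this same multi-step target to a flat agent recovers most of the benefit. A small sketch of the target computation (mine, with illustrative numbers):

```python
import numpy as np

gamma, c = 0.99, 5
rewards = np.array([1.0, 0.0, 0.0, 2.0, 0.5])  # the c rewards accrued
V_next = 10.0                                  # critic estimate of V(s_{t+c})

# c-step target: sum of discounted rewards, then bootstrap at step t+c.
# A standard 1-step target would instead bootstrap off V(s_{t+1}) right
# after the first reward, so reward information propagates backwards
# c times more slowly.
target = sum(gamma**k * r for k, r in enumerate(rewards)) + gamma**c * V_next
print(target)
```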
The main, and arguably most surprising, benefit of hierarchy is due to exploration. This is evidenced by the fact that temporally extended goal-reaching and agent-switching can enable non-hierarchical agents to solve tasks that otherwise can only be solved by hierarchical agents.
These results suggest that the empirical effectiveness of hierarchical agents simply reflects the improved exploration that these agents can attain.
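As an illustration of that takeaway, here is a minimal sketch (my own, loosely based on the paper's description of its "switching ensembles" baseline) of how a completely non-hierarchical agent can still get temporally extended exploration: keep a small ensemble of flat policies, switch the acting one every c steps, and share all collected experience. The environment and network sizes are hypothetical stand-ins with a gym-like interface.

```python
import random
import torch
import torch.nn as nn

OBS, ACT, ENSEMBLE, c = 17, 6, 5, 10   # sizes are illustrative

# An ensemble of ordinary (flat) policies -- no high-level controller.
policies = [nn.Sequential(nn.Linear(OBS, 64), nn.Tanh(),
                          nn.Linear(64, ACT)) for _ in range(ENSEMBLE)]

def collect_episode(env, max_steps=1000):
    """Roll out one episode, switching the acting policy every c steps.

    Holding each randomly chosen policy for c steps yields temporally
    correlated behavior (the H2 effect) without any hierarchy; the
    shared replay lets every ensemble member learn from it.
    """
    replay = []
    obs = env.reset()
    active = random.choice(policies)
    for t in range(max_steps):
        if t % c == 0:                       # temporally extended switch
            active = random.choice(policies)
        with torch.no_grad():
            action = active(torch.as_tensor(obs, dtype=torch.float32))
        obs_next, reward, done, _ = env.step(action.numpy())
        replay.append((obs, action.numpy(), reward, obs_next))
        if done:
            break
        obs = obs_next
    return replay
```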