https://123dok.org/document/eqox8lkq-following-newton-direction-policy-gradient-parameter-exploration.html