https://123dok.net/document/yr31829p-learning-optimal-policies-bellman-residual-minimization-fitted-iteration.html