Author archive: De Curtò i DíAz
Evolved Policy Gradients (EPG)
Evolved Policy Gradients (EPG) is an experimental meta-learning technique that lets agents rapidly learn to solve novel tasks: https://blog.openai.com/evolved-policy-gradients/
Posted in General