Stream Q(λ) - CartPole

Pretrained agent with continual learning (ε=0.01)
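
The page does not define ε. Assuming it is the exploration rate of an ε-greedy policy, the agent would take a random action about 1% of the time and the greedy action otherwise, which keeps a little exploration going while it continues to learn. A minimal sketch under that assumption follows; the function name `epsilon_greedy` and the `q_values` argument are illustrative and are not taken from the demo's code.

```python
import random

def epsilon_greedy(q_values, epsilon=0.01, rng=random):
    """Return an action index: random with probability epsilon, else greedy.

    q_values: sequence of estimated action values for the current state
              (hypothetical input; for CartPole there are two actions).
    epsilon:  assumed exploration rate, 0.01 to match the page's label.
    """
    if rng.random() < epsilon:
        # Explore: choose uniformly among all actions.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

For example, `epsilon_greedy([0.2, 0.7])` returns the greedy action `1` on most calls and occasionally returns a uniformly random action instead.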

Try the Swingup demo - a harder task in which the pole starts hanging down
