Stream Q(λ) - CartPole

Pretrained agent with continual learning (ε=0.01)
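
The ε=0.01 above is most naturally read as an ε-greedy exploration rate: the agent would take a random action about 1% of the time and the highest-valued action otherwise. That reading is an assumption, since the page does not name the exploration scheme. The sketch below illustrates that interpretation only; the function name `epsilon_greedy` and the Q-values are hypothetical and are not taken from the demo's code.

```python
import random


def epsilon_greedy(q_values, epsilon=0.01, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    Assumption: the demo's epsilon is an epsilon-greedy exploration rate.
    """
    if rng.random() < epsilon:
        # Explore: uniform random action index.
        return rng.randrange(len(q_values))
    # Exploit: index of the largest estimated action value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-value estimates for CartPole's two actions
# (index 0 = push left, index 1 = push right).
q = [0.2, 0.7]
rng = random.Random(0)  # seeded so the run is repeatable
actions = [epsilon_greedy(q, 0.01, rng) for _ in range(1000)]
greedy_share = actions.count(1) / len(actions)
print(f"greedy action chosen in {greedy_share:.1%} of steps")
```

With such a small ε the agent behaves almost entirely greedily, while the occasional random action still gives it new experience to keep learning from.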

Try the Swingup Demo - a harder task where the pole starts hanging down | Double Pendulum - a chaotic two-segment swingup
