Pretrained agent with continual learning (ε=0.01)
Try the Swingup Demo - a harder task where the pole starts hanging down | Double Pendulum - a chaotic two-segment swingup