Snake AI
Egocentric observation with symmetric augmentation
This agent uses egocentric observation: the grid is rotated so the snake always "faces up", which lets the policy choose among three relative moves (left, straight, right) regardless of the snake's absolute heading. Combined with symmetric augmentation, curriculum spawning, and extended credit assignment (gamma=0.999, horizon=256), this achieves strong performance on 20x20 Snake.
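As an illustration of symmetric augmentation, the sketch below mirrors an egocentric observation left-to-right and swaps the left and right actions. It assumes the augmentation is a horizontal mirror and that actions are indexed 0 = left, 1 = straight, 2 = right; neither detail is stated here, and `mirror_sample` is a hypothetical helper name.

```python
# Assumed action indexing (not confirmed by the page): 0=left, 1=straight, 2=right.
MIRRORED_ACTION = {0: 2, 1: 1, 2: 0}


def mirror_sample(obs, action):
    """Return the left-right mirrored observation and the matching action.

    obs is a list of channels, each channel a list of rows.
    Because the snake always faces up, a horizontal flip yields another
    valid game situation in which "left" and "right" trade places.
    """
    flipped = [[row[::-1] for row in channel] for channel in obs]
    return flipped, MIRRORED_ACTION[action]


# Tiny one-channel example: a marker on the left ends up on the right.
example = [[[1, 0, 0],
            [0, 0, 0],
            [0, 0, 2]]]
flipped_obs, flipped_action = mirror_sample(example, 0)
```

Each mirrored sample can be added to the training batch alongside the original, so the policy sees both left-handed and right-handed versions of every situation.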
5-Channel Head-Centered Observations
The agent sees a 39x39 grid (2 x 20 - 1 = 39, so the whole 20x20 board stays in view wherever the head is) centered on the snake's head and rotated so the snake always faces "up":
- Channel 0 - Head: Always at grid center (19, 19)
- Channel 1 - Body: 1 at all snake segments (relative to head)
- Channel 2 - Food: 1 at food position (relative to head)
- Channel 3 - Length: Normalized snake length (snake_len / 400, where 400 = 20 x 20 board cells)
- Channel 4 - Walls: 1 at cells outside the board boundary
Head-centered observation combined with egocentric rotation gives the agent translation and rotation invariance.
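The channel layout above can be sketched in plain Python as follows. This is a minimal illustration, not the agent's actual code: the (row, col) coordinate convention, the heading encoding, the choice to exclude the head from the body channel, and filling the length channel with a constant value are all assumptions, and `build_observation` and `to_egocentric` are hypothetical names.

```python
BOARD = 20
SIZE = 2 * BOARD - 1   # 39
CENTER = SIZE // 2     # 19

# Headings as (d_row, d_col) with rows growing downward.
# Value = number of quarter-turns needed to bring the heading to "up".
TURNS = {(-1, 0): 0, (0, 1): 1, (1, 0): 2, (0, -1): 3}


def to_egocentric(dr, dc, heading):
    """Rotate a head-relative offset so that `heading` points up."""
    for _ in range(TURNS[heading]):
        dr, dc = -dc, dr   # one counter-clockwise quarter-turn on screen
    return dr, dc


def build_observation(snake, food, heading):
    """snake: list of (row, col), head first. Returns 5 x 39 x 39 nested lists."""
    obs = [[[0.0] * SIZE for _ in range(SIZE)] for _ in range(5)]
    head_r, head_c = snake[0]

    def place(channel, r, c):
        dr, dc = to_egocentric(r - head_r, c - head_c, heading)
        obs[channel][CENTER + dr][CENTER + dc] = 1.0

    place(0, head_r, head_c)                 # channel 0: head at the center
    for r, c in snake[1:]:                   # channel 1: body (head excluded, assumed)
        place(1, r, c)
    place(2, food[0], food[1])               # channel 2: food
    frac = len(snake) / (BOARD * BOARD)      # channel 3: snake_len / 400
    obs[3] = [[frac] * SIZE for _ in range(SIZE)]
    # Channel 4: every window cell that lies outside the board.
    for r in range(head_r - CENTER, head_r + CENTER + 1):
        for c in range(head_c - CENTER, head_c + CENTER + 1):
            if not (0 <= r < BOARD and 0 <= c < BOARD):
                place(4, r, c)
    return obs


# Snake moving right; food sits above the head on the board, i.e. to its left.
obs = build_observation([(10, 10), (10, 9), (10, 8)], (5, 10), (0, 1))
```

In this example the body, which trails to the west on the board, appears directly below the center after rotation, and the food appears five cells to the left of the center.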
Network Architecture (2x Scale)
- Algorithm: PPO (Proximal Policy Optimization)
- Input: 5 x 39 x 39 = 7,605 features (flattened)
- Backbone: FC 2048 + FC 1024 + FC 512 + FC 256 (with LayerNorm)
- Policy Head: FC 128 + FC 3 (left/straight/right)
- Total Parameters: 4.4 million
- Training: 256 parallel environments on Apple Silicon (MPS), gamma=0.999, horizon=256, curriculum spawning
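As a sanity check on the sizes listed above, the sketch below counts parameters for a plain fully connected stack with those widths. It assumes every linear layer has a bias, that each backbone layer is followed by a LayerNorm with a learned scale and shift, and that no value head or other component is included, since none is listed; `count_params` is a hypothetical helper.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def count_params(input_dim, backbone, head, layer_norm=True):
    """Count parameters of backbone + head for the given layer widths."""
    total = 0
    prev = input_dim
    for width in backbone:
        total += linear_params(prev, width)
        if layer_norm:
            total += 2 * width   # LayerNorm scale and shift (assumed)
        prev = width
    for width in head:
        total += linear_params(prev, width)
        prev = width
    return total


INPUT_DIM = 5 * 39 * 39                       # 7,605 flattened features
first_layer = linear_params(INPUT_DIM, 2048)  # input -> FC 2048
total = count_params(INPUT_DIM, [2048, 1024, 512, 256], [128, 3])
print(INPUT_DIM, first_layer, total)
```

Under these assumptions the first layer alone holds about 15.6 million parameters and the full stack about 18.4 million, so the 4.4 million total listed above presumably reflects a different layout than this straightforward reading of the layer widths.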