Snake AI
Egocentric observation with symmetric augmentation
This agent uses an egocentric observation: the grid is rotated so the snake always faces "up", so the network only has to learn one orientation instead of four. Combined with symmetric augmentation (horizontal flips during training), this achieves strong performance on 20x20 Snake.
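As a rough illustration of the symmetric augmentation, the sketch below mirrors an egocentric observation left to right and swaps the "left" and "right" actions to match. It assumes the observation is a list of 2D planes and that actions are indexed 0 = left, 1 = straight, 2 = right; the names `flip_horizontal`, `augment`, and `MIRRORED_ACTION` are hypothetical, and how the flipped samples are mixed into training batches is not stated on this page.

```python
# Assumed action indexing (the page only gives the order left/straight/right).
LEFT, STRAIGHT, RIGHT = 0, 1, 2

# Mirroring the board left-to-right turns a left turn into a right turn.
MIRRORED_ACTION = {LEFT: RIGHT, STRAIGHT: STRAIGHT, RIGHT: LEFT}


def flip_horizontal(planes):
    """Mirror every channel left-to-right (reverse each row)."""
    return [[row[::-1] for row in plane] for plane in planes]


def augment(planes, action):
    """Return the mirrored (observation, action) training pair."""
    return flip_horizontal(planes), MIRRORED_ACTION[action]
```

Because the snake always faces "up" in the egocentric frame, the mirrored sample is itself a valid game state, which is what makes the flip safe to use as extra training data.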
5-Channel Egocentric Observations
The agent sees a 22x22 grid (the 20x20 board plus a one-cell wall border on each side), rotated so the snake always faces "up":
- Channel 0 - Head: 1 at head position
- Channel 1 - Body: 1 at all snake segments
- Channel 2 - Food: 1 at food position
- Channel 3 - Length: Normalized snake length (snake_len / 400, where 400 = 20 x 20 board cells)
- Channel 4 - Walls: 1 at border cells
Because the grid is rotated according to the snake's heading, the same situation looks identical whichever way the snake is travelling, so the agent learns direction-invariant patterns.
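The channel layout and rotation above can be sketched in plain Python as follows. This is a minimal illustration under assumptions, not the demo's actual code: the function names are hypothetical, the body channel is assumed to include the head cell, and the length channel is assumed to be a constant plane filled with snake_len / 400.

```python
BOARD = 20          # playable cells per side
SIZE = BOARD + 2    # 22 once the one-cell wall border is added

# Number of 90-degree counter-clockwise turns that make each heading point "up".
TURNS_CCW = {"up": 0, "right": 1, "down": 2, "left": 3}


def rot90_ccw(plane):
    """Rotate a square 2D list by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*plane)][::-1]


def egocentric_observation(snake, food, direction):
    """snake: list of (row, col) board cells, head first; food: (row, col)."""
    planes = [[[0.0] * SIZE for _ in range(SIZE)] for _ in range(5)]
    head_r, head_c = snake[0]
    planes[0][head_r + 1][head_c + 1] = 1.0            # channel 0: head
    for r, c in snake:                                 # channel 1: body
        planes[1][r + 1][c + 1] = 1.0                  # (assumed to include head)
    planes[2][food[0] + 1][food[1] + 1] = 1.0          # channel 2: food
    length = len(snake) / (BOARD * BOARD)              # channel 3: length / 400
    planes[3] = [[length] * SIZE for _ in range(SIZE)]  # (assumed constant plane)
    for i in range(SIZE):                              # channel 4: walls
        planes[4][0][i] = planes[4][SIZE - 1][i] = 1.0
        planes[4][i][0] = planes[4][i][SIZE - 1] = 1.0
    for _ in range(TURNS_CCW[direction]):              # rotate so heading is "up"
        planes = [rot90_ccw(p) for p in planes]
    return planes


def flatten(planes):
    """Flatten 5 x 22 x 22 planes into a single feature list."""
    return [v for plane in planes for row in plane for v in row]
```

For a snake heading right, for example, the planes are rotated once counter-clockwise, so the segment behind the head ends up directly below it in the rotated grid.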
Network Architecture (2x Scale)
- Algorithm: PPO (Proximal Policy Optimization)
- Input: 5 x 22 x 22 = 2,420 features (flattened)
- Backbone: FC 2048 + FC 1024 + FC 512 + FC 256 (with LayerNorm)
- Policy Head: FC 128 + FC 3 (one output per relative action: left/straight/right)
- Total Parameters: 4.4 million
- Training: 256 parallel environments on Apple Silicon (MPS)
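Since the policy head outputs relative actions, each choice has to be mapped back to an absolute direction on the board before the snake moves. The sketch below shows one way to do that; the index order 0 = left, 1 = straight, 2 = right and the helper names are assumptions for illustration, and picking the largest logit is only one possible way to select an action at play time.

```python
# Absolute headings in clockwise order, so +1 is a right turn and -1 a left turn.
DIRECTIONS = ["up", "right", "down", "left"]

# Assumed indexing of the 3 policy outputs.
LEFT, STRAIGHT, RIGHT = 0, 1, 2


def greedy_action(logits):
    """Pick the index of the largest of the 3 policy logits."""
    return max(range(len(logits)), key=lambda i: logits[i])


def next_direction(current, action):
    """Turn a relative action into the snake's new absolute heading."""
    i = DIRECTIONS.index(current)
    return DIRECTIONS[(i + action - 1) % 4]
```

Because a 180-degree reversal is not among the three outputs, the agent cannot choose to turn straight back into its own neck.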