1. Start Local
Explorable publication · graph neural networks
A graph neural network does not classify a node from its own features alone. It repeatedly moves signal across edges, compresses that neighborhood into a new hidden state, and expands what the node can know one hop at a time.
In a node-wise MLP, each example is judged in isolation. In a GNN, a node can also borrow evidence from its neighbors. That makes local structure part of the representation itself.
This publication keeps the math intentionally small: one scalar channel, weighted directed edges, and a few graph motifs that isolate message passing, receptive fields, and oversmoothing.
A node never receives the whole graph in one step. Each layer reads incoming messages, aggregates them, and writes the result back into the node state. Stacking layers repeats the same local rule, which is why depth grows the receptive field.
Figure 1 traces one update in slow motion. Figure 2 fixes the graph and changes depth, so you can see when extra context helps and when repeated averaging starts to erase structure.
At depth 0, each node only carries its own feature vector or scalar state.
Each directed edge scales the sender and pushes that signal toward a neighbor.
The receiver compresses its incoming neighborhood into a single summary.
Stacking layers repeats the same local rule, which increases reach and also increases mixing.
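The four steps above can be sketched as one update on a toy graph. This is a minimal illustration, not the figure's exact implementation: the three-node graph, the edge weights, and the 50/50 blend between a node's own state and its neighborhood mean are all assumptions made for the sketch.

```python
import numpy as np

# Hypothetical toy setup: scalar node states, weighted directed edges,
# mean aggregation of incoming messages (one common choice among many).

# adjacency[u, v] = weight of the directed edge u -> v (0 means no edge)
adjacency = np.array([
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.5],
    [0.7, 0.0, 0.0],
])
state = np.array([1.0, 0.0, -1.0])  # depth-0 states: each node's own scalar

def message_passing_step(adj, h, alpha=0.5):
    """Blend each node's own state with the mean of its incoming messages."""
    n = len(h)
    new_h = np.empty(n)
    for v in range(n):
        senders = np.nonzero(adj[:, v])[0]       # nodes with an edge into v
        if len(senders) == 0:
            new_h[v] = h[v]                      # no neighbors: keep own state
            continue
        messages = adj[senders, v] * h[senders]  # each sender scales its state
        new_h[v] = alpha * h[v] + (1 - alpha) * messages.mean()
    return new_h

print(message_passing_step(adjacency, state))
```

Applying the function again to its own output is exactly what "stacking layers" means: the same local rule, repeated, which is why reach and mixing grow together.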
Figure 1 · interactive scene (panels show the current equation, the selected node, and its incoming messages; edges display their message values).
01. At depth 0 the selected node only knows its own state. This is the MLP-like view.
02. Each sender pushes a weighted message along a directed edge into the selected node.
03. The receiver summarizes those messages into one neighborhood statistic.
04. The next-layer state blends the node's own feature with its neighborhood aggregate.
05. Use more layers only when the evidence sits further away. Each extra layer buys reach, but also mixes the graph more aggressively.
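Steps 02–04 amount to a single scalar update rule. One common form is sketched below; the self-weight α, the mean aggregator, and the nonlinearity φ are illustrative choices, not necessarily the figure's exact equation:

```latex
h_v^{(\ell+1)} \;=\; \phi\!\left( \alpha\, h_v^{(\ell)} \;+\; (1-\alpha)\,\frac{1}{|\mathcal{N}(v)|} \sum_{u \in \mathcal{N}(v)} w_{uv}\, h_u^{(\ell)} \right)
```

Here $\mathcal{N}(v)$ is the set of senders with an edge into $v$, and $w_{uv}$ is the weight on the directed edge $u \to v$: each sender is scaled (step 02), the results are averaged (step 03), and the aggregate is blended with the node's own state (step 04).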
A 0-layer model is just a node-wise classifier: it only sees its own feature x_v. One layer reaches immediate neighbors. Two layers reach neighbors-of-neighbors. Beyond that point, whether more depth helps depends on the graph.
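The depth-to-reach correspondence can be checked directly: after k layers, the nodes that can influence a target are exactly its k-hop in-neighborhood. A small sketch on a hypothetical five-node path graph (the node names and edges are made up for illustration):

```python
from collections import deque

# u -> v means u sends messages to v; a -> b -> c -> d -> e
edges = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}

def k_hop_senders(edges, target, k):
    """All nodes within k reverse hops of `target` (its depth-k receptive field)."""
    incoming = {}
    for u, vs in edges.items():
        for v in vs:
            incoming.setdefault(v, []).append(u)
    seen, frontier = {target}, deque([(target, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for u in incoming.get(node, []):
            if u not in seen:
                seen.add(u)
                frontier.append((u, depth + 1))
    return seen

print(k_hop_senders(edges, "e", 0))  # depth 0: only the node itself
print(k_hop_senders(edges, "e", 2))  # depth 2: neighbors-of-neighbors
```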
In the two-hop witness motif, depth uncovers the useful evidence. In the bridge motif, extra depth gradually pulls both communities toward the same representation.
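The bridge effect can be reproduced numerically. The sketch below assumes a toy bridge motif (two triangles joined by one edge) and plain row-normalized averaging with a self loop; both are illustrative stand-ins, not the figure's exact construction. Repeated averaging shrinks the gap between the two communities' states even though the labels never change.

```python
import numpy as np

# Symmetric adjacency: nodes 0-2 form one triangle, 3-5 the other; 2-3 is the bridge.
A = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

# Row-normalized averaging with a self loop (one simple aggregation choice).
W = A + np.eye(6)
P = W / W.sum(axis=1, keepdims=True)

h = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])  # community labels as states
for depth in [0, 2, 8]:
    state = np.linalg.matrix_power(P, depth) @ h
    gap = state[:3].mean() - state[3:].mean()    # between-community distinction
    print(f"depth {depth}: gap between communities = {gap:.3f}")
```

The gap starts at 2.0 and decays toward 0 as depth grows: the oversmoothing the text describes, isolated in a dozen lines.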
Figure 2 · interactive scene (layer-depth slider: the graph stays fixed while depth changes).
GNNs help when the label of a node genuinely lives in the neighborhood: homophily, motifs, bridges, bottlenecks, or evidence that sits a few hops away. They struggle when the graph is noisy, when locality is the wrong inductive bias, or when too much depth makes node states collapse together.
Depth 0 — Only sees the local feature. Great when the node already contains the answer.
Depth 1 — Reads one-hop context. Useful when nearby neighbors already contain the needed evidence.
Depth 2 — Reaches motifs and relays. Two layers is usually the right call for small-graph tasks.
Depth 3+ — Repeated averaging can flatten the graph. Distinctions shrink even when the answer stays the same.
A GNN is not magic global reasoning. It is a local update rule applied repeatedly to a graph. Every extra layer trades more context for more mixing. The practical question is not “should I use a GNN?” but “does my node label actually live in the neighborhood, and if so, how many hops away?”