Notes on devp2p
devp2p is the bread and butter of the network's operation. Like libp2p, devp2p is a set of network protocols. Unlike libp2p, devp2p was built for the Ethereum network, though it could be applied to a few similar use cases as well.
There are many implementations of the protocol, as highlighted here
devp2p has three main pieces:
- The Node Record (What can you tell me about your node?)
- The discovery protocol (How do I find nodes?)
- The transport protocol (How do I send information to other nodes?)
1. The Node Record (What can you tell me about your node?)
- The node record essentially behaves like a node's business card. It tells you its name (nodeId), where you can find it (IPv4 and IPv6 addresses), and its role in the organization (the network).
- There are multiple reasons why it is important to know the details of the node you are connecting to.
- For example, while syncing, an archive node needs to connect to other archive nodes to sync state. It can only do this if it knows the type of node on the other end.
- The structure of the node record can be found here
- The node signs its record and shares it with other nodes to initiate the connection. The process of signing and encoding the node record can be found here
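To make the business card idea concrete, here is a minimal sketch of how the content of a node record could be assembled and RLP-encoded, following the layout in EIP-778 (a sequence number followed by key/value pairs sorted by key, with the encoded record capped at 300 bytes). The helper names `rlp_encode` and `record_content`, the zeroed public key, and the address values are assumptions made for illustration. Signing is left out on purpose, since it needs keccak256 and secp256k1, which are not in the Python standard library.

```python
MAX_RECORD_SIZE = 300  # EIP-778 caps the RLP-encoded record at 300 bytes


def _length_prefix(length, offset):
    # Short payloads carry their length in the prefix byte itself,
    # longer ones add a big-endian length after the prefix byte.
    if length < 56:
        return bytes([offset + length])
    len_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(len_bytes)]) + len_bytes


def rlp_encode(item):
    """Minimal RLP encoder for ints, bytes and (nested) lists."""
    if isinstance(item, int):
        # Integers are big-endian with no leading zeros; 0 becomes b"".
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload


def record_content(seq, pairs):
    """Build [seq, k1, v1, k2, v2, ...] with keys in sorted order."""
    content = [seq]
    for key in sorted(pairs):
        content += [key, pairs[key]]
    return content


# Illustrative values only: the public key is a zeroed placeholder.
pairs = {
    b"udp": 30303,
    b"ip": bytes([127, 0, 0, 1]),
    b"id": b"v4",
    b"secp256k1": b"\x02" + bytes(32),
}
content = record_content(1, pairs)
encoded_content = rlp_encode(content)

# A real node would now sign keccak256(encoded_content) and publish
# [signature] + content as its record; that step is omitted here.
print(len(encoded_content), "bytes of", MAX_RECORD_SIZE, "allowed")
```

Sorting the keys matters because every node has to arrive at the same bytes before checking the signature. The sequence number is bumped whenever the record changes, which lets peers tell a fresh record from a stale one.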
2. The discovery protocol (How do I find nodes?)
- Because routing is decentralized, each node must maintain its own DHT, which stores node records.
- Nodes exchange several types of packets to establish and secure the connection between them. The technical spec for the packets can be found here
- The records are stored in a Kademlia table.
- As messages are exchanged between nodes, the buckets that store node IDs fill up, with each bucket ordered from least-recently seen to most-recently seen.
- These buckets are DoS resistant due to the design of the bucket structure: a full bucket keeps its existing, responsive entries and only evicts the least-recently seen node if that node fails to respond, so an attacker cannot flush the table by flooding it with new node IDs.
- As time progresses, the node's DHT starts to fill up with "nearby" nodes, and the node can easily connect to nodes that are "active".
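The bucket behaviour is easier to follow with a small sketch. The code below shows the two Kademlia ideas the table relies on: "nearby" is measured with XOR distance, and a bucket prefers nodes it already knows over newcomers. The names `log_distance`, `Bucket`, `seen` and `is_alive` are made up for illustration, the node IDs are small integers rather than real hashed public keys, and `is_alive` stands in for a PING/PONG round trip. The bucket size of 16 is the value used by the discovery v4 spec.

```python
K = 16  # bucket size used by discovery v4


def log_distance(a: int, b: int) -> int:
    """Position of the highest bit in which two node IDs differ.

    The result selects the bucket: IDs that share a long prefix
    are "near" and land in a low-numbered bucket.
    """
    return (a ^ b).bit_length()


class Bucket:
    def __init__(self, k: int = K):
        self.k = k
        self.nodes = []  # head = least-recently seen, tail = most-recent

    def seen(self, node_id, is_alive) -> bool:
        """Record activity from node_id; return True if it is in the bucket."""
        if node_id in self.nodes:
            # Known node: move it to the most-recently seen end.
            self.nodes.remove(node_id)
            self.nodes.append(node_id)
            return True
        if len(self.nodes) < self.k:
            self.nodes.append(node_id)
            return True
        # Bucket is full: check the least-recently seen node first.
        oldest = self.nodes[0]
        if is_alive(oldest):
            # The old node answered, so it stays and the newcomer is dropped.
            self.nodes.remove(oldest)
            self.nodes.append(oldest)
            return False
        # The old node is gone, so the newcomer takes its place.
        self.nodes.pop(0)
        self.nodes.append(node_id)
        return True


bucket = Bucket(k=2)
bucket.seen(1, lambda n: True)
bucket.seen(2, lambda n: True)
accepted = bucket.seen(3, lambda n: True)  # bucket full, node 1 still alive
print(accepted, bucket.nodes)
```

Preferring long-lived nodes is what makes the table hard to flood: a newcomer only gets a slot in a full bucket when an existing entry stops responding.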
3. The transport protocol (How do I send information to other nodes?)
- RLPx is used as the transport protocol to facilitate secure communication between nodes.
- Each node holds a private key, which it uses to sign messages that prove its identity, as highlighted here
- Framing is a method of encapsulating the data that is to be communicated. It is primarily used to support multiplexing multiple protocols over one connection. Technical spec here
- Any message sent after the handshake belongs to a "capability", which denotes the functionality the nodes support.
- Multiplexing is done based on the message ID, and each capability is given only as much message ID space as it needs.
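The message ID bookkeeping can be sketched in a few lines. As I read the RLPx spec, IDs 0x00 to 0x0f are reserved for the base "p2p" capability, and the capabilities both peers share are handed consecutive ID ranges from 0x10 onwards in alphabetical order. The function names `assign_offsets` and `route` and the message counts for "eth" and "snap" below are illustrative assumptions, not values taken from a client.

```python
BASE_OFFSET = 0x10  # 0x00-0x0f is reserved for the base "p2p" capability


def assign_offsets(shared_caps):
    """Map each shared capability to the first message ID of its range.

    shared_caps maps a capability name to the number of message IDs it
    needs. Ranges are packed back to back in alphabetical order.
    """
    offsets = {}
    next_id = BASE_OFFSET
    for name in sorted(shared_caps):
        offsets[name] = next_id
        next_id += shared_caps[name]
    return offsets


def route(msg_id, shared_caps):
    """Return (capability, capability-local message ID) for a wire ID."""
    if msg_id < BASE_OFFSET:
        return ("p2p", msg_id)
    for name, start in assign_offsets(shared_caps).items():
        if start <= msg_id < start + shared_caps[name]:
            return (name, msg_id - start)
    raise ValueError(f"message ID {msg_id:#x} is outside every capability range")


# Illustrative message counts for two capabilities both peers support.
caps = {"snap": 8, "eth": 17}
print(assign_offsets(caps))
print(route(0x21, caps))
```

Because both peers compute the same ranges from the capability lists exchanged in the handshake, no extra negotiation is needed to agree on which ID belongs to which protocol.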
Relevant Links
Thanks for reading!