KIP 311: Permissionless P2P Network Topology

Author: Ian, Lake, Lewis, Ollie
Discussions-To: https://github.com/kaiachain/kips/issues/111
Status: Draft
Type: Standards Track
Created: 2026-04-20
Requires: 286, 290

Abstract

This KIP specifies the P2P network topology and peer-admission rules that Consensus Nodes (CN), Endpoint Nodes (EN), and the bootstrap node (BN) MUST follow after the permissionless hard fork. It defines the target network shape (a CN full mesh, with ENs connecting directly to CNs and all node types bootstrapped through a unified BN), the three-layer protocol stack (Node Discovery, RLPx, Kaia P2P), and the policies for each node role. CN admission is governed by the on-chain validator state defined in KIP-286 and KIP-290: only validators whose AddressBookV2 state lies in a well-defined CNPeers set are accepted as CN peers. The Proxy Node (PN) role is retired: existing PN deployments remain operational for backwards compatibility but MAY be deprecated without notice.

Motivation

In the pre-fork Kaia P2P network, CN addresses are registered to CNBN through a manual process: operators edit configuration and call RPCs such as PutAuthorizedNodes to keep CNBN’s allowlist in sync with the current validator set. ENs do not connect to CNs directly — they reach the consensus network through PNs (Proxy Nodes), which sit between the CN mesh and end-user traffic and are themselves discovered via ENBN. This arrangement works because the validator set is permissioned and rarely changes, so manual curation is tractable.

Permissionless operation changes three things at once, and the P2P layer has to accommodate all three:

  • CNs self-register. Validators join and leave through the on-chain lifecycle defined in KIP-286, with state recorded in the AddressBookV2 contract. Operators no longer curate CNBN’s peer list, so CN admission MUST be derived from live on-chain state and MUST react to state transitions without manual intervention.
  • PN is retired with backwards compatibility. Existing PN deployments remain operational but PN is no longer part of the post-fork topology, and MAY be deprecated without notice. ENs can now connect directly to CNs without going through a PN, which means CNs MUST accept and serve EN peers without degrading consensus latency among themselves.
  • Bootstrap node is no longer role-specific. With PN retired from the post-fork topology and CNs self-registering, the pre-fork split between CNBN (for CNs) and ENBN (for ENs and PNs) loses its purpose. A single unified BN bootstraps every node type.

Without these rules codified at the protocol layer, nodes from different clients cannot interoperably converge on the same mesh after the permissionless hard fork.

Specification

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

Parameters

Connection policy is governed by capacity inputs and target maps. Each target map is indexed as targets[observer type][counterparty type].

Capacity Inputs

  • MaxPhysicalConnections (M): maximum total physical peer connections, inbound and outbound combined.
  • DialRatio (R): ratio used to derive the EN-to-EN dynamic dial target and inbound capacity. If unset or zero, R = 3.
  • maxENToENDialTarget: ⌊M / R⌋ for an EN with dynamic dialing enabled; 0 when dynamic dialing is disabled.

The EN-to-EN dial target preserves the pre-fork DialRatio semantics. It controls how many EN peers an EN actively tries to create through dynamic outbound dialing. Accepted EN peers are still bounded by MaxPhysicalConnections and inbound capacity; there is no separate EN-to-EN per-type cap. CN dial targets are fixed type targets and do not use DialRatio.
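
As a minimal sketch of the derivation above, the EN-to-EN dial target can be computed as follows. The function and parameter names are illustrative and not taken from any client, and treating a negative DialRatio the same as unset is an assumption; the text only covers the unset and zero cases.

```python
def max_en_to_en_dial_target(max_physical_connections: int,
                             dial_ratio: int = 0,
                             dynamic_dialing: bool = True) -> int:
    """Return floor(M / R) for an EN, or 0 when dynamic dialing is disabled."""
    if not dynamic_dialing:
        return 0
    # An unset or zero DialRatio falls back to the default R = 3.
    # Assumption: negative values are handled the same way.
    ratio = dial_ratio if dial_ratio > 0 else 3
    return max_physical_connections // ratio
```

For example, with M = 50 and DialRatio unset, the EN actively dials up to 16 EN peers.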

discoverTargets

Target number of entries in the observer's discovery table per counterparty type. ∞ means no cap.

Observer \ Counterparty   CN    EN    BN
CN                        100   1     3
EN                        100   ∞     3
BN                        100   ∞     3

dialTargets

Target number of outbound peer connections the observer maintains per counterparty type.

Observer \ Counterparty   CN    EN                    BN
CN                        100   1                     -
EN                        2     maxENToENDialTarget   -

peerTargets

Per-type admission caps used to preserve scarce connection slots. ∞ means there is no additional per-type cap beyond MaxPhysicalConnections and inbound capacity.

Observer \ Counterparty   CN    EN    BN
CN                        ∞     3     -
EN                        2     ∞     -

Finite peerTargets protect CNs from excessive EN peers and ENs from excessive CN peers. CN-to-CN and EN-to-EN peers are governed by global physical capacity instead.
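
One way to picture the targets[observer type][counterparty type] indexing is a nested map. The sketch below encodes only the peerTargets entries stated in the text (CN-to-EN = 3, EN-to-CN = 2, no cap for CN-to-CN and EN-to-EN). The dictionary layout and helper names are assumptions made for illustration, and legacy PN is folded into EN as the specification requires elsewhere.

```python
import math

INF = math.inf  # stands in for the "no additional per-type cap" entries

# peerTargets[observer][counterparty]; hypothetical in-memory layout.
PEER_TARGETS = {
    "CN": {"CN": INF, "EN": 3},
    "EN": {"CN": 2, "EN": INF},
}

def normalize(conn_type: str) -> str:
    """Legacy PN is treated equivalently to EN."""
    return "EN" if conn_type == "PN" else conn_type

def peer_target(observer: str, counterparty: str) -> float:
    """Look up the per-type admission cap for an observer/counterparty pair."""
    return PEER_TARGETS[normalize(observer)][normalize(counterparty)]
```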

Terms

  • ℕ: the set of all CN nodes.
  • R(n): node n is registered in AddressBookV2.
  • state(n): assuming R(n), the AddressBookV2 state of node n as defined in KIP-286.
  • CNPeers: { n ∈ ℕ | R(n) ∧ state(n) ∈ { CandReady, CandTesting, ValActive, ValReady, ValPaused } }.
  • nonCNPeers: ℕ \ CNPeers, equivalent to { n ∈ ℕ | ¬R(n) ∨ (R(n) ∧ state(n) ∈ { Registered, ValInactive, ValExiting }) }.
  • ConnType: the peer-type label declared by each side at the Layer 2 RLPx handshake. Post-fork values for new topology: { CN, EN, BN }. PN remains a valid legacy value for backwards compatibility with existing deployments, and is treated equivalently to EN in all post-fork requirements.
  • Discovery ping: UDP ping in the Node Discovery Protocol (as opposed to “RPC call” for on-chain/HTTP queries).
  • Inbound connection: a TCP/RLPx connection initiated by the remote node toward the local node.
  • Outbound connection: a TCP/RLPx connection initiated by the local node toward the remote node.
  • Trusted peer: a peer explicitly configured by the local operator as trusted, for example through a trusted-nodes.json file. Trusted peer configuration grants bidirectional operator trust for inbound and outbound connections.
  • Static peer: a peer explicitly configured by the local operator as static, for example through a static-nodes.json file. Static peer configuration is a local outbound dialing preference and does not grant inbound authorization.
  • MaxPhysicalConnections: the node’s configured maximum total physical peer connections, inbound and outbound combined.
  • DialRatio: the node’s configured ratio used to derive EN-to-EN dynamic dialing and EN inbound capacity. If unset or zero, it defaults to 3.
  • Inbound capacity: the node’s limit on inbound physical peer connections. For ENs and legacy PNs, this is MaxPhysicalConnections - ⌊MaxPhysicalConnections / DialRatio⌋; for CNs and BNs, this is MaxPhysicalConnections.
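
The CNPeers predicate and the inbound-capacity formula from the terms above can be sketched as follows. Representing an unregistered node with `None` and the function names themselves are assumptions for illustration; the state names come from KIP-286 as quoted in this section.

```python
from typing import Optional

# AddressBookV2 states that place a registered node in CNPeers.
CN_PEER_STATES = {"CandReady", "CandTesting", "ValActive", "ValReady", "ValPaused"}

def in_cn_peers(state: Optional[str]) -> bool:
    """state is None when the node is not registered, i.e. R(n) is false."""
    return state is not None and state in CN_PEER_STATES

def inbound_capacity(node_type: str, max_physical: int, dial_ratio: int = 0) -> int:
    """Inbound limit: M - floor(M / R) for EN and legacy PN, M for CN and BN."""
    ratio = dial_ratio if dial_ratio > 0 else 3  # unset or zero defaults to 3
    if node_type in ("EN", "PN"):
        return max_physical - max_physical // ratio
    return max_physical
```

With MaxPhysicalConnections = 50 and the default DialRatio, an EN reserves 16 slots for its own dynamic dials and can accept up to 34 inbound peers, while a CN can accept up to 50.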

The SuspendedSet maintained by AddressBookV2 does NOT affect P2P admission or routing decisions.

Unless explicitly stated otherwise, all requirements below apply at Layer 2 (RLPx transport) of the protocol stack.

Post-HF Network Topology

flowchart LR
    BN[("BN")]
    EN["EN"]
    Cand["Candidate"]
    subgraph CNmesh["CN full mesh (CNPeers)"]
        CN1["CN"]
        CN2["CN"]
        CN3["CN"]
        CN1 --- CN2
        CN2 --- CN3
        CN1 --- CN3
    end
    BN -. bootstrap .-> CN1
    BN -. bootstrap .-> EN
    BN -. bootstrap .-> Cand
    EN --- CN2
    Cand --- CN3

Solid edges are persistent peer connections (CN–CN mesh, direct EN–CN). Dashed edges are bootstrap introductions only; BN does not relay traffic after discovery.

  • BN is the single bootstrap node for all node types, replacing the pre-fork CNBN (for CNs) and ENBN (for ENs and PNs). No per-type bootstrap nodes exist post-fork.
  • The Proxy Node (PN) role is retired at the permissionless hard fork. Existing PN deployments remain operational for backwards compatibility but MAY be deprecated without notice; no new PN requirements apply post-fork.

Protocol Stack

The three-layer protocol stack described below already exists in the pre-fork Kaia client and is not changed by this KIP. It is reproduced here as background so that the requirements in the following sections can refer to specific layers without ambiguity.

Layer     Protocol               Purpose
Layer 1   UDP / Node Discovery   Node advertisement, mutual reachability proof (bonding via PING/PONG), neighbor lookup (FINDNODE/NEIGHBORS).
Layer 2   TCP / RLPx             Encrypted session establishment, NodeId ownership proof via ECDH, peer capability negotiation (protoHandshake).
Layer 3   Kaia P2P               Chain and network identity confirmation via StatusMsg, followed by protocol messages (consensus, block sync, tx propagation).

A single node key (an ECDSA private key over the secp256k1 curve) is used across layers:

  • Layer 1 signs every UDP packet (per-message authentication).
  • Layer 2 authenticates once via ECDH (session authentication, followed by symmetric encryption).

Requirements

R1. Network Topology

After the permissionless hard fork, the target topology is:

  • CNs MUST form a full mesh with all CNPeers except themselves.
  • CNs and ENs MUST connect directly; the PN relay layer is removed from the post-fork topology.
  • ConnType == PN MUST remain accepted for backwards compatibility and MUST be treated equivalently to ConnType == EN.
  • BN is the single bootstrap role for all node types, replacing the pre-fork role-specific bootstrap nodes.

R2. Bootstrap Node

A CN MUST bond with every configured BN address, not just a subset.

A BN:

  • MUST accept UDP ping (bond) requests from any node type.
  • MUST NOT use CNPeers membership or AuthorizedNodes as a discovery admission filter.
  • MUST use random neighbor selection, so that responses are not biased toward an operator-curated subset.
  • MUST enforce a rate limit on discovery pings from unknown NodeIds to prevent amplification and DoS attacks against the BN UDP endpoint.
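
This KIP does not prescribe a rate-limiting algorithm or a sampling routine. As one possible shape, the sketch below uses a single token bucket for pings from unknown NodeIds and a uniform sample for neighbor selection. The class name, the parameters, and the choice of one shared bucket (rather than per-source buckets) are all assumptions.

```python
import random

class UnknownPingLimiter:
    """Token bucket applied to pings whose sender NodeId is not yet known."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

def pick_neighbors(table: list, k: int, rng: random.Random) -> list:
    """Uniform sample without replacement, with no preference for any subset."""
    return rng.sample(table, min(k, len(table)))
```

Passing the clock value and the random generator explicitly keeps the sketch deterministic under test; a real implementation would more likely read a monotonic clock and may need per-source accounting.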

R3. Discovery and Dialing

Nodes MUST apply the discoverTargets and dialTargets parameters defined in the Parameters section.

  • A node MUST maintain at least discoverTargets[observer][X] bonded entries in its discovery table for each counterparty type X where the parameter is defined. When the count falls below target, the node MUST initiate additional Node Discovery lookups.
  • A node MUST maintain up to dialTargets[observer][X] outbound peer connections for each counterparty type X where the parameter is defined. When the outbound count falls below target, the node MUST dial additional peers of that type. For EN-to-EN dialing, the target is derived from MaxPhysicalConnections and DialRatio as defined in Parameters.
  • For CN dynamic outbound dials after the hard fork, CN-typed dial candidates MUST be selected from CNPeers.
  • Static outbound dials are operator-requested dials and are not restricted to CNPeers.
  • Discovery and dialing accounting MUST treat ConnType == PN as ConnType == EN.
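
The dialing rules above reduce to a deficit calculation plus a candidate filter. The following is a sketch under assumed data shapes (NodeIds as strings, CNPeers as a set); the function names are hypothetical.

```python
def dynamic_dials_needed(target: int, outbound: int, dialing: int) -> int:
    """Additional dynamic dials for one counterparty type.

    outbound and dialing count every existing outbound or in-progress peer
    of that type, including exempt (trusted or static) peers.
    """
    return max(0, target - outbound - dialing)

def cn_dial_candidates(discovered: list, cn_peers: set, connected: set) -> list:
    """CN-typed dynamic dial candidates: in CNPeers and not already connected."""
    return [n for n in discovered if n in cn_peers and n not in connected]
```

Static outbound dials would bypass `cn_dial_candidates`, since they are operator-requested and are not restricted to CNPeers.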

R4. CN-CN Admission

CN-CN admission is governed by CNPeers. When admitting a peer connection whose counterparty claims ConnType == CN, a CN MUST validate the counterparty’s AddressBookV2 registration and current CNPeers membership before Layer 3 message processing begins.

Connection class   CNPeers auth
Trusted peer       Exempt
Static outbound    Exempt
Static inbound     Enforced
Dynamic inbound    Enforced
Dynamic outbound   Enforced
EN, BN             N/A

For this table:

  • Enforcement MUST occur at Layer 2 once the remote NodeId, and thus the address derivable from it, is known.
  • A non-exempt node claiming CN without being in CNPeers MUST be rejected.
  • EN and BN connections do not require CNPeers authorization.
  • A static inbound connection is non-exempt unless the peer is also trusted.
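
The admission table can be read as a short decision function. This is an illustrative sketch only: the boolean flags and function names are assumptions, and the CNPeers set is taken as already derived from AddressBookV2.

```python
def cn_peers_auth_applies(claimed_type: str, trusted: bool,
                          static: bool, inbound: bool) -> bool:
    """True when the connection must pass CNPeers authorization."""
    if claimed_type != "CN":
        return False  # EN, BN, and legacy PN: not applicable
    if trusted:
        return False  # trusted peers are exempt in both directions
    if static and not inbound:
        return False  # static outbound is exempt; static inbound is not
    return True

def passes_cn_admission(node_id: str, claimed_type: str, trusted: bool,
                        static: bool, inbound: bool, cn_peers: set) -> bool:
    """Layer 2 check, evaluated once the remote NodeId is known."""
    if not cn_peers_auth_applies(claimed_type, trusted, static, inbound):
        return True
    return node_id in cn_peers
```

A `True` result only means the connection is not rejected by this requirement; the capacity checks in R5 still apply separately.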

R5. Per-Type Peer Budgets

Nodes MUST apply existing physical-capacity checks and finite peerTargets caps. This preserves the pre-fork MaxPhysicalConnections and inbound capacity semantics, and adds per-type caps only where one connection type must be protected from another.

Connection class   MaxPhysicalConnections    Inbound capacity          peerTargets
Trusted peer       Exempt                    Exempt                    Exempt
Static outbound    Exempt                    N/A                       Exempt
Static inbound     Enforced unless trusted   Enforced unless trusted   Enforced if finite
Dynamic inbound    Enforced unless trusted   Enforced unless trusted   Enforced if finite
Dynamic outbound   Enforced                  N/A                       Enforced if finite

For this table:

  • peerTargets is an additional per-type admission cap. ∞ means no per-type cap beyond MaxPhysicalConnections and inbound capacity.
  • A node MUST NOT accept a non-exempt peer connection that would cause its total peer count for type X to exceed a finite peerTargets[observer][X].
  • ConnType == PN MUST be counted as ConnType == EN for peerTargets.
  • Exempt peers MUST still count toward observed peer totals and dial scheduling.
  • A node MUST NOT open additional dynamic outbound peers for type X when existing outbound or dialing peers for X already fill dialTargets[observer][X], including exempt peers.
  • For observability, implementations can log exempt connections with the remote NodeId, derived address, declared ConnType, connection direction, and exemption reason.
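
A sketch of how the three budget checks in the table might combine for one incoming or outgoing connection. The `Budget` type, the argument names, and the way the counts are passed in are assumptions; all counts are taken to include exempt peers, as required above.

```python
import math
from dataclasses import dataclass

@dataclass
class Budget:
    max_physical: int
    inbound_capacity: int
    peer_target: float  # peerTargets[observer][X]; math.inf means no per-type cap

def within_budget(b: Budget, total_peers: int, inbound_peers: int,
                  type_peers: int, trusted: bool, static: bool,
                  inbound: bool) -> bool:
    """Return True if adding one more peer of type X respects every budget."""
    if trusted or (static and not inbound):
        return True  # trusted and static outbound are exempt from all three
    if total_peers + 1 > b.max_physical:
        return False
    if inbound and inbound_peers + 1 > b.inbound_capacity:
        return False  # the inbound capacity check is N/A for outbound dials
    if math.isfinite(b.peer_target) and type_peers + 1 > b.peer_target:
        return False
    return True
```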

R6. CNPeers Reconciliation

CNs MUST reconcile existing CN peer connections when CNPeers changes.

Connection class           State-transition disconnection
Trusted peer               Exempt
Static outbound            Exempt
Static inbound CN peer     Enforced
Dynamic inbound CN peer    Enforced
Dynamic outbound CN peer   Enforced
EN or BN                   N/A

For non-exempt CN peers:

  • A CN MUST disconnect from a CN upon observing that CN’s AddressBookV2 state transition into nonCNPeers.
  • A CN SHOULD initiate graceful disconnection from its CN peers upon observing its own AddressBookV2 state transition into nonCNPeers.
  • Implementations SHOULD attempt a bounded graceful drain before hard-closing to avoid losing in-flight consensus messages.
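
Reconciliation after a CNPeers change amounts to a filter over the current peer list. The sketch below assumes a simple `Peer` record and a freshly computed CNPeers set; both are illustrative, and the graceful drain before closing is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Peer:
    node_id: str
    conn_type: str
    trusted: bool = False
    static: bool = False
    inbound: bool = False

def peers_to_disconnect(peers: list, cn_peers: set) -> list:
    """NodeIds of non-exempt CN peers that are no longer in CNPeers."""
    drop = []
    for p in peers:
        if p.conn_type != "CN":
            continue  # EN and BN peers are not subject to reconciliation
        if p.trusted or (p.static and not p.inbound):
            continue  # trusted and static outbound peers are exempt
        if p.node_id not in cn_peers:
            drop.append(p.node_id)
    return drop
```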

R7. CN-EN Sync Path

Direct CN-EN connectivity is the post-fork sync path for nodes outside the CN mesh.

  • A CN MUST accept UDP ping (bond) requests from ENs.
  • A CN MUST accept peer requests from ENs, subject to R5.
  • A CN in nonCNPeers MUST sync blocks through EN-facing connectivity, since it will be rejected by other CNs under R4.
  • If a CN intends to become a candidate, it MUST be fully synced before transitioning to CandReady.
  • If a CN is in ValInactive, it MUST be fully synced before transitioning to ValReady.
  • The dialTargets[CN][EN] floor enforced by R3 and the peerTargets[CN][EN] budget enforced by R5 ensure block-sync availability across state transitions and network partitions, regardless of the CN’s own validator state.

Changes from Existing Implementation

This KIP changes the existing P2P implementation as follows:

  • R1 Network Topology: CN-CN connectivity changes from the permissioned/static topology to a full mesh over CNPeers; PN is retired from the target topology and treated as EN for compatibility.
  • R2 Bootstrap Node: BN discovery changes from optional AuthorizedNodes filtering to open discovery with unknown-node rate limiting and unbiased neighbor sampling.
  • R3 Discovery and Dialing: Discovery and outbound dial targets become post-fork type targets; dynamic outbound CN-to-CN dials are restricted to CNPeers.
  • R4 CN-CN Admission: CN admission changes from late peer-type validation during Kaia protocol registration to Layer 2 CNPeers authorization before Layer 3 message processing.
  • R5 Per-Type Peer Budgets: The existing MaxPhysicalConnections and inbound capacity checks remain, and finite peerTargets add targeted CN/EN type protection; PN is counted as EN.
  • R6 CNPeers Reconciliation: CNs disconnect non-exempt CN peers that transition into nonCNPeers.
  • R7 CN-EN Sync Path: EN-to-CN connectivity becomes the post-fork sync path for CNs outside CNPeers, replacing reliance on the retired PN layer.

Premises

  1. BN is the sole bootstrap entry point for all node types; the pre-fork CNBN (for CNs) and ENBN (for ENs and PNs) are unified into a single BN.
  2. CN admission is decided directly between CNs, using AddressBookV2 state as the source of truth for CNPeers membership. BN performs no admission filtering.
  3. Dial information (endpoint = IP, UDP port, TCP port) is learned at the network layer via bonding and handshake. AddressBookV2 carries only { NodeId, state }, not endpoint.
  4. The SuspendedSet does NOT affect P2P-layer authorization.
  5. NodeId is immutable. Key rotation requires a full deleteNode() followed by re-onboarding.
  6. |CNPeers| ≤ MaxNodeCount as defined in KIP-286. CN full mesh capacity is provided by MaxPhysicalConnections; peerTargets[CN][CN] = ∞ means there is no additional CN-to-CN per-type cap beyond global physical capacity.

Rationale

Why CNPeers excludes ValInactive and ValExiting

A CN that may participate in consensus at the next epoch must already be peered into the mesh before the transition — bootstrapping into the mesh during the transition would introduce connection latency at the moment consensus needs the node online. This drives CNPeers membership:

  • ValActive, ValPaused: currently participating, or able to resume at any block; stay peered.
  • ValReady: may be promoted to ValActive at the next epoch if it makes the top-50 by stake; must be peered beforehand so the promotion is a no-op at the P2P layer.
  • CandReady, CandTesting: transitioning toward consensus participation (CandReady → CandTesting → possibly ValActive); must be peered so that VRank testing actually exercises consensus.

By contrast, ValInactive and ValExiting cannot reach ValActive at the next epoch from their current state:

  • ValInactive must first transition to ValReady via an explicit user tx; it enters CNPeers only at that point.
  • ValExiting transitions to ValInactive at the next epoch, not to an active state.

Including either in CNPeers would consume CN peer slots and consensus bandwidth without unlocking any participation path.

Why admission lives at the receiving CN via R4

Validator-set membership changes dynamically via AddressBookV2, so a pre-fork-style BN allowlist cannot stay current; admission therefore moves entirely to the receiving CN. BN performs no admission check — it accepts pings from any node type, serves unbiased NEIGHBORS responses, and relies on R2 rate limiting as its only DoS gate. BN establishes no Layer 2 peer connections, so there is no peer-slot budget at BN for admission to protect. R4 enforces CNPeers membership at the CN, scoped to ConnType == CN only: a node claiming EN self-downgrades into a smaller pool with no consensus traffic (peerTargets[CN][EN]), while a node claiming CN must pass CNPeers authorization before consuming consensus-facing capacity. This also preserves the legitimate case of a nonCNPeers CN (e.g., ValInactive) connecting as an EN to sync under R7. Distinguishing the malicious CN-as-EN downgrade from a legitimate R7 egress would require peer-history tracking beyond CNPeers membership, with no compensating security benefit.

Why the existing discovery and dial protocol suffices for CN mesh convergence

A new CN converges into the full mesh within a few minutes using the existing discovery and dial protocol alone. The total time decomposes as:

  • BN bond: one UDP PING/PONG RTT with each configured BN.
  • Discovery walk: iterative FINDNODE/NEIGHBORS rounds against BN and each bonded peer, until the discovery table reaches discoverTargets[CN][CN].
  • Candidate bonding: one PING/PONG RTT per newly learned CN, parallelizable across candidates.
  • Layer 2 dial: TCP connect + RLPx ECDH handshake per CN that passes R4.
  • Layer 3 handshake: StatusMsg exchange per dialed CN.

Each per-peer step is bounded by a single RTT, and |CNPeers| ≤ MaxNodeCount = 100 (Premise 6) bounds the total work. discoverTargets, dialTargets, MaxPhysicalConnections, and finite peerTargets are deliberately sized so that every CN can maintain the full CNPeers mesh under R1 without starving EN slots. discoverTargets[CN][CN] = 100 is chosen well above the realized |CNPeers|: at most 50 ValActive slots (the consensus cap) plus the ValPaused, ValReady, CandReady, and CandTesting slot counts, so the discovery table reaches a complete view of the mesh with ample headroom for transitional churn.
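
The sizing claim can be checked with simple arithmetic: in a full mesh over n CNPeers members, each CN holds n - 1 CN connections and the mesh has n(n - 1)/2 links in total. The helper names below are illustrative.

```python
MAX_NODE_COUNT = 100  # upper bound on |CNPeers| cited from KIP-286 in the text

def per_cn_connections(cn_peers_size: int) -> int:
    """CN connections each member holds in a full mesh (everyone but itself)."""
    return max(0, cn_peers_size - 1)

def full_mesh_links(cn_peers_size: int) -> int:
    """Total undirected links in a full mesh of the given size."""
    return cn_peers_size * (cn_peers_size - 1) // 2
```

At the bound of 100 members, each CN needs 99 CN connections, which fits within dialTargets[CN][CN] = 100, and the mesh as a whole carries 4950 links.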

Backwards Compatibility

This KIP requires the permissionless hard fork. Before the fork block, legacy P2P behavior (static AuthorizedNodes, permissioned CN set, PN role active) continues unchanged. After the fork block:

  • The PN role is retired. Existing PN deployments remain operational for backwards compatibility but MAY be deprecated without notice. All post-fork nodes MUST treat ConnType == PN equivalently to ConnType == EN, so no PN-specific handling is required.
  • CNs MUST admit peers based on CNPeers membership, not a static allowlist.
  • static-nodes.json no longer grants inbound CN authorization after the fork. It only configures local static outbound dials, subject to the static outbound exemptions in R4, R5, and R6.
  • Nodes that do not implement R4 will still interoperate as EN peers, but will be rejected on CN-claimed connections. Non-conforming CNs cannot join the post-fork mesh.

Clients depending on pre-fork P2P assumptions MUST review this document and update their implementations accordingly.

Security Considerations

CN peer-slot exhaustion

The primary sybil threat is a non-CN attempting to consume CN physical capacity and consensus-facing peer slots by claiming ConnType == CN. R4 rejects these attempts before Layer 3 processing, before they can consume consensus-facing capacity beyond a single Layer 2 handshake. R2 rate limiting separately protects the BN discovery endpoint from unknown-node floods.

BN as an attack target

Because BN performs no admission filtering, it accepts discovery pings from any node — the pre-fork AuthorizedNodes allowlist no longer gates traffic. R2 rate limiting mitigates amplification and DoS risk. Operators MUST provision BN with capacity and network-layer protections appropriate to an internet-reachable UDP service without peer-list filtering.

Race between state transition and peer admission

A candidate that calls readyCandidate at block N may be briefly rejected by CNs that have not yet observed block N. R4 does not offer a grace window for this race: introducing one would create a spoof window in which a non-member could transiently pass admission. The sub-second retry cost is accepted instead. Implementations MUST NOT add such a window.

Operator Exemptions

Trusted peers and static outbound peers are deterministic exemptions from R4 admission, R5 budget rejection, and R6 state-transition disconnection. This preserves an operational escape hatch (e.g., during a coordinated recovery) but widens the authorization surface. Static inbound connections deliberately do not receive this exemption, because treating static configuration as an inbound allowlist would reintroduce operator-curated CN admission. For observability, implementations can log exempt connections so operators can detect unintended exemption usage.

References

Copyright and related rights waived via CC0.