Random Linear Network Coding For Scalable BlockDAG

About Me
I’m Gordon Murray, an independent researcher in the Kaspa community. My work focuses on improving Kaspa’s networking layer, including publishing a whitepaper on RLNC for BlockDAG propagation (LinkedIn link). This post is a condensed version of that research, with additional ideas about enabling Kaspa to run over wireless mesh networks. The whitepaper was highlighted on my LinkedIn by Gilad Aviram, who encouraged me to share it here for discussion at the request of Michael Sutton. I should also note that I’ve been inspired by the pioneering work of Muriel Médard on network coding and by the Optimum P2P project, which demonstrated RLNC’s real-world benefits in Ethereum.

Proposal: Applying Random Linear Network Coding (RLNC) to Kaspa’s P2P Layer (Including Wireless Mesh Networks)


Executive Summary
Kaspa’s BlockDAG currently achieves ~10 blocks per second (BPS) and aims for 100+ BPS. While consensus (GHOSTDAG) is efficient, the network layer remains a bottleneck:

• Redundant gossip floods overlapping transactions.
• Multiple block bodies per second stress bandwidth.
• Global latency skew reduces blue-set inclusion fairness.

I propose integrating Random Linear Network Coding (RLNC) into Kaspa’s P2P layer. RLNC optimizes propagation by transmitting linear combinations of blocks/transactions, so nearly every received shard is innovative.

This directly benefits:
• Block propagation (reduces redundancy)
• IBD (Initial Block Download) (faster sync, fewer stalls)
• Mempool sync (efficient transaction relay)
• And critically, enables Kaspa to run over wireless mesh networks for maximum decentralization.


Layer Separation: RLNC is Consensus-Agnostic

┌─────────────────────────────────────┐
│ Application Layer                   │
├─────────────────────────────────────┤
│ Consensus Layer                     │
│ (Kaspa PoW: Mining + GHOSTDAG)      │
├─────────────────────────────────────┤
│ Network / P2P Layer                 │
│ ← RLNC OPERATES HERE →              │
├─────────────────────────────────────┤
│ Transport Layer                     │
└─────────────────────────────────────┘

RLNC optimizes the network layer. Consensus remains untouched: whether PoW or PoS, blocks must still propagate efficiently.


Current P2P Implementation (Rusty-Kaspa)
In kaspad/src/p2p/flows:
• Block relay → flows/block/handle_incoming.rs
• Transaction relay → flows/transaction/relay.rs
• Initial Block Download → flows/ibd

All rely on gossip-based flooding: every block or transaction is propagated individually. This ensures delivery, but at high BPS causes:

  1. Redundant transmission of overlapping transactions.
  2. Bandwidth stress from concurrent block bodies.
  3. Latency skew → slower blocks earn less blue score, reducing fairness.

RLNC Benefits for Kaspa

1. Block Propagation
• Encode multiple concurrent blocks into shards.
• Peers forward different shards → receivers decode originals once rank is full.
• Reduces redundant gossip.

2. IBD (Initial Block Download)
• Sync peers send coded shards over block ranges.
• New nodes don’t stall on missing blocks.
• Faster historical sync.

3. Transaction Relay
• Batch encode mempool transactions.
• Even if wireless peers miss shards, decoding succeeds once enough arrive.


Kaspa over Wireless Mesh Networks

Why RLNC is Ideal for Mesh
Wireless mesh networks are lossy, variable-latency, and retransmission-expensive. RLNC was originally developed for these conditions (multicast, satellite, ad-hoc).

Mesh-Specific Benefits
• Multicast efficiency → one radio transmission helps multiple neighbors.
• Partition healing → shards rapidly fill in missing data after splits.
• Resilience → innovative shards survive packet loss.
• Energy efficiency → fewer retransmissions save battery-powered nodes.

To our knowledge, no high-throughput PoW BlockDAG has ever been adapted to wireless mesh. Prior experiments used Hyperledger Fabric or PoA Ethereum with low block rates. Kaspa + RLNC would be the first truly high-rate permissionless blockchain optimized for mesh networks.


Example: RLNC in Rusty-Kaspa

// protocol/src/messages/rlnc.rs
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct RlncShard {
    pub coefficients: Vec<u8>,  // Random GF(2^8) coefficients
    pub payload: Vec<u8>,       // Encoded block/tx data
    pub commitments: Vec<Hash>, // Merkle roots for integrity
}

Extend block relay (flows/block):

fn propagate_blocks_rlnc(blocks: Vec<Block>, peers: &[Peer], redundancy_factor: f64) {
    // Encode the concurrent blocks into coded shards (one per peer, plus overcode).
    let shards = rlnc::encode(blocks, redundancy_factor);
    for (peer, shard) in peers.iter().zip(shards) {
        send_message(peer, Message::RlncShard(shard));
    }
}

On receiving side:

fn handle_rlnc_shard(decoder: &mut RlncDecoder, shard: RlncShard) {
    // Keep only innovative shards; decode once rank is full.
    decoder.add_shard(shard);
    if decoder.is_ready() {
        let blocks = decoder.decode();
        for block in blocks {
            validate_and_insert(block); // consensus unchanged
        }
    }
}

Quantitative Outlook (Whitepaper Results)
Formal model: RLNC improves finality when

3(1−η)τ > δ

where:
• η = residual bandwidth fraction after coding (1−η is the savings)
• τ = network mixing time
• δ = decode overhead

Example from whitepaper:
• With W=10 blocks, ρ=0.6 overlap → η ≈ 0.46 (54% savings).
• τ ≈ 80ms → RLNC can tolerate δ up to ~130ms.
• Optimized decoding achieves δ < 5ms.
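
Worked check with those numbers: 3(1−η)τ = 3 × (1 − 0.46) × 80 ms ≈ 130 ms, comfortably above the measured decode overhead of δ < 5 ms, so the inequality holds with a wide margin.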

Thus RLNC yields 40–60% bandwidth savings and reduced ordering lag, improving blue-set fairness.


Deployment Strategy

  1. Phase I: Systematic RLNC for block bodies (send some uncoded chunks + coded shards).
  2. Phase II: Transaction batch gossip.
  3. Phase III: Multi-tip coding across sliding windows.
  4. Phase IV: Test with simpa under lossy/mesh conditions.
  5. Phase V: Opt-in rollout (--enable-rlnc).

This roadmap is backward-compatible and can begin in private relay overlays before public deployment.
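
For Phase I, a minimal sketch of such a systematic schedule is given below; the BodyMessage enum and the make_coded_shard closure are hypothetical placeholders for illustration, not existing rusty-kaspa APIs:

// Sketch of a Phase I "systematic" schedule: every original chunk goes out
// uncoded once, followed by a small number of coded repair shards.
enum BodyMessage {
    PlainChunk { index: usize, bytes: Vec<u8> },
    CodedShard { seed: u32, bytes: Vec<u8> },
}

fn systematic_schedule<F>(
    chunks: &[Vec<u8>],      // the K source chunks of the block bodies
    redundancy: f64,         // e.g. 0.05..0.10 overcode
    mut make_coded_shard: F, // seed -> coded payload over the chunks (assumed GF(256) encoder)
) -> Vec<BodyMessage>
where
    F: FnMut(u32, &[Vec<u8>]) -> Vec<u8>,
{
    let mut out = Vec::new();
    // Systematic phase: send the original chunks as-is.
    for (index, bytes) in chunks.iter().enumerate() {
        out.push(BodyMessage::PlainChunk { index, bytes: bytes.clone() });
    }
    // Repair phase: a small coded cushion covers losses without retransmission rounds.
    let repairs = ((chunks.len() as f64) * redundancy).ceil() as u32;
    for seed in 0..repairs {
        out.push(BodyMessage::CodedShard { seed, bytes: make_coded_shard(seed, chunks) });
    }
    out
}

The intent of the systematic phase is that on clean links receivers finish from the uncoded chunks alone; the coded repair shards only matter when chunks are lost or arrive late.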


Conclusion
RLNC directly addresses Kaspa’s network bottlenecks:
• Reduces redundant gossip traffic.
• Speeds up IBD and transaction relay.
• Improves blue-set inclusion fairness at high BPS.
• Enables Kaspa to run over wireless mesh networks, maximizing decentralization.

Bottom Line: RLNC is not just a bandwidth optimization — it is a novel enabler that could make Kaspa the first high-throughput PoW BlockDAG blockchain to run efficiently on mesh networks.


Call to Action
I invite Kaspa devs and researchers to:

  1. Review the proposed RlncShard message type and relay flow.
  2. Use simpa with lossy network parameters to emulate mesh conditions.
  3. Discuss feasibility, trade-offs, and next steps.

With RLNC, Kaspa can scale not just in datacenters but across community-built mesh deployments, making it censorship-resistant, resilient, and globally distributed.


Apologies for a belated response.

It’s great to see such an initiative.

I realize part of this is background that I may be missing - but I wish for a more thorough explanation of the proposed scheme. For example: what exactly are the packets? Are they random sections of data, or do they correspond to something concrete (transactions/header hashes)?

Ignoring mempool, redundancy on Kaspa blockdag proper has two main sources:

  1. a transaction is often shared by several “parallel” blocks
  2. in the header, parent hashes (direct and indirect) are shared and repeated by many, many blocks.

It has not been clear enough to me whether this is what we are trying to optimize or something else. If that is it, please help us understand in detail what the reconstruction of a block looks like.

On practical terms - I believe the most substantial place where something like this could be employed is during IBD.

Could a malicious actor perform a DoS attack by crafting blocks with intentionally flawed encodings?

For example, by generating shreds where the linear combinations are dependent or designed such that no shred (or combination of shreds) includes/reveals a specific critical piece of the block data.

This could cause nodes across the network to receive and store “almost complete” blocks in memory indefinitely while waiting for sufficient independent shreds to decode, but never actually succeeding. Repeating this for multiple blocks could exhaust memory resources network-wide.
Is this feasible under the proposed RLNC scheme? What mitigations could prevent it?

FreshAir08, thanks for reading my post.

A Kaspa block is conceptually similar to a Bitcoin block, but adapted for parallel DAG consensus. Each block encapsulates a set of transactions and references multiple parents instead of a single previous block. This enables high-throughput, low-latency block production while maintaining eventual ordering.

Basic block structure
Each block consists of:

Header core — version, timestamp, nonce, difficulty, and Merkle roots for transactions and accepted parents.
Parent hashes — a list of one or more parent block references (Kaspa typically 1–3).
Transaction list — user transactions selected by the miner.
Blue score / selected parent hash — values used in consensus to define ordering and cumulative work.
Subnetwork data (optional) — extra fields reserved for extensions like proof-of-stake, virtual programs, or L2 metadata.

In simplified pseudocode:

Block {
    Header {
        Version
        Timestamp
        Difficulty
        MerkleRootTx
        MerkleRootParents
        SelectedParentHash
        BlueScore
        Nonce
    }
    Parents[] = { hashA, hashB, ... }
    Transactions[] = { tx1, tx2, ... }
}

How it’s propagated
Unlike Bitcoin’s strictly linear chain, Kaspa nodes can receive and validate multiple blocks concurrently. Each block declares its set of parents, allowing it to “attach” to the DAG even before all tips are fully synced. This makes parallel IBD and asynchronous validation possible.

Relation to RLNC-based IBD
Because a Kaspa block’s payload (header + parents + txs) can be partially redundant with others in the same DAG layer, it’s a good candidate for coded propagation. My recent simulation demonstrates how Random Linear Network Coding (RLNC) could exploit this redundancy for faster and more resilient IBD.

Experiment repo: https://github.com/gordonmurray-coding/rlnc-kaspa-ibd-demo

The model factors the block payload into:

  • Header core (constant-size bytes)
  • Parent-hash items (shared among overlapping tips)
  • Transaction bytes (with configurable overlap probability)

By treating each of these as linear symbols over GF(2) or GF(256), RLNC achieves up to 40–50 % byte savings compared to naïve full-block transfer, while remaining loss-tolerant under high packet drop rates.
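
As a rough illustration of that factoring (with hypothetical types rather than actual rusty-kaspa structures), the deduplication step that turns a window of tips into unique source symbols could look like this:

use std::collections::HashSet;

type Hash = [u8; 32]; // placeholder for Kaspa's hash type

struct BlockPayload {
    header_core: Vec<u8>,    // constant-size header bytes
    parent_hashes: Vec<Hash>, // shared among overlapping tips
    tx_bytes: Vec<Vec<u8>>,   // serialized transactions, with overlap across blocks
}

/// Collect the unique source symbols across a window of tips: parent hashes and
/// transactions duplicated across blocks collapse to a single symbol each.
fn window_symbols(window: &[BlockPayload]) -> Vec<Vec<u8>> {
    let mut seen_parents: HashSet<Hash> = HashSet::new();
    let mut seen_txs: HashSet<Vec<u8>> = HashSet::new();
    let mut symbols = Vec::new();
    for block in window {
        symbols.push(block.header_core.clone());
        for parent in &block.parent_hashes {
            if seen_parents.insert(*parent) {
                symbols.push(parent.to_vec());
            }
        }
        for tx in &block.tx_bytes {
            if seen_txs.insert(tx.clone()) {
                symbols.push(tx.clone());
            }
        }
    }
    symbols
}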

Key takeaway
Kaspa blocks are composable objects (header, parent references, and transactions) that form the vertices of a global DAG. Their partial redundancy between tips makes Kaspa a natural candidate for coded networking approaches like RLNC, which can reduce bandwidth and improve IBD convergence even under lossy conditions.

This is in reply to Maxim’s question about DoS. I’ve created a MATLAB-based simulation to examine this issue: whether dependent or malformed coded packets could realistically cause a DoS-style stall in an RLNC-based IBD process.

The model emulates a Kaspa-like blockDAG with configurable:

  • Level width and parent fan-in
  • Sliding-window IBD with caching
  • Per-peer bandwidth, RTT, and packet loss
  • GF(2) and GF(256) fields
  • Rateless RLNC with feedback and repair batches

Each window compares three strategies:

  1. Full-block transfer (no deduplication)
  2. Unique pull (ideal de-dup baseline)
  3. RLNC coded download (loss-aware, adaptive)

Result summary (20 Mbps / 5 peers / 20 % loss):

  • RLNC reduced bytes by ≈ 40–50 % vs full-block IBD
  • Time savings ≈ 30–40 %
  • All windows reached full rank = K before timeout
  • GF(256) more stable under loss

[screenshot: demo console output]

DoS feasibility
A malicious node could indeed craft dependent combinations to stall decoding, but this can be bounded and detected:

  • Rank-growth monitoring: reject peers whose packets stop increasing rank
  • Per-window repair limits: cap accepted coded packets to ≤ K (1 + ε + margin)
  • Deterministic coefficient seeds: tie each packet’s coefficients to its hash/seed for verification
  • Rank-feedback gossip: allow peers to cross-validate decoding progress without exposing payloads

These mitigations keep memory bounded and prevent indefinite waiting on undecodable shards.
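
To make those bounds concrete, here is a minimal per-peer accounting sketch; the thresholds, names, and policy are hypothetical illustrations, not existing rusty-kaspa behavior:

#[derive(Default)]
struct PeerCodingStats {
    received: u64,
    innovative: u64,
}

impl PeerCodingStats {
    fn record(&mut self, was_innovative: bool) {
        self.received += 1;
        if was_innovative {
            self.innovative += 1;
        }
    }

    /// Innovation ratio = innovative / received. A persistently low ratio means
    /// the peer is streaming dependent (useless or adversarial) packets.
    fn innovation_ratio(&self) -> f64 {
        if self.received == 0 {
            1.0
        } else {
            self.innovative as f64 / self.received as f64
        }
    }
}

/// Per-window cap: accept at most K * (1 + epsilon + margin) coded packets,
/// so memory and decode work per window are strictly bounded.
fn window_packet_budget(k: usize, epsilon: f64, margin: f64) -> usize {
    ((k as f64) * (1.0 + epsilon + margin)).ceil() as usize
}

/// Example policy: deprioritize or ban peers whose innovation ratio drops
/// below a threshold after enough samples.
fn should_deprioritize(stats: &PeerCodingStats, min_ratio: f64, min_samples: u64) -> bool {
    stats.received >= min_samples && stats.innovation_ratio() < min_ratio
}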

Next steps

  • Integrate rank-aware feedback into the simulation
  • Quantify computational cost of rank checks in live node context
  • Explore coded relay / partial sub-DAG propagation
  • Integrate with rusty-kaspa-simpa for modeling & simulation

Would appreciate feedback on practical validation overheads or other mitigation primitives that could coexist with Kaspa’s concurrency model.

We can skip the Kaspa primers - for the record, parent hashes are not only direct parents but also indirect parents, which is why it’s not “1–3” (that too, I think, is more like 7 since Crescendo); rather, it’s several hundred hashes, and this is a major part of the redundancy.

But let’s just focus on block data (transactions) - assume for now that header data is propagated as usual.

The part I’m unfamiliar with is RLNC - I tried reading up on it, but I’m not sure how it works. It sounded to me like something that is good for lossy scenarios, not so much for inherent redundancy. I could be wrong - but I need you to explain it.

Please get into and explain (not code!) the scheme you are suggesting in practice, to the highest degree of detail, as well as what it aims to solve.

What RLNC is in plain English for transactions:
RLNC turns a large, overlapping set of missing transaction bytes into a stream where any useful packet helps until you finish. Instead of asking peers for specific transactions which risks duplicates, stalls, and repeated request rounds, peers send small coded shreds that are linear combinations of the chunks you are missing. Your node keeps only shreds that add new information and drops the rest as soon as they arrive. Once you collect enough innovative shreds, you recover all original chunks at once and reassemble the transactions.

Why this helps Kaspa even when links are not very lossy:
It is primarily about coordination and duplication across many peers and many parallel blocks, not only about packet loss.

  • Cross block duplication becomes a single download. If the same transaction appears in multiple parallel blocks, RLNC treats it as one source item inside a small sliding window. You fetch it once and it satisfies all those blocks.
  • Any useful packet means no per tx choreography. With many peers, classic pulls waste bandwidth on duplicates and bookkeeping about who has which transaction. Under RLNC, whichever peer sends a coded shred, it is very likely useful until you finish.
  • Tail removal and endgame elimination. The last one to five percent of transactions often dominate wall clock time. RLNC adds a small planned overcode, typically five to ten percent, so you finish without hunting for specific transactions.
  • Partial peers still help. Even if a peer has only some of the window, its shreds are still useful. RLNC harvests partial availability automatically.

What exactly are the packets, for transactions only:

  • Build a sliding window of recent levels for IBD or catch up, for example four to eight levels.
  • Compute the node’s unique set of missing transactions across that window. Duplicates across blocks collapse to one.
  • Split each transaction into fixed size chunks, for example 512 to 1024 bytes. Across the window you get K chunks that are the sources.
  • A coded shred equals a tiny seed, for example four bytes, plus a payload that is the bytewise linear combination of those K chunks over GF(256). The seed deterministically expands through a public PRF to the coefficient row, and the receiver recomputes it for verification.
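
For concreteness, a minimal encoding sketch of that shred construction, where a toy xorshift expander stands in for the public PRF and all names are illustrative rather than a proposed API:

/// Multiply two GF(2^8) elements (AES polynomial 0x11B), carry-less "peasant" method.
fn gf256_mul(mut a: u8, mut b: u8) -> u8 {
    let mut acc = 0u8;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a;
        }
        let hi = a & 0x80;
        a <<= 1;
        if hi != 0 {
            a ^= 0x1B; // reduce modulo x^8 + x^4 + x^3 + x + 1
        }
        b >>= 1;
    }
    acc
}

/// Deterministically expand a 4-byte seed into K coefficients.
/// Toy stand-in for the "public PRF of (window id, stream id, shard index)".
fn coefficients_from_seed(seed: u32, k: usize) -> Vec<u8> {
    let mut state = seed | 1; // avoid the all-zero state
    (0..k)
        .map(|_| {
            // xorshift32 step, purely for illustration
            state ^= state << 13;
            state ^= state >> 17;
            state ^= state << 5;
            (state & 0xFF) as u8
        })
        .collect()
}

/// Encode one coded shred: payload[j] = sum_i c_i * chunk_i[j] over GF(256),
/// where addition in the field is XOR.
fn encode_shred(seed: u32, chunks: &[Vec<u8>]) -> (Vec<u8>, Vec<u8>) {
    let coeffs = coefficients_from_seed(seed, chunks.len());
    let len = chunks.iter().map(|c| c.len()).max().unwrap_or(0);
    let mut payload = vec![0u8; len];
    for (c, chunk) in coeffs.iter().zip(chunks) {
        for (j, &byte) in chunk.iter().enumerate() {
            payload[j] ^= gf256_mul(*c, byte);
        }
    }
    (coeffs, payload)
}

On the wire only the seed and the payload need to travel; the receiver regenerates the coefficient row from the seed, as described above.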

What the receiver does, and why memory remains bounded:

  • For each shred, regenerate coefficients from the seed, then run one elimination step.
  • If it increases rank, keep it. If it is dependent, drop it immediately. You do not buffer junk.
  • When rank reaches K, solve once, recover all chunks, reassemble transactions, and validate.
  • Memory is bounded at roughly K multiplied by chunk size for the innovative rows plus a small coefficient structure per active window, with strict caps and timeouts.
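
A minimal sketch of that receiver loop over GF(2), where XOR serves as both addition and subtraction (GF(256) is structurally identical but additionally needs field inversion); the types and names are illustrative:

/// Incremental Gaussian elimination: at most K rows are ever stored.
struct WindowDecoder {
    k: usize,
    // pivot_rows[p] holds the row whose leading nonzero coefficient is at position p.
    pivot_rows: Vec<Option<(Vec<bool>, Vec<u8>)>>,
    rank: usize,
}

impl WindowDecoder {
    fn new(k: usize) -> Self {
        Self { k, pivot_rows: vec![None; k], rank: 0 }
    }

    /// Returns true if the shred increased rank (kept); dependent shreds are
    /// dropped immediately, so junk is never buffered.
    fn add_shred(&mut self, mut coeffs: Vec<bool>, mut payload: Vec<u8>) -> bool {
        for p in 0..self.k {
            if !coeffs[p] {
                continue;
            }
            if self.pivot_rows[p].is_none() {
                // New pivot: keep the row. Memory stays roughly K * chunk_size.
                self.pivot_rows[p] = Some((coeffs, payload));
                self.rank += 1;
                return true;
            }
            // Eliminate position p using the stored pivot row (XOR = GF(2) subtraction).
            let (row_coeffs, row_payload) = self.pivot_rows[p].as_ref().expect("checked above");
            for i in p..self.k {
                coeffs[i] ^= row_coeffs[i];
            }
            for (b, rb) in payload.iter_mut().zip(row_payload.iter()) {
                *b ^= *rb;
            }
        }
        false // reduced to the zero row: linearly dependent
    }

    /// Once rank == K, a single back-substitution recovers all K chunks.
    fn is_ready(&self) -> bool {
        self.rank == self.k
    }
}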

Why RLNC is not only for lossy networks and fits inherent redundancy in Kaspa:
The main win is removing the coordination cost that comes from overlap.

  • In a multi peer pull, different peers often resend the same last few transactions. RLNC turns those overlaps into independent information, so they still push you toward completion.
  • The tail disappears because you do not need those specific last transactions. You need enough degrees of freedom. A small overcode in the range of five to ten percent provides that cushion with no per tx negative acknowledgments.

Scheme in practice:

  • Window selection on the receiver: choose recent levels, list the transactions you are missing, deduplicate, chunk, and count K.
  • Coefficient rule that is safe against DoS: each shred carries a seed. Coefficients derive from a public PRF of window id, stream id, and shard index. The receiver can dictate the stream id so peers cannot choose pathological schedules. Schedules can be prechecked for rank growth if desired.
  • Sending by peers: stream ratelessly about K times one plus epsilon coded shreds, with epsilon around five to ten percent. Multiple peers can send, no need to coordinate who has which transaction. A short systematic phase is optional on very clean links.
  • Receiving by the node: innovative or drop, stop at rank K, decode once, reassemble transactions, validate. Cache recovered transactions so later blocks do not re download them.
  • Bounds and fallback: keep a small fixed number of windows in flight, apply hard caps on rows and time. If progress stalls, request a small top up or fall back to per tx or per chunk for the last few, which is the same safety net used today.
  • Peer quality control: track an innovation ratio per peer, namely innovative over received. Peers that send mostly non innovative or invalid shreds are deprioritized or banned.

What this solves for Kaspa:

  • Cross block transaction duplication becomes a single download inside the window.
  • Multi peer overlap stops costing you since any helpful packet advances you.
  • The long tail and endgame vanish since a small overcode replaces rounds of requests for specific transactions.
  • IBD and catch up time becomes bandwidth proportional rather than RTT and coordination dominated.
  • Memory stays bounded using innovative only buffering with sliding window caps.
  • Partial peers still help without any per tx scheduling.

Non goals and expectations:

  • RLNC does not compress transactions and does not change consensus or validation. It is a transport and scheduling upgrade.
  • With a single perfect peer, zero loss, and an ideal scheduler, an oracle style unique pull can match or beat RLNC in bytes. RLNC earns its keep in Kaspa’s real conditions with parallel blocks, many peers, overlapping availability, and some loss.

TL;DR RLNC helps by streaming rateless coded shreds so progress isn’t gated by per-tx NACK/REQ exchanges; completion becomes closer to bandwidth-proportional rather than RTT-bound.