Metrics to test DAG performance

This post describes my suggestions for metrics against which the performance of Kaspa ought to be measured.

Preliminary about tx-gen and demand-supply dynamics for blockspace

In an interesting crypto environment, the demand for blockspace is higher than the supply. When this is not the case, namely, if the rate of incoming txns is lower than the rate of outgoing txns (namely, lower than the number of txns that the are added to the blockDAG per second), then the avg size of the mempool would be very small, miners would empty their entire mempool with almost every block they create, and still many blocks won’t be full (i.e., will include less txns than permitted).

Therefore, to avoid trivial non-interesting results, all tests should be in a setup where the demand for blockspace, i.e., the avg mempool size, is higher than the supply, i.e., the block size limit. This should be done by setting the parameters of the tx-gen accordingly. A basic result from queueing theory indicates that we should expect the mempool size to explode after a while. Practically, the mempool should be truncated to some relevant capacity, and the lowest-paying txns should be removed from it; this mempool capacity (the demand) must still be higher than the blockspace (the supply).

The tx-gen should generate txns with random fees, and should allow the tester to dictate the probability distribution of these fees. One generic way to implement this is (i) to define the parameter ղ representing the number of txns generated by tx-gen per second, (ii) choose the fee of each generated txn from a finite discrete set of fees (e.g. fee_array={0.1, 0.25, 0.4, 1.0, 3.0} KAS} according to some probability distribution. The distribution should typically provide a lower weight to high-paying txns (e.g., to 3.0 KAS in the example above). This probability distribution will dictate, in particular, the variance of the fees in the mempool.

Metrics

  1. Transaction confirmation times

    a. First conf (insertion time) – # of seconds from the user pressing “send txn” to the txn being included in a block. This will help us assert the basic value proposition of Kaspa. If results are not good, we might decide to implement a direct-txn-relay-to-miners mechanism.

    b. Observed first conf – # of seconds from the user pressing “send txn” to the user seeing that the txn was included in a block. This metric is very similar to a.; it will set the marketing meme (as accordingly we will commit to the speed that end users should expect to observe).

    *Articles a and b should be checked mainly for high-fee txns; by high-fee I mean txns whose fee is high enough to ensure it is selected for the next block, with sufficient probability (say, >90%). Implementation of this was discussed in the preliminary section.

    c. Enough confs – # of seconds between the txn being included in a block to the txn having X confirmations. Check for X=10, 25, 40. Definition of “# of confirmations” should appear in Kaspa Documentation.

  2. Resource consumption – how much resources are consumed by a full node, under various throughput limitation setups. This will help us formalize the HW/SW requirements from Kaspa full nodes or/and change the current throughput regulation parameters (article b. below). It will additionally set the marketing tone.

    a. The resources to check, in order of priority:

    i. Bandwidth
    ii. CPU
    iii. RAM
    iv. Disk I/O
    v. Disk storage

    b. The throughput limitation parameters to play with:

    i. Block size limit
    ii. Mass limit
    iii. Mass pricings parameters – the parameters that define the mass-pricing ratio between txn size, script size, and cryptographic primitives (within the script).

    c. Importantly, aside from measuring the avg resource consumption under various throughput setups, we should stress-test the peak-throughput capacity. That is, what is the capacity of the fullnode when the throughput regulation parameters (article b.) are temporarily increased. This would help us understand whether enabling elastic throughput is useful for the system (it is not useful if the full node has similar peak and avg capacity).

  3. Goodput (aka throughput efficiency) – the percentage of unique txns included in the blockDAG, per second, on avg, for various throughput setups (i.e., various values of block size limit and mass limit). The percentage is calculated by dividing the number of unique txns included per second by the number of txns included per second. The goodput is smaller than 1 because the fast rate and asynchronicity of miners may lead to them duplicating txns, i.e., including the same txn across two or more parallel blocks.

    To mitigate the duplication or collision rate, and increase goodput, Kaspad miners select txns for their next block according to some sophisticated txn-selection randomization, inspired by the Inclusive Block Chain Protocols paper. The tests in 3. will help us understand whether the implemented heuristic is good enough, or rather we need to implement a better one.

    Notes:

    a. The goodput should be checked against various tx-gen distributions, i.e., distributions with varying variance values around the mean; see discussion in the preliminary section.

    b. We expect to see a high goodput (close to 1) when the variance is low, and small goodput (close to 0) when the variance is high. For intuition: When all txns have the same fee, miners will choose uniformly among all txns in the mempool, and since the mempool size will theoretically grow indefinitely (in this setup), the collision rate will go to 0 and the goodput will go to 1. Conversely, when there are very high-paying (=txns with high fee) txns in the mempool, all miners would greedily choose them, leading to a high collision rate and a goodput that is going to 0.

  4. Non-linearity of rewards (aka economies of scale centralization dynamic) – in principle, miners that enjoy better connectivity to other miners will enjoy a higher revenue on avg, as they are more likely to win the fee of duplicate disputed txns. Similarly, miners that enjoy better connectivity to users (i.e., to the tx-gen), will enjoy a higher revenue, as they will get a head start on the high-paying txns. This is a negative phenomenon, as ideally we would want a miner’s revenue to be linear in its hashrate; this test will help us understand the magnitude of the non-linearity, and to decide accordingly whether to design solutions (e.g., fee-sharing mechanisms). Alternatively, if revenue is close to linear, we might decide to provide stronger incentives for chain blocks (this is needed for other considerations).

    Should be checked against various miners topologies (while keeping the users’ connectivity to the miners fixed), and under various tx-gen topologies (while keeping the miners’ topology fixed).

    Topologies include:

    a. Star
    b. Clique
    c. Random