(Hi, after much deliberation I realized that there exists a pruning attack on this algorithm. This was a result of trying to better formalize prunability and its security. The post below was written in an attempt to prove Yoni right but ended up in finding a subtle weakness, so it was edited accordingly. If you just want to read about the attack you can skip to the end)
I want to suggest a slightly more abstract framework to reason about the types of pruning algorithms we are interested in. I do this for the following reasons:
- It will make the proposal, and future proposals, much easier to understand and reason about
- It affords a straightforward definition of a pruning attack
- It allows me to cleanly define the correctness and safety of a prunability algorithm (I do not call it a protocol as it does require any communication between the peers) in a way which is independent of any particular proposal, and of the consensus protocol
- It might be easier to implement
However, this post is only half formal, both in order to make it accessible and because the ideas therein are not fully formed.
In this formalism, each node maintains a subset of the DAG \chi called the outcast set. These are blocks which the nodes includes in the DAG, but does not point to from any blocks it creates.
A prunability algorithm is the defined by specifying the following three:
-
T_{inv}: a condition for invalidating blocks,
-
T_{\chi}: a condition for adding a newly discovered block to \chi
-
N_p: a positive integer called the pruning parameter
The operation of an honest node upon learning of a new block B will then be:
- If T_{inv}(B) holds discard the block, else
- If T_{\chi}(B) holds, add the block to \chi
- If the block was not discarded (whether it was added to \chi or not), add it to the DAG and broadcast it to peers
We call a block invalid, outcast or valid respectively. Note that the status of a block is not in consensus (e.g. one node may consider a block valid while the other considers it outcast, etc.).
When constructing a block, it should point to all current tips which are not in \chi.
This allows us to formally define the desired properties of a pruning algorithm.
Definition (informal): A pruning attack is a scenario where an honest node creates blocks whose past includes blocks which are invalid by other honest nodes.
Essentially, in a pruning attack, an adversary creates a “borderline” block which will only be accepted by some of the nodes, and manipulates the nodes who accepted the rotten block to mine above it, whereby rendering their blocks invalid from the point of view of the rest of the network. The consequences depend on the actual algorithm. In the best case they make it cheap to manipulate nodes into wasting hash rate, in the worst case they can cause a permanent hard fork.
Definition (informal): A pruning algorithm is correct and secure if all of the following holds for any block B and two honest nodes N_1, N_2 (barring a 51% attacker):
- (safety margin) There will never exist a block B such that N_1 thinks B is valid and N_2 thinks that B is invalid
- (reconciliation) If N_1 thinks B is valid and N_2 thinks B is outcast then, shortly after, a block B' will emerge that has B in its past and that N_1 will consider valid
- (security) It is impossible to carry out a successful pruning attack
- (prunability) Let X be the most recent block in the selected of Virt with X.BlueScore < Virt.BlueScore - N_p, then any block which points to a block which is not in X.Future is invalid
Remark: In order to make the above formal, one should first prefix all deterministic properties other than prunability with “with overwhelming probability”, and then define what “shortly after” exactly means for reconciliation.
Yoni’s suggestion above could be recast in the outcast formalism. The algorithm is parameterized by a positive integer M.
For any block B let X_B be the most recent block in the selected chain of B such that X_B.BlueScore < B.BlueScore - M. Let V be the virtual block.
We say that a block B satisfies T_{\chi} if one of the following holds:
-
B has a parent which is not in X_V.Future
-
B is red and has a parent in \chi
We say that a block B satisfies T_{inv} if one of the following holds:
-
B is blue and B has a parent C such that C\notin B.BlueSet and C.past \cap X_B.AntiCone \ne \emptyset.
- B\notin X_{X_V}.Future
-
B has an invalid parent
Remark: As Yoni said before, rule 1. can be interpreted as (B does not have a parent C such that from the point of view of B, C should have been outcast).
Finally, we set N_p = 2M.
We now need to argue that the specification above satisfies the correctness definition.
Safety margin: We need to show that if a node considered a block invalid then all other nodes will consider it either invalid or outcast, but never valid. Assume B was invalidated by some node, then it must satisfy one of the three conditions above. If B satisfies 1. then it will be considered invalid by all nodes, as 1. only depends on the worldview of B. If B satisfies 2., then it is not in the future of X_{X_V} from the point of view of the node. It will hold for all other nodes that B is not in the future of X_V (as long as M is large enough, where “large enough” is in the order of minutes where we expect M to be set to the order of hours or days), so they will either invalidate or outcast it. If B satisfies 3. then it has an invalid parent, if that parent statisfies 3. it also has an invalid parent etc., if we follow the chain we eventually find C\in B.Past which was invalidated for either 1. or 2… If C satisfies 1. then all nodes will invalidate it and all of its offspring. If C satisfied 2. from the point of view of the current node, then by the argument above it will be considered either invalid or an outcast by all other nodes. The node which considers B valid must consider C an outcast, so there must be a block between B and C the node considers blue, but since none of the nodes will ever point to C, this could only happen in light of a 51% attacker.
Reconciliation: One of the nodes which consider B valid will shortly release a block which points to it, and will be considered blue by the entire network.
Prunability: True by design, we explicitly reject blocks which point outside X_{X_V}.future.
It remains to argue security. However, as we shall see, the protocol is not secure. In order to break security, an attacker must first create a disputed block considered outcast by some nodes and invalid by the others. This could be only done by publishing a block B pointing just above the pruning point (X_{X_B}) and hoping that by for some of the nodes, by the time they learn of B their chain grew enough so that it points below the pruning point. In order to complete the attack, the attacker must somehow create a block which will 1. has B in its past, 2. will be considered blue by a node which outcast B. Since no nodes considers this block valid, only the adversary could contribute to a chain which includes this block in its past. In general, this does not require a 51% attacker (because they don’t care if the node considers B blue). However, in our algorithm it might require 51%: let C be the first block considered blue by the node, and which has B in its past. It must be that the selected chain of C and B coincide very near the top (because otherwise C would not have been considered blue). If C has a parent C'\notin C.BlueSet with B\in C.Past then we are done, since necessarily B\notin X_C.past. Else, let C' be earliest block considered blue by C which has B in its past. Then again, since C' is considered blue by C, it follows that C' must be very close to C (C' can not be in the past of any block in the selected chain of C which was also in the selected chain before C was accepted, which [since the attacker is not 51%] are all of the elements in the selected chain of C up to a constant. If C' is too far off in the past of any of these blocks then we’d have that C'.Anticone\cap C.SelectedChain would contain too many blocks, contradicting the assumption that C'\in C.BlueSet). This implies that C' has a parent not considered blue by C, but which has B in its past. This is where we would like to say something along the lines “and this implies that C' should not have pointed at this parent, and thus C' is invalid”. The problem is that while C does not the parent blue, it could be that C' does. The attack is based on this.
Pruning attack on the algorithm: Assume for simplicity that the honest network has the form of a chain. Now assume an adversary creates a block B which points to a block X on the chain of depth 2M-1 from the point of view of the node N_1, and that by the time node N_2 learns of B, X is already of depth 2M+1. We get that N_1 accepted the block while N_2 rejected it. Now the attacker publishes a succession of blocks: each block points to the previous block, and also points at the honest chain k blocks above where the previous blocks pointed (with each block the attacker “climbs” k blocks up the honest chain, hence the name “climbing attack”). From the point of view of each block, both its parents are blue, and so there is no violation of condition 1… After sufficiently many blocks (about 2M/k) the attacker will finally create a block which has B in its past but that N_1 will consider blue, whereby successfully completing the attack. Note that an adversary can premine this attack: they can start at the current tip and keep creating blocks as described until the honest block pointed with the initial block is at the correct depth. An attacker with more than 1/(k+1) of the computational power can achieve this with a probability which is constant in M.