A proposal for finality in GHOSTDAG

Finality

In a previous post I tried to justify the employment of finality in a PoW-based system. In this post I want to highlight a subtlety with respect to finality in DAGs.

Since GHOSTDAG operates as a chain-selection rule in a DAG, we can apply a simple finality rule of the form: reject block B if it leads to a reorg deeper than M blocks. Under a bounded network delay, the probability that the honest network would split (=some nodes accepting a block, others rejecting it) is exponentially small in M; this is immediate from the Safety and Liveness properties.
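As a quick sketch of this rule (illustrative names only; a real implementation would derive both chains from the DAG rather than receive them as lists):

```python
def violates_finality(selected_chain, chain_if_accepted, M):
    """Return True if switching to chain_if_accepted would abandon more
    than M blocks of the node's current selected chain.

    Both chains are ordered from genesis (oldest) to tip (newest).
    """
    # Length of the common prefix of the two chains:
    fork_depth = 0
    for a, b in zip(selected_chain, chain_if_accepted):
        if a != b:
            break
        fork_depth += 1
    # Blocks of the current chain that would be abandoned by the reorg:
    reorg_depth = len(selected_chain) - fork_depth
    return reorg_depth > M
```

For example, with M=1, a candidate chain that diverges two blocks below the current tip is rejected, while one that replaces only the tip is not.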

Prunability

There is another consideration in practice – we need to achieve Prunability, namely, the ability to prune historical blocks. Pruning requires a separate discussion, and it may mean different things to different people or in different contexts, so let’s be precise: We want to prevent a situation where a new (attacker) block references blocks from the old past as its direct predecessors, requiring full nodes to fetch them either from the disk or worse yet from non-pruning nodes.

In a DAG, this is not a trivial task. A strawman protocol would be to employ simple rules of the form: a valid block may not point at blocks that are not in the future of the most recent N blocks. This (as well as many variants) is broken, since honest nodes require a few seconds to reconcile their world views, and an attacker can exploit this and publish an old block just when the threshold has passed for one honest node but not for another (this requires luck, and very strong communication capabilities, but does not require a large hashrate). The gist of the problem is that it is impossible to have all nodes agree on the exact time that the finality threshold has passed, and that if we use the DAG structure / PoW consensus to decide this, we enter into a circular argument where the consensus both dictates the validity of blocks and these blocks define the consensus.

I would like to suggest a solution to this problem, and to get some feedback on it, as it is a delicate topic with potential to break the consistency of the system. I’ll propose the solution in the GHOSTDAG terminology, but it is probably translatable to other DAG-POW systems as well. In GHOSTDAG we have blue and red blocks, and (with high probability) these represent blocks of honest and malicious nodes.

The idea is to allow blue blocks, mined in parallel, to add to the DAG blocks from the far past (outside the finality window, but not outside the prunability point). The set of blocks that can be published, from the far past, thus closes fast, since blue blocks must eventually (and quite quickly so) be built atop one another and agree on the past’s threshold.

Formal prunability rules

Let G be a DAG observed by this node, and let B be a new block (B\notin G). This node considers B to be invalid if one of the following conditions holds:

(i) \exists C \in B.parents such that C \notin Blues(G(B)) AND C.past\cap X.anticone\neq\emptyset, where X is the latest block in the chain of G(B) with X.blueScore<B.blueScore-M, and where G(B) is the subDAG B.past\cup\left\{B\right\}.

Informally: B merges a block that is outside B's latest finality point, and the tip of the merging path is not blue (in B's view).

This condition depends only on B and not on G (it is thus structural).

(ii) B\notin Z.future, where Z is the latest block in the chain of G(X) with Z.blueScore<X.blueScore-M, and where X is the latest block in the chain of G with X.blueScore<G.virtual.selectedParent.blueScore-M.

Informally: B is below the pruning point defined by G.

This condition depends on G (it is thus view-dependent).

(iii) \exists B'\in B.past such that B' is invalid. This condition depends on G because rule (ii) depends on G (it is thus view-dependent).
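To make the structural rule (i) concrete, here is a minimal sketch in Python. The GHOSTDAG machinery (blue sets, selected chains, blue scores) is assumed to be computed elsewhere and passed in as plain sets and dicts; only the rule's own logic is shown, and all names are illustrative:

```python
def violates_rule_i(B, parents, past, blues_gB, chain_gB, blue_score, M):
    """Rule (i): B is invalid if, through a parent that is not blue in
    B's own view, it merges a block outside B's latest finality point X.

    parents[b], past[b] : sets of direct / all predecessors of b
    blues_gB            : Blues(G(B)), the blue set from B's point of view
    chain_gB            : B's selected chain, ordered genesis -> tip
    blue_score[b]       : blue score of b
    """
    gB = past[B] | {B}  # the subDAG G(B)
    # X: the latest chain block of G(B) with blueScore < B.blueScore - M
    candidates = [x for x in chain_gB if blue_score[x] < blue_score[B] - M]
    if not candidates:
        return False  # the finality point is not yet defined
    X = candidates[-1]
    # anticone of X inside G(B): everything that is neither X, nor in
    # X's past, nor in X's future
    future_X = {b for b in gB if X in past[b]}
    anticone_X = gB - past[X] - {X} - future_X
    return any(C not in blues_gB and (past[C] & anticone_X)
               for C in parents[B])
```

For instance, a red parent whose past reaches into the anticone of X triggers the rule, while merging only blocks in X's future does not.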

Remarks

a. The fact that rule (i) is structure-dependent ensures that there is no circular reasoning: The status of a block in the consensus (being red or blue, being on the chain or not) is irrelevant to the satisfaction or violation of the condition.

b. I argue that as long as we know that B is red, and as long as there is no blue block A that points at B, we can skip checking the satisfaction or violation of rule (i) (essentially, we do not care if it is invalid or it is valid and should not be pointed at).

c. It is interesting to modify rule (i) to define X as the latest block in the chain of G, rather than that of G(B) (it is still vital that the redness of C be calculated from G(B)'s perspective, to avoid circular reasoning which will lead to a netsplit). The advantage of this modification is ease of computation of X.

(Hi, after much deliberation I realized that there exists a pruning attack on this algorithm. This was a result of trying to better formalize prunability and its security. The post below was written in an attempt to prove Yoni right but ended up finding a subtle weakness, so it was edited accordingly. If you just want to read about the attack you can skip to the end.)

I want to suggest a slightly more abstract framework to reason about the types of pruning algorithms we are interested in. I do this for the following reasons:

  • It will make the proposal, and future proposals, much easier to understand and reason about
  • It affords a straightforward definition of a pruning attack
  • It allows me to cleanly define the correctness and safety of a prunability algorithm (I do not call it a protocol as it does not require any communication between the peers) in a way which is independent of any particular proposal, and of the consensus protocol
  • It might be easier to implement

However, this post is only half formal, both in order to make it accessible and because the ideas therein are not fully formed.

In this formalism, each node maintains a subset of the DAG \chi called the outcast set. These are blocks which the node includes in the DAG, but does not point to from any blocks it creates.

A prunability algorithm is then defined by specifying the following three:

  • T_{inv}: a condition for invalidating blocks,
  • T_{\chi}: a condition for adding a newly discovered block to \chi
  • N_p: a positive integer called the pruning parameter

The operation of an honest node upon learning of a new block B will then be:

  • If T_{inv}(B) holds discard the block, else
  • If T_{\chi}(B) holds, add the block to \chi
  • If the block was not discarded (whether it was added to \chi or not), add it to the DAG and broadcast it to peers
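The three steps above can be sketched as a single dispatch function (hypothetical node state: `dag` and `outcasts` are sets, `broadcast` is the relay hook, and T_{inv}, T_{\chi} are passed in as predicates, matching the abstract framework):

```python
def process_block(B, T_inv, T_chi, dag, outcasts, broadcast):
    """Handle a newly discovered block B per the three-step recipe."""
    if T_inv(B):
        return "invalid"          # discard: do not store or relay
    if T_chi(B):
        outcasts.add(B)           # keep in the DAG, but never point at it
        status = "outcast"
    else:
        status = "valid"
    dag.add(B)                    # outcast or valid: incorporate and relay
    broadcast(B)
    return status

def parents_for_new_block(tips, outcasts):
    """When constructing a block, point at all current tips not in chi."""
    return {t for t in tips if t not in outcasts}
```

Note that the returned status is local to this node; as stated above, it is not in consensus.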

We call such a block invalid, outcast, or valid, respectively. Note that the status of a block is not in consensus (e.g. one node may consider a block valid while another considers it outcast, etc.).

When constructing a block, it should point to all current tips which are not in \chi.

This allows us to formally define the desired properties of a pruning algorithm.

Definition (informal): A pruning attack is a scenario where an honest node creates blocks whose past includes blocks which other honest nodes consider invalid.

Essentially, in a pruning attack, an adversary creates a “borderline” block which will only be accepted by some of the nodes, and manipulates the nodes who accepted the rotten block to mine above it, thereby rendering their blocks invalid from the point of view of the rest of the network. The consequences depend on the actual algorithm. In the best case the attacker can cheaply manipulate nodes into wasting hash rate; in the worst case they can cause a permanent hard fork.

Definition (informal): A pruning algorithm is correct and secure if all of the following hold for any block B and any two honest nodes N_1, N_2 (barring a 51% attacker):

  • (safety margin) There will never exist a block B such that N_1 thinks B is valid and N_2 thinks that B is invalid
  • (reconciliation) If N_1 thinks B is valid and N_2 thinks B is outcast then, shortly after, a block B' will emerge that has B in its past and that N_1 will consider valid
  • (security) It is impossible to carry out a successful pruning attack
  • (prunability) Let X be the most recent block in the selected chain of Virt with X.BlueScore < Virt.BlueScore - N_p; then any block which points to a block which is not in X.Future is invalid
    Remark: In order to make the above formal, one should first prefix all deterministic properties other than prunability with “with overwhelming probability”, and then define what “shortly after” exactly means for reconciliation.

Yoni’s suggestion above could be recast in the outcast formalism. The algorithm is parameterized by a positive integer M.

For any block B let X_B be the most recent block in the selected chain of B such that X_B.BlueScore < B.BlueScore - M. Let V be the virtual block.

We say that a block B satisfies T_{\chi} if one of the following holds:

  1. B has a parent which is not in X_V.Future
  2. B is red and has a parent in \chi

We say that a block B satisfies T_{inv} if one of the following holds:

  1. B is blue and B has a parent C such that C\notin B.BlueSet and C.past \cap X_B.AntiCone \ne \emptyset.
  2. B\notin X_{X_V}.Future
  3. B has an invalid parent

Remark: As Yoni said before, rule 1. can be interpreted as (B does not have a parent C such that from the point of view of B, C should have been outcast).

Finally, we set N_p = 2M.
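Transcribed directly, the recast predicates look as follows. All inputs (blue/red status, the futures of X_V and X_{X_V}, pasts, anticones) are assumed to be precomputed by the PHANTOM/GHOSTDAG layer; these functions only combine them, and all names are illustrative:

```python
def satisfies_T_chi(parents, future_of_X_V, is_red, outcast_set):
    """Outcast condition: a parent below X_V (rule 1), or a red block
    with a parent already in chi (rule 2)."""
    rule1 = any(p not in future_of_X_V for p in parents)
    rule2 = is_red and any(p in outcast_set for p in parents)
    return rule1 or rule2

def satisfies_T_inv(block, parents, is_blue, blue_set_of_B, past,
                    anticone_of_X_B, future_of_X_XV, invalid_set):
    """Invalidity condition: rules 1.-3. above, in order."""
    rule1 = is_blue and any(
        p not in blue_set_of_B and (past[p] & anticone_of_X_B)
        for p in parents)
    rule2 = block not in future_of_X_XV
    rule3 = any(p in invalid_set for p in parents)
    return rule1 or rule2 or rule3
```

With N_p = 2M, the pruning point X_{X_V} used by rule 2 of T_{inv} sits a full window of M below the point X_V used by T_{\chi}, which is what leaves room for the outcast "grace zone" between them.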

We now need to argue that the specification above satisfies the correctness definition.

Safety margin: We need to show that if a node considered a block invalid then all other nodes will consider it either invalid or outcast, but never valid. Assume B was invalidated by some node; then it must satisfy one of the three conditions above. If B satisfies 1., it will be considered invalid by all nodes, as 1. depends only on the worldview of B. If B satisfies 2., then it is not in the future of X_{X_V} from the point of view of the node. For all other nodes it will hold that B is not in the future of X_V (as long as M is large enough, where “large enough” is on the order of minutes while we expect M to be set on the order of hours or days), so they will either invalidate or outcast it. If B satisfies 3., then it has an invalid parent; if that parent satisfies 3. it also has an invalid parent, and so on, so by following the chain we eventually find C\in B.Past which was invalidated for either 1. or 2. If C satisfies 1., then all nodes will invalidate it and all of its offspring. If C satisfies 2. from the point of view of the current node, then by the argument above it will be considered either invalid or outcast by all other nodes. A node which considers B valid must consider C an outcast, so there must be a block between B and C which that node considers blue; but since no honest node will ever point at C, this could only happen in light of a 51% attacker.

Reconciliation: One of the nodes which consider B valid will shortly release a block which points to it, and which will be considered blue by the entire network.

Prunability: True by design, we explicitly reject blocks which point outside X_{X_V}.future.

It remains to argue security. However, as we shall see, the protocol is not secure. In order to break security, an attacker must first create a disputed block, considered outcast by some nodes and invalid by the others. This can only be done by publishing a block B pointing just above the pruning point (X_{X_B}) and hoping that, for some of the nodes, by the time they learn of B their chain has grown enough that B points below the pruning point. In order to complete the attack, the attacker must then create a block which 1. has B in its past, and 2. will be considered blue by a node which outcast B. Since no node considers this block valid, only the adversary can contribute to a chain which includes it in its past. In general, this does not require a 51% attacker (because the adversary does not care whether the node considers B blue). However, in our algorithm it might require 51%: let C be the first block considered blue by the node which has B in its past. It must be that the selected chains of C and B coincide very near the top (because otherwise C would not have been considered blue). If C has a parent C'\notin C.BlueSet with B\in C'.Past then we are done, since necessarily B\notin X_C.past. Otherwise, let C' be the earliest block considered blue by C which has B in its past. Then again, since C' is considered blue by C, it follows that C' must be very close to C (C' cannot be in the past of any block in the selected chain of C which was also in the selected chain before C was accepted, which [since the attacker does not have 51%] is all of the elements in the selected chain of C up to a constant. If C' were too far off in the past of any of these blocks then we’d have that C'.Anticone\cap C.SelectedChain would contain too many blocks, contradicting the assumption that C'\in C.BlueSet). This implies that C' has a parent which is not considered blue by C, but which has B in its past.
This is where we would like to say something along the lines of “and this implies that C' should not have pointed at this parent, and thus C' is invalid”. The problem is that while C does not consider the parent blue, it could be that C' does. The attack is based on this.

Pruning attack on the algorithm: Assume for simplicity that the honest network has the form of a chain. Now assume an adversary creates a block B which points to a block X on the chain at depth 2M-1 from the point of view of node N_1, and that by the time node N_2 learns of B, X is already at depth 2M+1. We get that N_1 accepted the block while N_2 rejected it. Now the attacker publishes a succession of blocks: each block points to the previous block, and also points at the honest chain k blocks above where the previous block pointed (with each block the attacker “climbs” k blocks up the honest chain, hence the name “climbing attack”). From the point of view of each block, both its parents are blue, and so there is no violation of condition 1. After sufficiently many blocks (about 2M/k) the attacker will finally create a block which has B in its past but that N_1 will consider blue, thereby successfully completing the attack. Note that an adversary can premine this attack: they can start at the current tip and keep creating blocks as described until the honest block pointed to by the initial block is at the correct depth. An attacker with more than 1/(k+1) of the computational power can achieve this with a probability which is constant in M.
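A back-of-the-envelope sketch of the attack's cost, under the simplifying assumptions above (the honest network is a chain, the attacker climbs exactly k honest blocks per attacker block, and must cover a gap of about 2M):

```python
import math

def climbing_attack_blocks(M, k):
    """Approximate number of attacker blocks needed to climb a 2M gap,
    at k honest blocks per attacker block."""
    return math.ceil(2 * M / k)

def required_hashrate_fraction(k):
    """To keep pace, the attacker must produce one block per k honest
    blocks, i.e. more than a 1/(k+1) fraction of the total hash rate.
    Note this is independent of M."""
    return 1 / (k + 1)
```

For example, with M = 1000 and k = 10, the climb takes about 200 attacker blocks but only slightly more than 9% of the hash rate, which is what makes the attack cheap relative to the finality parameter.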

A couple observations:

  1. The attack described might not be viable if the finality window is a non-sliding window, and the window movement is significantly larger than k (an actual lower bound can be calculated) (ignoring other reasons why sliding/non-sliding windows are preferable):
    Once the finality window has moved, no node will accept the attacker’s succession of blocks.

  2. (Implementation details):
    Currently finality is checked before Phantom runs.
    This allows us to reject blocks prior to incorporating them inside the DAG, greatly improving resilience against DOS through invalid blocks.
    Therefore conditions such as “if B is blue” are not desirable.
    Needless to say - security of the system is more important, but this should be taken into consideration when considering various alternatives.

Re 1, AFAIK the attack is viable even against a static finality window: there is still a situation where some nodes’ DAG is just below the threshold and some nodes’ is just past it, and the attacker can publish its blocks at that very moment and thereby split the network

Not if we adopt the suggested “outcasts” system.
The block will be accepted by some, rejected by others, but no honest node will accept the blocks mined on top of it.

I don’t see how this is possible. In general, if the condition for a block being valid depends on the global graph structure, one could conceivably use it to create a disputed block and then use a climbing attack to de-outcast it at the nodes which accepted it.

Here, the condition is whatever triggers the next finality window.

In general, as said above, the two crucial properties we want to seek simultaneously are:

  • If a block is valid by some and outcast by others, then the nodes who consider it valid will somehow get the other nodes to consider blocks they mine above it valid.
  • If a block is outcast by some, and invalid by others, then it is impossible to create a block above it that the outcasting node will consider valid

There is an obvious tension between these properties: one enables including outcast blocks in the past of new blocks whereas the other prohibits it. The only beacon of hope is that in the former case it is the honest network who wants to include the outcast block, and in the latter it is the adversary. The only thing breaking the symmetry between them is computational power (and even here, we swept under the rug the fact that only some of the network is participating. There are subtle arguments here which we really should flesh out if we adopt this approach). So what we are essentially trying to do is to make it so that one must have considerable computational power to de-outcast a block.

The problem here is that all approaches we tried are either too weak (in the sense that a weak adversary can de-outcast a block, e.g. via a climbing attack), or too strong (in the sense that even the honest network can’t de-outcast a block).

In the optimistic scenario, a middle way is merely very difficult to achieve (though I suspect it is provably impossible, at least in the setting where the condition depends entirely on the current graph structure).

Also, I find it hard to believe any viable approach could be implemented without using PHANTOM. The fact that a block is considered blue by the virtual block is pretty much the only credential that a block which creates a non-trivial reorg (e.g. one where the previous selected tip is no longer considered blue) was created by the honest network.

I admit that almost all our efforts were devoted to starting from a weak approach and trying to incrementally strengthen it. Maybe the opposite approach will lead to interesting proposals.

Regarding your point 2., we actually have a sort of scale for how “problematic” a piece of data we rely on is. From best to worst we have

  1. Anything we know before the block was accepted
  2. Whether the new block is in the blue set of the virtual node
  3. The selected chain of the new block
  4. The blue score/set of the new block, if it is considered blue
  5. The blue score/set of the new block, if it is considered red

When designing an approach, we also try to make the data it requires as unproblematic as possible. If you have any comments about the list above, or want to add anything to it, let us know.

Finally, I would like to present an idea I had, to demonstrate that moving to a less safe paradigm might solve the problem. This is not a proposal, this might be a really bad idea, I’m only presenting it as food for thought.

The idea is essentially that, when we learn that a block is outcast, we wait a little bit and then check again. If it remains outcast, we add it to the outcast list, otherwise, we consider it invalid.

There should be a maximal time X that we wait (e.g. X=2D_{max}). Every outcast block B should get an outcast score OS(B)\in (0,1) and the time we wait is X\cdot OS(B).

The point is that the only blocks for which we wait a long time are blocks that are on the verge of being invalid. If it is because they point just above the threshold, then after waiting they will point below the threshold and be invalidated. If it is because they point at a block just above the threshold, then after the wait they are pointing at an invalid block and will hence be invalidated themselves. This prohibits a climbing attack.

Note that this is not a complete algorithm, but a way to augment an existing algorithm (relying on an access to a stopwatch, which is what makes it outside the paradigm we considered so far).
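A minimal sketch of the augmentation, assuming a hypothetical outcast-score function OS(B) in (0, 1) and a hook that re-evaluates the outcast condition after the wait; the stopwatch is the only ingredient outside the previous paradigm:

```python
import time

def settle_outcast(B, outcast_score, still_outcast, X):
    """Wait X * OS(B) (at most X, e.g. X = 2 * D_max), then re-check B:
    per the idea above, a block that was on the verge of invalidity will
    have slipped below the threshold by then and is invalidated;
    otherwise it is added to the outcast list."""
    time.sleep(X * outcast_score(B))
    return "outcast" if still_outcast(B) else "invalid"
```

Only borderline blocks (OS(B) close to 1) incur anything near the full wait X, so well-behaved traffic is essentially unaffected.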

@hashdag please note this reply was edited to contain more than the gibberish it held when you hearted it


About 1: Though this is a desirable property, I’d argue this is not necessarily a requirement.
Lack of such a property hurts in the same place finality hurts: under certain conditions, consensus cannot be overridden even if new facts are introduced post-factum. This is an unfortunate result, but deemed necessary.

Now, the reason we introduced outcasts was to protect ourselves from netsplit attacks. Property 2 provides this guarantee.
To offset the way property 1 was hurt, we can be more lax with the size of the finality window.

Condition 1. is an absolute necessity.
Assume a portion of the nodes validate a block B whereas the rest outcast it. If 1. does not hold, then all of the former nodes will mine over B, so all of the latter nodes will not point at blocks created by the former nodes. This will soon make all of the blocks in the weaker of the two halves perpetually red.

Assume even that we are using a fixed-window policy, where the outcast set is cleared upon reaching a window boundary. This still implies that the attacker can make a portion of the network (which could be as large as 49%) not participate, at the cost of mining a single block.

I don’t know how you suggest to scale the finality window, but a setting where a single block can, say, exclude 20% of the network for 5 minutes is, in my opinion, unacceptable.

By the way, if we allow such things it becomes easy to “fix” Yoni’s approach. To de-outcast a block you need to create a blue block which has the outcast block in its past, and which is not only blue, but whose selected chain goes through the block at depth, say, 50 in the selected chain of the virtual (or any blue block in its anticone).

What killed that proposal is exactly that all blocks created by the weaker side of the split, until the outcast block reaches a depth of more than 50, will become red. So again, a single block invalidates several minutes of honest work.

OK, I think I see your point.

The attack described by @Deshe is a valid one. Perhaps it can be solved by adding a fourth rule that explicitly disallows such a succession of blocks as described in the attack?

That is, add a fourth condition for the invalidity of a block B:

(iv) There exists a sequence of blocks \mathcal{C}=\left(X_0,X_1,X_2,\dots\right), where X_0=B, \forall i: X_i\in X_{i-1}.blues, and |\mathcal{C}\setminus B.blues| >L.

Informally: There exists too long a path of blocks, outside B's own blue set, where each block in the path is in the blue set of the previous one.

Of course, even if it solves this attack, we need to make sure it doesn’t break consensus in a different way.

Another way to enforce (iv) is to simply and more generally disallow merging too many blocks, that is:

(iv*) |B.selectedParent.anticone \cap B.past|>L.

This rule is stronger than (iv) and implies it: if the anticone of the selected parent of B is limited to L blocks, then in particular a sequence (X_i) as described in (iv) cannot have more than L blocks outside B.selectedParent's past.

Remarks

(1) There are also practical data-structure reasons to impose a limit such as (iv*), to ease the maintenance of the DAG.

(2) In principle, the introduction of such a rule might jeopardize the network’s ability to merge after a split, as now when two chains C_1 and C_2 are competing, a node that has become aware of both (that is, after the split has ended) cannot create a block that points at the tips of both chains if the weaker one (C_2, say) is longer than L. However, note that it can still merge L (or L-k) blocks from C_2 at a time, and so practically all the honest miners will keep developing C_1 as their main chain while slowly merging L blocks at a time from C_2.

For convenience, I’ll write below the entire proposed set of prunability rules, including the top three that are unchanged:

Updated formal prunability rules

Let G be a DAG observed by this node, and let B be a new block (B\notin G). This node considers B to be invalid if one of the following conditions holds:

(i) \exists C \in B.parents such that C \notin Blues(G(B)) AND C.past\cap X.anticone\neq\emptyset, where X is the latest block in the chain of G(B) with X.blueScore<B.blueScore-M, and where G(B) is the subDAG B.past\cup\left\{B\right\}.

Informally: B merges a block that is outside B's latest finality point, and the tip of the merging path is not blue (in B's view).

This condition depends only on B and not on G (it is thus structural).

(ii) B\notin Z.future, where Z is the latest block in the chain of G(X) with Z.blueScore<X.blueScore-M, and where X is the latest block in the chain of G with X.blueScore<G.virtual.selectedParent.blueScore-M.

Informally: B is below the pruning point defined by G.

This condition depends on G (it is thus view-dependent).

(iii) \exists B'\in B.past such that B' is invalid. This condition depends on G because rule (ii) depends on G (it is thus view-dependent).

(iv*) |B.selectedParent.anticone\cap B.past|>L.

After discussing with @Deshe and @msutton and @elichai2, here’s a much simpler way to present stuff:

Updated updated formal prunability rules

Let B be a block. B is considered invalid if one of the following conditions holds:

(*) \exists C \in B.parents such that C \notin Blues(G(B)) AND C.past\cap X.anticone\neq\emptyset, where X is the latest block in the chain of G(B) with X.blueScore<B.blueScore-M, and where G(B) is the subDAG B.past\cup\left\{B\right\}.

Informally: B merges a block that is outside B's latest tentative-pruning point, and the tip of the merging path is not blue (in B's view).

(**) |B.selectedParent.anticone\cap B.past|>L.

Claim: Let Z be the earliest block in B.past with B.blueScore-Z.blueScore>N, for N=M+L+k. If B is a valid block, then B.past\cap Z.anticone=\emptyset.

Corollary: Let B and Z be as in the claim. If B is finalized in the chain of the DAG G held by this node, this node can safely prune all blocks (data and headers) that are in the past of Z, and reject any new block (i.e., any future block not in the current G) that is not in the future of Z.

Edit: As explicated by @Deshe below, a valid block must have only valid blocks in its past. I omitted it as it is not unique to pruning, it applies to any validity rule in a blockDAG.

Before I delve into attempting to prove the claim, a few points:

  • First, I just want to explicate the implicit assumption that a block is invalid if it has an invalid parent.
  • Regarding rule (**), one should note that it is equivalent to the more readily calculable inequality B.pastSize - B.selectedParent.pastSize > L .
  • L could be thought of as controlling the convergence rate of the network following a split. The main chain could only include L blocks at a time.
  • As @msutton pointed out in the discussion, it makes sense to choose k^2 < L < \left(\frac{M}{k} -k\right). The lower bound is to allow the network to seamlessly merge from scenarios where an anticone of size \le k^2 is accumulated, the upper bound is to protect from a climbing attack.
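As a sanity check of the pastSize observation in the second bullet, here is a toy mini-DAG (hypothetical names). Note that, depending on whether pastSize counts the block itself, the two sides of the identity differ by the constant 1, which does not affect its role as a threshold:

```python
# Toy mini-DAG: b merges the side block s through its second parent.
parents = {"g": set(), "a": {"g"}, "s": {"g"}, "b": {"a", "s"}}

def past(x):
    """All ancestors of x (excluding x itself)."""
    out, stack = set(), list(parents[x])
    while stack:
        y = stack.pop()
        if y not in out:
            out.add(y)
            stack.extend(parents[y])
    return out

def anticone(x, blocks):
    """Blocks unrelated to x: neither ancestors nor descendants of x."""
    return {y for y in blocks if y != x
            and x not in past(y) and y not in past(x)}

B, SP = "b", "a"                        # suppose a is b's selected parent
lhs = len(anticone(SP, set(parents)) & past(B))
rhs = len(past(B)) - len(past(SP)) - 1  # the -1 accounts for SP itself
```

Both sides equal 1 here (the single merged block s), which is what makes the (**) check computable from simple pastSize counters without materializing the anticone.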