Well, that paper was chiefly a broad discussion of possible techniques. In Appendix B, “Efficient SPV proofs”, there is a concrete suggestion, although it is immediately qualified with the sentence “A detailed analysis of this problem and its possible solutions is out of scope for this document”. So, according to the authors, the 2014 paper does not actually advance any particular sidechain proposal for review.
Nonetheless, I will compare drivechain to their suggestion.
First, the sidechain challenge is to create an SPV proof for side-to-main transfers. In other words, we must define the conditions under which coins locked up by main-to-side payments can be unlocked again on the mainchain.
Their suggestion involves two things: (1) add a second Merkle tree (to both chains) such that each block commits to all earlier blocks, and (2) insert a big chunk of stuff into the bitcoin txn withdrawing the funds. This “big chunk of stuff” is the txn that was included in the sidechain, along with the sidechain block header that contains it, as well as many of the sidechain’s other headers. In this way, they literally “SPV prove” the spend. I say “literally” because this idea strongly resembles the way an actual SPV wallet works: it downloads many headers, checks those headers for valid work, and finally checks the given txn to see if it was included.
They then use some statistical logic to greatly reduce the number of headers needed, in general. This is a very cool trick, but since each header is ~80 bytes (or ~160 if merged mined), and since each must be selected somehow, the proof size ends up “in the tens of kilobytes range”. The midpoint of this phrase, 50 KB, is about 100 times the size of the average Bitcoin txn, which (everything included) is about 0.5 KB. So their proof requires a txn to be much larger.
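To make the “literal SPV” idea concrete, here is a toy sketch of the Merkle-inclusion half of such a check (function names are my own, and this is not the paper’s actual proof format — the real proof must also validate the work in the header chain itself):

```python
import hashlib

def dsha(b: bytes) -> bytes:
    """Bitcoin's standard double SHA-256."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def verify_inclusion(txid: bytes, branch: list, sibling_on_left: list,
                     root: bytes) -> bool:
    """Hash up a Merkle branch from the txn to the claimed block root.

    `branch` holds the sibling hashes at each level; `sibling_on_left[i]`
    is True when that sibling sits to the left of our running hash.
    """
    h = txid
    for sibling, on_left in zip(branch, sibling_on_left):
        h = dsha(sibling + h) if on_left else dsha(h + sibling)
    return h == root
```

An SPV wallet does essentially this: given a txid and a short branch of sibling hashes, it recomputes the root and compares it to the Merkle root in a block header it has already checked for valid work.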
What we do in drivechain is actually very similar, although it is simpler and better. We do use each Bitcoin block to commit to many earlier blocks…after a fashion. To do it, we keep a running cumulative total of the ‘ACK count’ for a txn. In this way, we get a proof that is much smaller, easier to compute, robust to changes in difficulty, and more consistent in size (it does not fluctuate as a function of stochastic block hashes). Both proposals require a new Merkle tree, and both must contain the txn’s inputs and outputs. But instead of block headers (tens of KB), Drivechain requires only that the TxID itself be included in a block (32 bytes), plus an out-of-block message which might be as small as zero bytes per sidechain per block, or, worst case, 6 bytes per sidechain per block. And, worst case, this message would be required in 100% of blocks for 3 months’ worth of blocks, for a total size of 78,840 bytes (to prove 13,140 blocks). So even in a highly unrealistic worst-case scenario, drivechain’s proof totals 78,879 bytes, which is also in the tens-of-kilobytes range. Best case, it may total just 39 bytes. And the outcome is not random – the worst case is only possible when miners run amok for no benefit, at a cost to themselves. So it is reasonable to expect the very best case.
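The running ‘ACK count’ can be sketched as below. This is a hypothetical illustration only (names and exact vote rules are invented; the real consensus rules are Bip300’s) — it shows the shape of the mechanism, and reproduces the worst-case size arithmetic from the text:

```python
# Hypothetical sketch of a running ACK count for one withdrawal attempt.
ACK_THRESHOLD = 13_140   # ~3 months of blocks, per the worst case above
BYTES_PER_VOTE = 6       # worst-case per-sidechain, per-block message size

class WithdrawalTally:
    def __init__(self):
        self.acks = 0    # cumulative count, carried forward block to block

    def apply_block(self, vote: int):
        """Each block may Endorse (+1), Reject (-1), or abstain (0)."""
        assert vote in (-1, 0, 1)
        self.acks = max(0, self.acks + vote)

    def approved(self) -> bool:
        return self.acks >= ACK_THRESHOLD

# Worst-case proof size from the text: a 6-byte vote in 100% of blocks.
worst_case_bytes = BYTES_PER_VOTE * ACK_THRESHOLD  # 78,840 bytes
```

Because the mainchain only tracks this single integer per withdrawal, the proof does not grow with the number of sidechain headers, and its size is deterministic rather than hash-dependent.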
The security is also vastly superior in drivechain. Both approaches fail, and permit miner-theft, if 51%-attacked. However, drivechain holds up better under attack than the skiplist. This is mainly due to drivechain’s slow, transparent withdrawal process. In drivechain, withdrawal attempts are delayed substantially before they are ACKed, and the ACKs themselves are slow. So it is impossible to conduct a surprise attack, impossible to harass the community writ large with many withdrawal attempts (ie, the “mosquito strategy”), and very easy for users to understand that the attack is happening, long before it actually succeeds. Also, each attempt is very simple, one question of ‘Endorse’/’Reject’ – quite comparable to the March 2013 anti-consensus event. Except that the March 2013 event was a sudden surprise: we did not know about it until after it had happened. Skiplist-sidechains would also work in this ‘surprising’ way, but in drivechain there are no surprises, because we are given many weeks’ worth of clear warning that the anti-consensus event will happen. Finally, a slow, transparent process means that it is impossible for miners to attack the chain and claim that they didn’t know they were doing so – with transparency, everyone knows: observers as well as the miners themselves. So it is a clearer demonstration of malice.
Both approaches attempt to solve the problem of extensibility – “extending” the capabilities of Bitcoin beyond those which currently exist.
This extensibility problem is a difficult one to solve, because of Bitcoin’s unique emphasis on “consensus” – that all users agree on the state of the blockchain. Since all users must agree, and agreement isn’t free, there is also an implicit agreement on a “minimum required effort” or “minimum tolerable workload”. Bitcoin plays by certain rules, and if those rules are to be meaningful they must be enforced as-written.
So we have a situation where (1) the rules (including “required effort” rules) must be enforced, but simultaneously one where (2) users might like to experiment with new rules. In short, we want the benefits of a “hard fork” (and of permissionless innovation) without paying the costs (which are a loss of consensus, or non-enforcement of important rules).
The trick is to try to solve both problems at once. A ‘hard fork’ solves only problem (2), and ‘doing nothing’ solves only problem (1). Extension blocks make some progress toward solving problem (2), at the expense of tremendous sacrifice on (1). This is because users of the non-extended original chain are subject to a potential barrage of messages. These messages can be sent at any time, by anyone (including an attacker), and could take on any properties (large in size, difficult to process, slow to validate)…most important of all, invalid messages can be sent for free. In this scenario, the cost of maintaining consensus over “Original” is in great danger (according to some) of rising to “Original” + “Extension” anyway. This means that the extension block is effectively a hard fork, and we have failed to solve challenge (1).
Instead, Drivechain condenses the from-extension-to-original messages into infrequent, easy-to-validate, unambiguous, chain-scale messages. It essentially flips the consensus threat on its head by arguing that the sidechain should do all of the consensus labor, and should then present a tiny, minimal, easy-to-verify proof of that labor to the mainchain at infrequent intervals. (In the sense of being “difficult to generate but easy to verify”, it resembles proof-of-work itself.) This allows us to solve problem (2) without compromising on (1).
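The “difficult to generate but easy to verify” asymmetry is the same one proof-of-work exhibits, and can be shown in a few lines (a toy example with invented parameters, not Bitcoin’s actual difficulty rules):

```python
import hashlib

def verify_pow(data: bytes, nonce: int, difficulty_bits: int = 16) -> bool:
    """Verification costs one hash: check that the top bits are zero."""
    h = hashlib.sha256(data + nonce.to_bytes(8, "little")).digest()
    return int.from_bytes(h, "big") >> (256 - difficulty_bits) == 0

def mine(data: bytes, difficulty_bits: int = 16) -> int:
    """Generation costs ~2^difficulty_bits hashes on average."""
    nonce = 0
    while not verify_pow(data, nonce, difficulty_bits):
        nonce += 1
    return nonce
```

Generating a valid nonce takes tens of thousands of hash attempts here; checking one takes a single hash. Drivechain aims for the same shape: the sidechain does the expensive consensus labor, and the mainchain performs only the cheap check.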
This is why Adam Back in particular emphasizes the “slow return” feature of Drivechain, whenever possible (recall that Dr. Back was a major innovator and promoter of extension blocks in early 2014).
Again, to repeat the answer for extension blocks (above), the distinction between hard and soft forks isn’t the point.
The point is, instead, the burden placed on existing users. While an extension block does allow ‘oldtype nodes’ to ignore the extension data, it does this at a cost of no longer being able to fully-validate the block. It is a ‘backdoor hardfork’, of a kind, because users need to upgrade.
Imagine five different scenarios:
Keep in mind that, in order to use Bitcoin as money, every user must check every txn for double-spending. Therefore, if we narrowly assess each of these scenarios in terms of “the burden they place on existing users”, we get the following:
In my view, SHoM is too similar to an extension block. And it therefore lacks drivechain’s most important features.
I tweeted my thoughts on this article. I am happy that the authors worked on this, but I do not think that I can use it for anything.
Rollups pack a list of txns into a smaller amount of L1 space. Thus, they are a perfectly legitimate L2.
They have several drawbacks when compared to drivechain.
First: the benefits of rollups are much lower.
Rollup’s increase in onboarding capacity is capped. See an example of capped-ness here:
"The onchain transactions needed to open and settle (and occasionally rebalance) self-custodial Lightning channels take up a measureable amount of limited bitcoin block space. This block space footprint results in a hard upper limit on the number of self-custodial users who can be onboarded to Lightning in a given period of time. The additional transaction capacity enabled by validity rollups could be used to support more Lightning transactions ... For 2-P2WPKH-input-1-P2WSH-output-2-P2WPKH-output dual-funded channels, rollups can create room for up to 3.8x more Lightning channel open transactions."
In contrast, in Drivechain the onboarding growth factor is not limited to 3.8 – instead it is unlimited.
If rollups use an account model (vs utxo), their growth factor may be 10x or 100x more (ie, it may be 38x or 380x). But I have yet to see anyone describe, design, or code this.
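The “capped-ness” is just division: the multiplier is the ratio of the two L1 footprints. The vbyte figures below are placeholders chosen only to reproduce the quoted 3.8x factor — they are not the report’s actual measurements:

```python
def onboarding_multiplier(l1_vbytes: float, rollup_vbytes: float) -> float:
    """How many more channel-opens fit in the same block space when each
    open costs `rollup_vbytes` of L1 space instead of `l1_vbytes`."""
    return l1_vbytes / rollup_vbytes

# Placeholder numbers (NOT from the report), chosen to show the shape of
# the calculation: a 380-vbyte L1 open vs a 100-vbyte rollup open.
factor = onboarding_multiplier(380, 100)  # 3.8
```

The key point: as long as each open costs more than zero L1 vbytes, the multiplier stays finite. An onboarding mechanism with no per-user L1 footprint has no such cap.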
Furthermore, rollups do not have as much flexibility as sidechains. (Sidechains have unlimited flexibility – everything in a rollup must in principle be writeable to L1, whereas sidechains are the reverse: everything experienced on the sidechain must be in principle ignorable on L1.)
Second, rollups require a big change to L1: L1 must validate zk-snarks. Bip300 is just an integer that counts from 1 to 13,150, which is something that anyone can understand and audit. Zk-stuff is rightly called “spooky moon math”, and most experts are (or were) confounded by it (see here and here). The average person has zero chance of ever grasping the difference between a zk-proof system that is merely pretending to work and one that is genuinely working. You might say: so much the worse for the average person! Rightly so, but “most L1 node runners” also have zero chance of understanding or auditing these systems. Nor does the economic center of gravity of the Bitcoin system. In contrast, things like hash functions and signatures are simple operations that a user can perform for themselves, many times – thus they can learn the basics and “audit” their computer.
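The “perform it for themselves” point about hash functions is literal: anyone can run the function on a published input and compare the answer against the published test vector, with no special expertise. For example, using the well-known FIPS 180-2 test vector for SHA-256:

```python
import hashlib

# Run the hash function on a known input...
digest = hashlib.sha256(b"abc").hexdigest()

# ...and compare against the published FIPS 180-2 test vector.
expected = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
assert digest == expected  # no "spooky moon math" required
```

No comparable spot-check exists for the soundness of a zk-proof system: whether the circuit, trusted setup, and verifier actually enforce what they claim cannot be probed by running a few inputs through them.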
Third, rollups do nothing to solve the “data availability problem”. Drivechain does not solve it either… but Drivechain is at least designed with this DA failure mode in mind. To marginally address DA, Drivechain rewards L1 miners with txn fees (via merged mining); and rewards L2 users (via useful services). Rollups are often presented as though they are impervious to failure. But really: DA is where the rubber meets the road, and rollups do nothing about this big problem.
Fourth, despite the above limitations, the main “advantage” that rollups have over DC is very, very small. The advantage is the supposed benefit that “51% miners cannot steal from” rollups. Firstly, this comparison is weak, because in DC an actual theft requires 6 months of open, easily-demonstrated misbehavior. So DC theft is enormously impractical – like robbing Fort Knox in slow motion. Secondly, in the rollup case, if evil miners are determined to steal (from rollups), then they can also spend six months doing something comparable: refusing to allow the L1 zk-snark message into the L1 blockchain. This holds the rollup funds hostage – miners can refuse to allow rollup-withdrawals unless desperate users sell their coins to the miners for pennies on the dollar. If miners start this on Jan 1, it is likely that many users will have given up by July 1. So the main advantage rollups have over DC is not significant.
Fifth, the “advantage” in point four is (yet again) just a misunderstanding of the DC “miners can steal” problem. “Miners can steal” is not a bug, it is a feature (for DC). See the long presentation on “Sidechain Privatization”, if you want to be one of the very few people who understand why. Not that it matters much in this case, since rollups are also not flexible or general purpose enough to cause too much inter-chain damage.