Welcome! This is part two of our deep dive into the technologies powering blockchains and crypto. If you haven’t had a chance to read part one, I highly recommend you do so before diving into this post! Several components of this article will assume an understanding of topics discussed last week, including public key cryptography, hash functions, and more.
If you want to keep up with the rest of our series, hit that subscribe button!
Blockchain 101
Last week, we discussed cryptography and learned about some of the core mechanisms that secure blockchains. This week, we’ll discuss the implementation of those mechanisms, how blockchains are structured, and some of the key concepts that arise from this structure. We’ve got a lot to cover, so let’s dive in!
So, what is a blockchain?
Put simply, blockchains are databases that keep a permanent record of transactions. Importantly, these databases don’t require any central authority to maintain or secure them. Why is this important?
Imagine you’re an Instagram influencer that makes $100,000 a year off of your following. To run your business, you rely on a central authority (Facebook) to maintain your account and ensure you can reach your following. As a result, you have to recognize this authority also has the power to disable your account, delete it, flag your posts, or do any number of anti-competitive things that could have an adverse effect on your influencer business (see Twitter censorship over the past several years). This hasn’t caused any major issues yet, but it isn’t necessarily ideal — especially if we’re trying to create an open financial system. Imagine if you could just be “cut out” of using a credit card or withdrawling from an ATM…
As a result, blockchains aim to operate as “trustless” transactional systems, allowing users to make peer-to-peer transactions without needing to trust a central authority or each other. How is this possible? To understand, we’ll have to dive into a couple of concepts:
Blockchain Data Storage Structure
Byzantine Fault Tolerance
Be warned – we will get a little technical in this post. If you’d prefer to keep things high-level, I highly recommend this article. On the other hand, if you’re looking for some more detail, read on.
Blockchain Data Storage
From a high level, we can think of blockchains as a database storing a list of transactions. Within this database, every transaction is linked to the transaction immediately before it through a cryptographic hash. We discussed hashes at length last week, but as a reminder, hashes are mathematical functions that convert an input of arbitrary size into an encrypted, standardized output. More importantly, hashes are deterministic, meaning given any input, the function will always spit out the same output, and any adjustment to that input will result in a completely different output. This characteristic is key to transaction permanence, and we’ll come back to it in a second.
First, I want to clarify that while it’s easy to think of blockchains as a simple “list” of transactions, that isn’t entirely true. Instead, transactions within a blockchain are stored in something called a block.
Blocks are data structures that store a few key items — namely, a batch of transactions, a timestamp from the block’s creation, and the previous block's hash (as well as other, more technical items). Because the creation of each block requires the previous block’s hash, the blocks form a linear “chain” of blocks, hence the name “blockchain.” To better understand, let’s use an example.
Let’s say a block can hold 100 transactions, and the network has processed 1,000 transactions to date. To process these 1,000 transactions, the network would first create a special “genesis block,” holding the first 100 transactions, a timestamp for the block, and the block’s hash (made up of a combination of all of these pieces of data). Once the genesis block’s hash is determined, the network would start creating a second block, which holds another 100 transactions, a timestamp for the block, and the genesis block's hash. Following this, the network would create a third block using the hash from the second, and so on until ten blocks existed. Furthermore, once this chain existed, the transactions within the first nine blocks would become “permanent.” Why? Because hash functions are incredibly sensitive and changing even one transaction in a block completely alters its hash value. For example, if block 5 had a transaction adjustment, its hash value would completely change, causing an adjustment in block 6’s hash and a cascade of changes down the rest of the chain.
To make a long story short, the use of hash functions within blocks and the Avalanche effect that occurs when a transaction is adjusted allows blockchains to achieve transaction finality within a specific chain.
Unfortunately, this only solves one piece of the puzzle.
Imagine we have a dishonest actor who decides to adjust a transaction and create a new chain. Now imagine we have 100 of these actors, all adjusting transactions at different moments in time, creating hundreds of potential branches of the original chain. How can we determine which chain is the “real” chain? For decades, this has been a topic of debate through something known as the Byzantine General’s Problem.
The Byzantine General’s Problem
The Byzantine General’s Problem is a game theory concept that describes a group of isolated actors and their difficulty in agreeing on the solution to a problem, given a potentially incorrect dataset. The problem has two main stipulations:
Each actor holds a critical piece of information that must be communicated to the other actors for the group to arrive at the correct conclusion.
Each actor is siloed from the others, and while they can communicate, it is impossible to be 100% confident that any given communication is valid.
Great… but what does that mean? Don’t worry — we’re going to dive into an example.
Imagine a situation in which four generals, each commanding separate armies, surround a city. Each general occupies one side of the city, and to take the city, all the generals need to ensure they attack at either dawn, noon, or dusk. The choice of dawn, noon, or dusk does not matter, just that all four generals agree on the same of the three options. In this hypothetical scenario, the generals can ONLY communicate when to attack using on-foot messengers who run between the armies. Furthermore, the enemy is cunning and has proven it can corrupt the messengers and force them to relay an incorrect time. Knowing this is the case, what can the generals do to be certain they will all attack simultaneously? A system that can solve this problem is Byzantine Fault Tolerant (BFT).
Blockchains deal with a version of the Byzantine General’s Problem, but instead of generals and an attack time, blockchains have validating nodes and a chain state (or the “real” chain). For simplicity’s sake, we’re going to call a validating node a computer system that holds a copy of the blockchain, validates transactions, and creates new blocks. 1
At any given point, a blockchain may have 100, 1,000, or 10,000 nodes, all separately attempting to create new blocks. Additionally, when a node successfully creates a new block, it must ensure it appends that block to the “right” chain to receive a reward for its work. Going back to our example, remember we must assume these nodes could act maliciously. Any one of them could adjust a transaction, create a new chain and try and convince other nodes their “fake” chain is, in fact, the “real” chain. With this in mind, we conclude that a secure blockchain must be Byzantine Fault Tolerant or able to tolerate a certain number of dishonest nodes. Bitcoin and its Proof of Work consensus mechanism was the first decentralized system to achieve this. Effectively, it stated:
To create a valid block, a node must find an input (known as a nonce) that results in a hash output specified by the network. 2 If the hash output does not match the network’s specifications, the block is invalid and will not be accepted by other nodes. The only effective method for finding the nonce is a brute-force attack, which requires time and electricity. This process is known as mining.
Furthermore, the mechanism stated that the “real” chain would always be the chain on which the most computing power has been spent (in simple English, the real chain is always the longest chain).
With that in mind, we can finally combine all the pieces to understand how Bitcoin and other blockchains allow for decentralized transaction permanence.
The network stores all transactions in blocks, which are chained together in a time-based fashion using hash functions. These functions make it so that when a transaction in the chain is adjusted, every block following that transaction is also adjusted.
All valid blocks require a specific hash specified by the system. Because adjusting a transaction results in adjusting those hash values, all blocks following an adjustment become invalid and must be recalculated. In other words, the attacker would need to recompute the correct nonce for every block following the one in which a transaction was altered.
Finally, the system always determines the longest chain to be the “real” chain. To become the longest chain after adjusting a transaction, the attacker must first spend time recreating all invalidated blocks. During this time, honest nodes continue adding to the “real” chain at a rate equal to the combined computing power of the entire network. This means to catch up to the “real” chain, the attacker must somehow command more computing power than the rest of the network combined.3
This combination of mechanisms was Bitcoin’s key innovation and has since been copied, implemented, and experimented on by hundreds of blockchains. Several additional consensus mechanisms have popped up since the advent of proof of work, including Proof of Stake and Delegated Proof of Stake. We don’t have time to dive into these this week, but we’ll tackle them in next week’s lesson on blockchain scaling. Some less common consensus mechanisms can be found here.
Worth noting, the combination of features employed in Proof of Work is robust but not full-proof. For example, imagine a situation in which an attacker decides to adjust a transaction in the previously confirmed block. If the attacker got lucky, they might quickly guess the nonce and immediately begin working on the longest chain. On the other hand, if 10, 20, or 50 blocks need to be recalculated, the chances of a lucky break become infinitely small.4 As a result, many market participants – centralized exchanges, OTC deals, etc. – require the creation of anywhere from 3 to 10 blocks before stating a transaction was “valid.”
Now that we fundamentally understand how blockchains operate, we need to dive into the actual process of adding a new transaction to a block. If you decide to send money to your friend, where does that transaction “go”? Who gets to control when or how it is added to a block? What happens if your transaction is never added to the chain?
The Mempool
When a user submits a transaction to a blockchain network, it does not go immediately into a block. Instead, it first enters the Mempool, a sort of “waiting room” for transactions that are not yet validated. When a transaction enters this space, the first node to receive it does a few things:
It verifies the wallet or account that was used to initiate the transaction.
It verifies the transaction's output doesn’t exceed the input (ie. if 5 BTC leaves a wallet, no more than 5 BTC can be delivered to the recipient of that transaction).
Finally, it verifies that the funds have not already been spent in another transaction.
If any of these conditions fail, the transaction is considered invalid and is rejected. Assuming they pass, the node broadcasts the transaction to other nodes, which repeat the same process. Eventually, once a transaction has propagated through enough of the network, it will reach a node that found the correct nonce and has the right to create a new block. When this occurs, the transaction is removed from the mempool and placed in a block.
In summary, the mempool is effectively a buffer zone that a blockchain uses to ensure all transactions are valid and reduce reliance on any single node for transaction inclusion. Additionally, the mempool can provide key insights into a given chain, such as demand for transaction inclusion, the number of active market participants, and potentially profitable opportunities for savvy market participants. Conveniently, this leads us to the last piece of the puzzle – transaction ordering within a block.
Transaction Ordering and Maximal Extracted Value (MEV)
From a base level, transactions in a blockchain are ordered based on the gas fee (or transaction fee) the user is willing to pay. The math here is simple – if one transaction offers a $2 fee to be added into a block, it will be added before a transaction offering a $1 fee. This is how BTC and most store-of-value chains operate. Unfortunately, for application layer chains like ETH, it isn’t that simple.
Imagine you used a blockchain to take out a loan against your ETH when suddenly, you notice your ETH will be liquidated in the upcoming block due to an unhealthy loan. If you can place a transaction paying back your loan, and have it execute before the liquidation transaction, you can effectively cancel the liquidation. This provides a monetary value to you, i.e. not having your collateral liquidated at a discount. As a result, you should be willing to pay a premium on your transaction to ensure the miner structures their block in this manner.
Furthermore, you must also recognize the miner can add transactions into their own block. In this example, the miner has two options. It could either take the premium from you (and save you from liquidation) or insert its own transaction into the block and take advantage of your situation. As a rational market actor, the miner will likely analyze these two options and choose the one which makes them the most money. This game, and the premium the miner can extract from either side of this liquidation, is known as Maximal Extracted Value (MEV).
MEV is a relatively complex topic, but in summary, it can be thought of as a new type of arbitrage, specifically within blockchains. In most blockchain systems, the value generated by MEV is extracted from blockchain users and given to two parties:
Miners: Miners generate value from MEV due to their right to include, exclude, and order transactions within their blocks. Miner MEV profit comes from gas fees and the implementation of specific transactions into their blocks.
Searchers: Searchers generate value from MEV by using bots to monitor the mempool and current blocks. When a bot notices a potentially profitable opportunity, it puts together a batch of transactions and sends them into the mempool with a high gas / transaction fee. Searchers generate MEV profit, calculated as the difference between their transaction revenue and the gas paid to ensure the transaction goes through.
MEV extraction comes in many forms and is commonly called the “invisible tax” of blockchains. One common form of MEV extraction is known as a sandwich attack. We’ll use one final example to understand sandwich attacks:
Imagine you, a savvy market participant, notice a large transaction in the mempool to purchase $10M of ETH from a decentralized exchange (DEX). Due to your savviness, you recognize this transaction will increase the price of ETH by roughly 1.5% within this specific DEX. Therefore, you submit two new transactions to the same DEX to profit from this opportunity: one purchasing ETH and one selling ETH. Utilizing knowledge of current fees and gas limits, you can set up the transactions such that the buy transaction executes just before the $10M purchase and the sale transaction executes just after the $10M purchase. When done correctly, these transactions all occur in the same block, meaning you take no price risk on the trade. As a result, you secure the 1.5% profit caused by the large transaction at the expense of the original user.
There are several other profitable MEV opportunities, but we will not dive into those this week. However, if you’re interested in learning more about MEV, this post and this post are great places to start.
Let’s Review
We covered a lot of complex material today, so let’s review. At its base level, you can think of a blockchain as a growing list of transactions (or a ledger) between “user accounts,” known as wallets. The transactions within this list are stored in a data structure known as a block, which is time-stamped and cryptographically linked to other blocks in chronological order. Creating a block takes time and energy, and adjusting a block forces you to spend that time and energy all over again. Furthermore, to have the network consider said adjustment valid, it has to be a part of the “longest chain,” which becomes exponentially more difficult as new blocks are created. Finally, miners add transactions into their blocks by pulling from the mempool, and they order those transactions based on the most profitable opportunities the ordering presents.5
That’s it for this week! Below, you’ll find a list of reference materials explaining some of the topics we discussed, as well as tangential topics we didn’t discuss. Next week, we’ll dive into blockchain scaling. I hope to see you then!
Reference Materials
51% Attack: Occurs when an individual or group controls more than 50% of a network’s mining or validating power, giving them the ability to manipulate, change, or create invalid transactions. As crypto networks scale, 51% attacks become exponentially harder to execute. This is due to 1. Increased computational power and 2. Increased financial incentives against dishonest actors.
Blockchain: A blockchain is a system of recording information (or a ledger) that makes it incredibly difficult to change, hack, or cheat the system. Blockchains are generally maintained by a network of individual, independent operators (or nodes) who each hold a “copy” of the growing ledger. As any one of these operators may have an incentive to adjust the ledger to their benefit, the ledger is only determined legitimate when a majority of operators agree on its state. As such, blockchains provide a data storage medium that cannot be controlled by any single person or entity. More technically, a blockchain is a growing list of records, called blocks, that are securely linked together using cryptography, and generally maintained by a network of nodes or validators. Each block, at a minimum, contains a cryptographic hash of the previous block and a timestamp.
Block: A Block is a data structure within a blockchain where information is stored and encrypted. Once a Block is created, it will exist forever, as long as the network in which it resides remains operational. Blocks generally contain transaction data, a time stamp, and proof of consensus but can be programmed to store any type of data. Once a blockchain is created, the type of Blocks it produces cannot be altered without performing a hard fork of the chain (definition below).
Block Size: Refers to the amount of data that can be stored within a block. Bitcoin currently has a block size of 1MB (although this has been a controversial topic throughout Bitcoin’s history. See Bitcoin’s Block Wars)
Block Time: a measure of the time it takes to produce one new block in a blockchain.
Block Height: Refers to a specific “location” within a blockchain, measured by how many blocks precede it. The current height of a blockchain indicates 1. The number of blocks within the chain 2. The size of the chain (height * block size) and 3. The blockchain’s time in existence (height * block time). Height can also be used to view the state of a blockchain network at a given point in time.
Byzantine Fault Tolerance: Byzantine Fault Tolerance describes a condition of a system in which individual components of the system may fail without compromising the integrity of the system as a whole. In the context of a blockchain, it refers to a chain’s ability to come to consensus about the validity of a transaction as long as ⅔ of its nodes are acting in an honest manner.
Consensus: A consensus mechanism is a fault-tolerant mechanism used by a network of siloed operators (nodes) to achieve agreement on the validity or state of a transaction or network. Many different consensus mechanisms exist within the blockchain ecosystem including Proof of Work, Proof of Stake, and Delegated Proof of Stake (definitions below). Some less common consensus mechanisms can be found here.
Double Spend Problem: The Double Spend Problem describes the difficulty of ensuring digital currencies are not duplicated or sent to two parties at once. In a situation where this is possible, users can “create” currency out of thin air, effectively nullifying the value of said currency. Historically, currencies have combatted the double spend problem through physical presence (like cash, which obviously cannot be in two places at once) or reliance on trusted third parties, such as banks, to maintain a ledger of transactions. Blockchains solve the double spend problem without either requirement.
Hash: In relation to crypto, hashes can be simply explained as a tool that us to securely and efficiently verify transactions have occurred. Technically speaking, a hash is a mathematical function that converts an input of arbitrary size or length into an encrypted, standardized output. Moreover, hash functions are one-way, meaning they can not be used to reverse engineer the input based on the hashed output. This provides a secure guarantee of past transactions without requiring us to see or store the transactions themselves.
Hash Rate: a measure of the total computational power being used by a cryptocurrency network, or its individual miners. In other words, the total hash rate measures a network’s overall security. Hash rate is measured by the number of hash calculations performed per second, represented in the following notations: KH/s (kilo, thousands), MH/s (mega, millions), GH/s (giga, billions), TH/s (tera, trillions), PH/s (peta, quadrillions), EH/s (exa, quintillions). In a Proof of Work system, the total network hash rate is determined by multiplying the time between mined blocks by the network difficulty at any given time.
Hash rate is also used as a measure of security of PoW systems. The higher a hash rate, the more difficult it is to implement a 51% attack (see below).
Merkle Tree: a Merkle Tree is a data structure used to store and map arbitrary-sized data in an efficient and secure manner. Merkle Trees consist of hashed transactions, which are concatenated in an upwards hierarchy, the top of which is called a “root.”
Once a root is determined, it is placed in a block, combined with other data such as a timestamp, difficulty target, the nonce, and the previous block’s hash. The combination of this data is then run through a hash function which generates the block’s unique hash. Merkle Trees allow nodes to confirm the validity of every transaction within a block, without being forced to download the actual data from said transaction.
Mining: Mining is implemented in Proof of Work consensus models and entails nodes completing complex
Mining revenue can be estimated with the formula P = (X/Y)*(E*D) where
P = daily revenue
X = node hash rate
Y = total network hash rate
E = emissions per block
D = average blocks per day
Node: A node refers to any device running a piece of the core software that makes up a blockchain (known as a client). There are a few different types of nodes, each with different purposes:
Full Node: Validates and stores the entire blockchain, block by block. Because the full node has a record of every transaction that has ever occurred on the chain, it participates in consensus and can be trusted to verify any and every block. Running a full node is energy intensive.
Light Node: A node that views the blockchain and validated “block headers.” Light nodes aren’t required to store the full chain and are less energy intensive.
Miner: A full node that also participates in the creation of new blocks.
State: A snapshot of a blockchain that reflects the “account” balances on the chain at any moment in time. The state must be agreed upon by a majority of nodes for new transactions to be added to the chain.
Sybil Attack: An attack on a computer system in which a user creates a large number of alternate accounts or identities that are used to gain a disproportionately large influence on the system. Sybil attacks are generally most effective in democratic systems.
Turing-Complete: Turing-completeness generally refers to computer languages that, given enough time and processing power, could theoretically perform any computation given. Ethereum has a built-in, turing-complete processing language that allows applications to be built on top of it
This isn’t completely true, see the reference materials for more info on nodes.
The output is not a specific 64-bit number but instead a hash with x number of 0’s in front of it. The difficulty of finding the nonce can be scaled up and down by requiring more or less 0’s at the beginning of the hash.
Satoshi Nakamoto discusses this in section 11 of Bitcoin’s original whitepaper.
Please note, this only describes Proof of Work consensus. Other consensus models, like Proof of Stake, derive security from something other than time and energy.