The Hand-off Problem
Practical Limitations on Forced Inclusion Mechanisms for Censorship Resistance
Authors’ Note: init4 is a research collective working on next-generation Ethereum tools. This is a research note, not a disclosure document. While we will discuss nuances and gaps in the security models of deployed and proposed systems, it would be hyperbolic to describe these as “vulnerabilities” or as “previously unknown”.
Censorship resistance remains a core value of cryptocurrencies in general, and Ethereum in particular. We believe that the benefits of transacting on-chain should be available to anyone, and that the rules of the chain should apply to all users and uses equally. Values drive this space forward. Engineering is the process of testing our values against reality to find where and how they break.
Defining Censorship
We tend to model censorship as intentionally preventing transactions from appearing in the canonical ordering (i.e. transaction exclusion). We consider orders to be fair or neutral when they depend entirely on the economic outcome for the ordering system, and unfair or censored when they depend on other information. E.g. when creating a block, it is acceptable to refuse to include a low-fee transaction, but unacceptable to refuse to include a transaction because it was sent by a specific person. Therefore, we say a transaction is censored if its inclusion depends on non-economic information. If the transaction creates more observable profit for the ordering system than any included transaction, but is not itself included, it is considered to be censored. This definition motivates work on forced inclusion mechanisms for censorship resistance. If a user can force the inclusion of their transaction, then it cannot be censored under this definition.
Forced Inclusion Mechanisms
A core security goal of OP Stack chains is that the Sequencer should not be able to prevent users from submitting transactions to the L2 chain.
- Optimism
Modern rollups tend to have centralized sequencing. This means that censorship by the sequencer is trivial, as they can simply choose not to include any given user transaction. To mitigate this, several rollups - including Optimism and Arbitrum - have forced inclusion mechanisms. These mechanisms allow a user to ensure that their transaction will be executed by the rollup after some time delay, regardless of the sequencer’s behavior. Inclusion is forced via a contract deployed on L1. Forced inclusion transactions therefore (theoretically) have the same resistance to censorship as other Ethereum transactions.
A forced inclusion mechanism has also been proposed for Ethereum via EIP-7547. Inclusion lists would allow block proposers to partially specify the contents of the next block. Under the assumption that block proposers have fewer incentives to censor than block builders, this would provide an effective mitigation to censorship.
Generally speaking, forced inclusion mechanisms create new constraints on valid orderings. They make broad classes of orderings invalid according to protocol rules[1]. Forced inclusion should be thought of as allowing the user to specify a subset of the future ordering. All valid orderings must expand upon the forced subset.
Expanding Our Model of Censorship
Unfortunately, transaction confirmation is the means, not the end. Our model of censorship is incomplete!
Censorship must be defined in terms of goals. Users want to send tokens, buy NFTs, borrow funds, etc. The transaction’s confirmation is incidental to that goal[2]. Censors, similarly, have specific goals. That goal may be to prevent some hack transaction from succeeding, to comply with some law, or to interfere with a competitor’s business. Respecting the intent of both parties, we need to redefine censorship.
With this in mind, we should expand our definition of censorship to say: a transaction is censored if a third party can prevent it from achieving its goal. Which is to say, the censor does not need to prevent the transaction from confirming; they only need to make smart contract execution revert. Making an EVM transaction revert is censorship of that transaction, despite that transaction being part of the canonical chain. Threat models that are blind to the contents of a transaction do not accurately model censorship, and therefore cannot effectively protect users from it.
Semi-formally, we would say that for any given interaction with the chain, there is some scoring predicate f that evaluates the resulting ordering, and produces 0 (goal failed) or 1 (goal achieved). In this model, the censor’s scoring function is simply the negation: f’ = !f. The censor achieves their goal when the user does not.[3]
While the user’s goal may be hidden, the transaction itself almost always contains enough information to infer it. Uniswap trades have obvious goals. In addition, because blockchains are deterministic, the censor can perfectly predict which orderings will satisfy the censor’s predicate. As a result, users cannot rely on hidden information to protect them from censorship in the EVM model. Transaction details must be public, which means that information about the user’s goal is public.
For simplicity, let’s assume that we’re working in the standard run-ahead sequencer model[4]. We will deal with forced inclusion under sequencer rotation later. In this model, the sequencer has total control over the sequence, and can perfectly simulate any outcome. Which is to say, they are free to choose from the set of all possible orderings. Our semi-formal question about censorship then becomes “does there exist some valid ordering over which f evaluates to 0?” If such an ordering exists, then the sequencer can select it.
From there, we can expand our model to account for forced inclusion. “Does there exist some valid ordering, which includes the user’s sub-ordering, for which f evaluates to 0?” If such an ordering exists, the sequencer can select it. Forced inclusion does not prevent the sequencer from exercising control over the ordering; it only constrains their behavior. Unfortunately, forced inclusion has fundamental issues that prevent it from being an effective censorship resistance mechanism for many transactions.
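The existence question above can be made concrete with a brute-force search. The following is a minimal sketch under toy assumptions: state is a plain dictionary, transactions are functions that mutate it, and every name here (`execute`, `find_censoring_ordering`, `occupy`, `release`, the `slot` field) is hypothetical and chosen only for illustration. It does not describe how any real sequencer selects orderings.

```python
from itertools import permutations
from typing import Callable, Dict, Iterable, List, Optional, Tuple

State = Dict[str, int]
Tx = Callable[[State], None]          # a transaction mutates state in place
Predicate = Callable[[State], bool]   # the user's scoring predicate f


def execute(ordering: Iterable[Tx], initial: State) -> State:
    """Deterministically apply an ordering to a copy of the initial state."""
    state = dict(initial)
    for tx in ordering:
        tx(state)
    return state


def find_censoring_ordering(
    forced: Tx,
    sequencer_txs: List[Tx],
    initial: State,
    f: Predicate,
) -> Optional[Tuple[Tx, ...]]:
    """Return an ordering that includes `forced` and on which f is False.

    Tries every subset and permutation of the sequencer's own transactions,
    with the forced transaction at every position. Toy inputs only.
    """
    for k in range(len(sequencer_txs) + 1):
        for chosen in permutations(sequencer_txs, k):
            for pos in range(k + 1):
                ordering = chosen[:pos] + (forced,) + chosen[pos:]
                if not f(execute(ordering, initial)):
                    return ordering
    return None


# Hypothetical scenario: the user's transaction only succeeds if a shared
# "slot" is free; otherwise it is included but has no effect (a "revert").
def user_tx(state: State) -> None:
    if state["slot"] == 0:
        state["user_done"] = 1


def occupy(state: State) -> None:
    state["slot"] = 1


def release(state: State) -> None:
    state["slot"] = 0


initial = {"slot": 0, "user_done": 0}
f = lambda s: s["user_done"] == 1

censoring = find_censoring_ordering(user_tx, [occupy, release], initial, f)
```

In this toy case the search finds an ordering in which the forced transaction is included and still fails: the sequencer writes the shared slot immediately before it. If the user's transaction read no shared state, the same search would return `None`.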
The Hand-off Problem
Forced inclusion mechanisms mean ordering happens in one of two modes: unforced or forced. There is a defined point at which it transitions from unforced to forced (and vice versa). That point is the hand-off. The hand-off poses a thorny issue for the design of forced inclusion mechanisms.
The forced transaction executes on the state at the hand-off. So once again, state contention[5] rears its ugly head. When the hand-off occurs within a batch of transactions (like a block), the creator of the batch can exercise control over the state at the hand-off. If the forced inclusion transaction reads any public state, then the creator of the batch can rewrite that state before and after the forced transaction execution. Contention is sufficient for censorship.
Because the batch creator can exercise control over the state at the hand-off, it can affect the outcome of the forced transaction. If it can affect the outcome, it can potentially affect the result of the scoring predicate. For example, consider a simple AMM interaction. The user sets a minimum acceptable price. The batch creator, however, can ensure that the market price at the hand-off is below that minimum. This causes the user transaction to revert, effectively censoring the user.[6][7]
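The AMM example can be sketched numerically. This is an illustration only, assuming a fee-less constant-product pool with integer rounding; the `Pool` class, the reserve sizes, and the trade amounts are all hypothetical and are not taken from any deployed AMM.

```python
class Revert(Exception):
    """Stands in for an EVM revert."""


class Pool:
    """Toy constant-product pool (x * y = k) with no fees."""

    def __init__(self, x: int, y: int) -> None:
        self.reserves = {"X": x, "Y": y}

    def swap(self, token_in: str, amount_in: int, min_out: int = 0) -> int:
        token_out = "Y" if token_in == "X" else "X"
        r_in, r_out = self.reserves[token_in], self.reserves[token_out]
        # Constant-product output, rounded down in the pool's favor.
        amount_out = r_out * amount_in // (r_in + amount_in)
        if amount_out < min_out:
            raise Revert("price below the user's minimum")
        self.reserves[token_in] = r_in + amount_in
        self.reserves[token_out] = r_out - amount_out
        return amount_out


def user_swap(pool: Pool) -> bool:
    """The forced transaction: sell 10 X, accept no fewer than 9 Y."""
    try:
        pool.swap("X", 10, min_out=9)
        return True
    except Revert:
        return False


# Without interference the user's goal is achieved.
honest_ok = user_swap(Pool(1000, 1000))

# With a batch creator controlling the state at the hand-off:
pool = Pool(1000, 1000)
got_y = pool.swap("X", 100)       # prefix: push the price against the user
censored_ok = user_swap(pool)     # forced transaction is included, but reverts
got_x = pool.swap("Y", got_y)     # postfix: unwind the position
```

In this fee-less toy the batch creator's only cost is rounding loss on the round trip. A real pool charges swap fees on both legs, so the cost would be higher, but it is still paid by a party that chooses exactly where the hand-off falls.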
Interestingly, censorship via state contention is more effective than censorship via exclusion. An excluded transaction can be included in future blocks. A transaction censored via contention is permanently invalidated. It has been included in the chain, and can never be included again. That transaction can never achieve the user’s goal. To try again, the user must recreate and resubmit the transaction (which can then be potentially censored again).
In practical systems
[A]ny user can bypass the Sequencer entirely to submit any Arbitrum transaction (including one that, say, initiates an L2 to L1 message to withdraw funds) directly from layer 1. Thus [sic] mechanism thereby preserves censorship resistance even if the Sequencer is being completely unresponsive or even malicious.
- Arbitrum
In the run-ahead sequencing model, the sequencer has near-perfect control over the location of the hand-off in the sequence, and pays reduced fees (as they need not tip and can exercise some control over the EIP-1559 basefee). As a result, the sequencer is in a privileged position to use state contention to censor user actions. It is trivial. The hand-off problem ensures that the sequencer can censor large classes of transactions.
For EIP-7547, builders choose where in the block hand-offs occur.[8] This means that they can select a prefix and a postfix at will,[9] as long as they respect block gas rules. The prefix can put the chain into a state on which the transaction will revert, while the postfix will restore the chain to a normal state. EIP-7547 inclusion lists are not sufficient to prevent censorship for any transaction accessing contentious state. The hand-off problem prevents ILs from ensuring transaction execution in most cases.
Forced inclusion is ineffective at protecting users from censorship for most non-trivial uses of the blockchain. The hand-off problem ensures that the censor has sufficient discretion over state even if they don’t have sufficient discretion over order. This problem affects AMMs, lending markets, auctions, and most other DeFi actions. Many important actions are censorable even if you can guarantee transaction inclusion. State contention creates hard limits on the effectiveness of forced inclusion as a censorship-resistance mechanism.
Case study
To see the far-reaching effects of this, consider a user lending USDC in a lending market on Optimism. When the user wants to withdraw USDC from the market, they submit an Optimism transaction, which the sequencer censors. They then use the official forced inclusion mechanism to queue their transaction on Ethereum, bypassing the sequencer.
The sequencer can see that transaction in the queue, and can choose to sandwich it. In order to censor the transaction, the sequencer borrows all available USDC from the market immediately before the forced transaction. Because the market no longer has liquidity, the forced transaction reverts. The sequencer can then repay the USDC immediately.
This requires the sequencer to have collateral sufficient to borrow the USDC, but it imposes only an extremely small borrow cost.[10] Furthermore, the collateral is reusable for all censorship, as the borrow is never held open. As a result, a user of Aave or Compound on Optimism (or Arbitrum or any other centrally-sequenced rollup) has no guarantee that they will ever be able to withdraw collateral. The sequencer can censor any withdrawal from any lending market at any time. Forced inclusion is simply not sufficient to protect users against censorship.
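The case study can be walked through with a toy model. This sketch assumes a deliberately simplified market that tracks only deposits, idle liquidity, and debt; collateral checks and interest are omitted because the borrow is opened and closed around a single transaction. `LendingMarket`, the account names, and the amounts are hypothetical and do not reflect the actual Aave or Compound contracts.

```python
from typing import Dict


class Revert(Exception):
    """Stands in for an EVM revert."""


class LendingMarket:
    """Toy USDC market: deposits, idle liquidity, and per-borrower debt."""

    def __init__(self, deposits: Dict[str, int]) -> None:
        self.deposits = dict(deposits)
        self.liquidity = sum(deposits.values())
        self.debt: Dict[str, int] = {}

    def borrow(self, who: str, amount: int) -> None:
        # Collateral checks are omitted (assumption: the borrower has enough).
        if amount > self.liquidity:
            raise Revert("insufficient liquidity")
        self.liquidity -= amount
        self.debt[who] = self.debt.get(who, 0) + amount

    def repay(self, who: str, amount: int) -> None:
        if amount > self.debt.get(who, 0):
            raise Revert("repaying more than owed")
        self.debt[who] -= amount
        self.liquidity += amount

    def withdraw(self, who: str, amount: int) -> None:
        if amount > self.deposits.get(who, 0):
            raise Revert("withdrawing more than deposited")
        if amount > self.liquidity:
            raise Revert("insufficient liquidity")
        self.deposits[who] -= amount
        self.liquidity -= amount


def forced_withdrawal(market: LendingMarket) -> bool:
    """The user's forced transaction: withdraw their full 50,000 deposit."""
    try:
        market.withdraw("user", 50_000)
        return True
    except Revert:
        return False


market = LendingMarket({"user": 50_000, "others": 950_000})

market.borrow("sequencer", market.liquidity)         # prefix: drain liquidity
forced_ok = forced_withdrawal(market)                # included, but reverts
market.repay("sequencer", market.debt["sequencer"])  # postfix: restore state
```

After the three steps the market looks untouched: liquidity is fully restored, the sequencer owes nothing, and the user's deposit is still locked. The same collateral can be reused against the next forced withdrawal.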
Followup Work
We have a few areas of followup research.
First, EIP-7547 can be trivially improved by requiring that IL transactions be processed at the end of the next block. In the PBS auction context, censorship is MEV. The builder derives some non-economic value, to which they must assign a subjective value denominated in ETH. Censorship by the builder therefore causes an increase in the builder’s block bid.[11] This extends to searchers, who may make censoring bundles. Some of the economic value of censorship is then captured by the proposer, providing an incentive to tolerate censorship even when not participating in it directly. Placing forced inclusion transactions at the end of the block removes the block builder’s ability to trivially sandwich the IL transactions, and increases the economic cost of contentious censorship. E.g. censoring an AMM interaction via state contention could require giving up some AMM arbitrage, or paying a high cost to push the market out of range that cannot be recouped by closing the sandwich. In addition, this would restrict censorship bundles produced by searchers (rather than builders) to one per block.[12] We would otherwise recommend top-of-block execution, as the prefix is more significant than the postfix. However, that would drastically increase the cost of an IL transaction, as it would allow top-of-block MEV extraction via forced inclusion.[13] Removing the censor’s right to atomically postfix the IL transactions is a small improvement.
Second, the hand-off problem exists because the censor can look ahead via transaction simulation and exercise control over the input state. Many MEV-resistance mechanisms introduce hidden information to remove the censor’s ability to derive information about users’ goals and to simulate outcomes. Typically these are commit-reveal schemes, where some transaction information is private until after ordering occurs. Ordering-execution separation and hidden information seem promising, but are largely incompatible with the MEV supply chain, Ethereum consensus processes, and the sequenced rollup model. Some way to negate the ability to simulate transactions would address censorship and large classes of MEV, but would be extremely invasive to the protocol, the operators of the protocol, applications, and the end user.
Third, there is an interesting class of “order-independent” scoring functions. These are goals that cannot be censored by state contention, either because they don’t access contentious state, or because the contentious state they do access has sufficient constraints to make it “reliable” in some sense. Order-independent actions include sending ETH to an EOA, most ERC-20 sends,[14] and some DeFi interactions like adding collateral to a market. These actions are protected from censorship via contention. This class of goals also has interesting correspondences in safe cross-chain communication and MEV resistance and is worth more in-depth study. Applications and protocols may be designed to include only order-independent actions in some cases, but more study is needed.
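One way to build intuition for order-independence is a bounded check: run the user's action after every prefix a third party could construct from a fixed menu of actions, and see whether the goal is always met. The sketch below assumes a toy state and a hypothetical set of third-party actions; `survives_all_prefixes` and every other name is illustrative. Passing this check is evidence only for the prefixes enumerated, not a proof of order-independence.

```python
import copy
from itertools import permutations
from typing import Callable, Dict, List

State = Dict[str, Dict[str, int]]
ThirdPartyTx = Callable[[State], None]
UserTx = Callable[[State], bool]   # returns True when the user's goal is met


def survives_all_prefixes(
    user_tx: UserTx,
    third_party_txs: List[ThirdPartyTx],
    initial: State,
    max_len: int = 2,
) -> bool:
    """True if user_tx succeeds after every enumerated third-party prefix."""
    for k in range(max_len + 1):
        for prefix in permutations(third_party_txs, k):
            state = copy.deepcopy(initial)
            for tx in prefix:
                tx(state)
            if not user_tx(state):
                return False
    return True


# Third-party actions: they can only spend their own balance, but they can
# freely touch shared market liquidity.
def censor_pays_user(state: State) -> None:
    state["balances"]["censor"] -= 5
    state["balances"]["user"] += 5


def censor_drains_market(state: State) -> None:
    state["balances"]["censor"] += state["market"]["liquidity"]
    state["market"]["liquidity"] = 0


# User actions.
def erc20_send(state: State) -> bool:
    """Send 10 tokens to bob; depends only on the user's own balance."""
    if state["balances"]["user"] < 10:
        return False
    state["balances"]["user"] -= 10
    state["balances"]["bob"] += 10
    return True


def market_withdraw(state: State) -> bool:
    """Withdraw 50 from the market; depends on shared liquidity."""
    if state["market"]["liquidity"] < 50:
        return False
    state["market"]["liquidity"] -= 50
    state["balances"]["user"] += 50
    return True


initial: State = {
    "balances": {"user": 100, "censor": 1000, "bob": 0},
    "market": {"liquidity": 500},
}
third_party = [censor_pays_user, censor_drains_market]

send_is_safe = survives_all_prefixes(erc20_send, third_party, initial)
withdraw_is_safe = survives_all_prefixes(market_withdraw, third_party, initial)
```

The plain send survives every prefix because no third party can reduce the user's balance, while the withdrawal fails as soon as a prefix drains the shared liquidity. Only prefixes are modeled here, since the revert is decided by the state the forced transaction executes on.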
Conclusion
Rich state allows malicious actors to censor transactions while still including them. The hand-off problem is fundamental to forced inclusion mechanisms, and can only be mitigated. In centrally sequenced rollups, no mitigation is possible. Forced inclusion cannot address censorship in the presence of contentious state. Large classes of economically important transactions can be censored, even if included by force. The hand-off problem is endemic in modern rollups, and present in Ethereum’s censorship resistance EIPs. As a result, forced inclusion, while beneficial, is never sufficient to provide censorship resistance for rich-state chains. Rollups do not “inherit” Ethereum’s security properties and it’s silly to suggest that they do. When you stop obsessing over inclusion, it becomes obvious that censorship resistance is a special case of MEV resistance.
We’d like to thank Mike Neuder, Tarun Chitra, and Brandon Curtis for review and feedback.
[1] As is typical, for L1s this is accomplished by rejecting invalid blocks, while in rollups it is accomplished by coercing invalid sequences to valid sequences via some filtration function.
[2] This is not a post about intents; the world does not need more of those at this point.
[3] This is obviously an incomplete model, as it doesn’t take into account the subjective values of the outcomes. E.g. the censor may stand to lose any amount of money if censorship fails (e.g. because they could get arrested by French police if they fail to censor certain behavior). On the other hand, the user could stand to gain/lose any amount of money if their goal is not achieved on a specific timeframe (e.g. they have taken $100mm+ in loans against their own token and need to re-collateralize the position before they get liquidated).
[4] As opposed to a “based” sequencer model. In most modern rollups the sequencer “runs ahead” of Ethereum as it provides inclusion and execution attestations for a transaction before the transaction has been committed to Ethereum. In this model the Sequencer has total control over the sequence, and the outcome of the transaction must be independent of Ethereum reorgs.
[5] When multiple users want to access the same contract, asset or state, their transactions “contend” with each other and potentially interfere with each other’s outcomes. Contention may arise coincidentally, or deliberately. This is an intractable problem of rich state in blockchain systems. Public access to shared state is the root of MEV, scalability problems, and the decline of civility in modern society.
[6] Generally, you should think of censorship via state contention as a specialized case of MEV. Because the value extracted is off-chain, non-observable, and potentially very large, it may be difficult to predict when censorship via state contention will occur.
[7] We specifically covered using state contention to revert transactions in our 2017 article “Miners Aren’t Your Friends”. Back then the term “MEV” was not yet in use.
[8] It’s well-known that PBS dramatically complicates censorship resistance. See VB’s research note.
[9] Prefixing and postfixing a transaction is commonly called “sandwiching” and is well-understood as a method of using state contention to extract MEV.
[10] The borrow is held for only a few seconds, if that. Rollup sequencers can in some cases hold timestamps or block boundaries to make the effective borrow time 0.
[11] The builder will be willing to pay up to their subjective value of censorship, potentially pushing the bid above the objectively observable extractable value of the block. In extreme cases this can result in instances where the censor has a negative ETH balance change (i.e. they pay more ETH to produce the block than they receive in fees and rewards).
[12] Note that this relies on MEV auction rules preventing interleaving transactions from different bundles and not allowing “must revert” transactions. If those rules were relaxed to allow bundle txns to be interleaved, and/or if builders started to support “must revert” blocks in bundles, the protection would evaporate. This dynamic arises because if IL transactions must be at end of block, no non-forced transactions may be interleaved, and therefore at most one searcher censorship bundle could occur.
[13] Effectively allowing the builder to create limited inter-block bundles. Pre-consensus systems like FOCIL could mitigate this.
[14] For a standard ERC-20 token, a transfer call is usually not censorable via contention, as third parties cannot decrease the user’s balance. However, consider a transferFrom call. If the approved transferor is a contract that allows contention on its own state, then the action may be censorable via that contention (consuming the approval required for the transferFrom in some unintended way).