Skip to content

Dreams of Randomness and Constant Time Complexity

Posted on:22 July 2024 at 16:45

Table of Contents

Open Table of Contents

Intro

It all starts with a simple question:

"Why isn't there a smart contract to which you can just send Ether and have a chance of getting it back?"

This is a very innocent question that actually leads to complex questions on randomness in deterministic environments and achieving (near) constant-time complexity. In this post, I break down how we can achieve both and at what trade-offs. I also suggest a smart contract that achieves exactly what was asked for.

Randomness in the Ethereum Virtual Machine

The EVM, like all digital computers, is deterministic. This is bad whenever we need to access randomness. That’s also why some companies maintain walls of lava lamps – to generate randomness.

True randomness, or entropy for our functions, must therefore come from outside of our deterministic system. One such method is using an oracle to bring this lava-data onchain. For example, Chainlink’s VRF (verifiable random function) and Pyth’s IEntropyConsumer interface can be used to access randomness. This randomness comes with a cost, in the case of both Chainlink and Pyth. Chainlink’s payments are more complex as they require holding a $LINK subscription balance. Pyth is easier to access as we just pass some wei in the msg.value.

However, there is another way to access randomness as well. And it’s free. And it doesn’t introduce a new centralisation vector to our program.

But it comes at a cost: time.

Ethereum has implemented a new opcode, PREVRANDAO since the Paris upgrade, better known as the merge. It was formerly DIFFICULTY but with the transition into PoS, PREVRANDAO is now derived from the Beacon Chain’s (post-Merge Ethereum’s consensus layer) RANDAO process. In a nutshell, validators commit random values which are later revealed to produce a final random value. This random value can be easily accessed via Solidity’s block.prevrandao property.

Now, all historical and current data on the blockchain is public. This makes PREVRANDAO susceptible to exploitation by validators/proposers who have access to the historical/current data. They could choose to not validate blocks where the they are disadvantaged, if there’s money on the line for correctly predicting the PREVRANDAO. Therefore, we must use a future block. Hence the time cost.

EIP-4399, which originally proposed replacing DIFFICULTY with PREVRANDAO at Merge, suggests using a “lookahead” period to ensure validators cannot exploit their influence power. This lookahead period is determined as the sum of slots (blocks) until the end of the epoch is reached, plus some lookahead epochs (at least 4 suggested), and a few slots into the future epoch. As there are 32 slots per epoch, 132 slot lookahead period should be sufficient, at least for low value applications. More on randomness manipulation here.

The third-option would be to create our own oracle by generating random numbers offchain and then updating a smart contract with them every now and then. That, however, has the same drawbacks as any other oracle (except for having to pay for it, at least onchain) plus the huge centralisation risk that we ourselves bring. Much worse trust assumptions than a third-party oracle.

Therefore, the only real options for accessing randomness on Ethereum are:

Importance of Time Complexity on Blockchains

Now that we have figured out a few ways to access randomness with our application, let’s think about how we can determine a winner under the original question.

If we have access to instant randomness, i.e., each transaction can be assigned its own random value, we could simply look at, for example, the transaction hash and compare its and the randomness’ modulo to some factor we set. This factor would determine the likelihood of winning. Stated otherwise: uint256(txhash) % uint256(randomness) < λ : winner : loser where λ is the probability factor.

In a situation where we do not have instant randomness, we will need to accumulate transactions into an array and search for a winner from that array. Or actually a mapping, because mappings are much more gas efficient than arrays. Searching over a list has always a worse time complexity than choosing the winner at runtime. The latter runs at O(1). That is very important as searching over large arrays gets super expensive, super quickly.

Take linear search for example. It has the average time complexity of O(n)/2. Assuming we have 150 entries in our list, we need to, on average, search over 75 entries. This costs a whopping 75x more than an O(1) complexity search. When a search costs real and non-trivial amounts of money, this becomes a large factor to consider.

Luckily we also have other search algorithms. Jump search, for example, has a time complexity of O(sqrt(n)). For our 150 entry list this would be 12x more expensive than a constant time complexity search. Another option is binary search with time complexity of O(log2(n)) – 8 iterations for our 150 entry list. Interpolation search would be even more efficient at O(log2(log2(n))) – only 3 iterations for our list.

So there’s plenty from which to choose but when programming for blockchain we need to consider three important factors:

  1. How large is our data set, on average?
  2. How difficult (=expensive & risky) is it to implement the search alorigthm?
  3. How expensive is the average search case?

To illustrate these trade-offs, consider three data sets where n=10, n=150, and n=3000.

Assuming we can implement a linear search algorithm where a single iteration costs 10 gas, a binary search at 40 gas, and an interpolation search at 110 gas, which is the most efficient method for each?

n  10  150  3000
Linear 100 1500 30000
Binary 160 320 480
Interpolation 220 330 440
Table 1: Hypothetical average gas cost of a search in various data sets

Turns out that each search algorithm performs differently for each data set. Therefore, when you are seriously choosing which search algorithm to use, you would need to make some assumptions about the program and its users.

The additional component to consider is risk. Implementing a linear search is trivial. There are well-tested libraries for binary. But I couldn’t find one interpolation. Are you willing to create your own implementation? What’s the cost of auditing? What if the implementation is still bad?

The Tradeoffs

Now we have a basic understanding of the core components of our program. We need to access randomness and we need to – potentially – iterate over a list of entrants to find a winner.

The three major tradeoffs, or hurdles, are:

We have the most decentralised and immutable system when we are not using an oracle. This, however, comes at the expense of time-to-result and increased cost of participation due to the necessary storage of data and use of a search algorithm instead of a simple if-then expression.

If we want to get results quickly, we must have instant randomness. This comes at the cost of using a trusted oracle and, in certain cases, a maximum of one participation per block.

If we want to make the cost of participation low, we must again trust an oracle. If we are seaching from a list of entries, we could also choose to vary the size of the reward, instead of the probability of winning, giving each transaction an equal chance to win a variable price. This would be cheaper to implement.

In the below example implementation we require maximum decentralisation at the expense of time-to-result and a higher cost of participation. We also opt-in for a fixed reward instead of a variable payoff as the latter would make the system unattractive for anyone who’s getting rid of fewer ETH than the average user. A variable payoff system also discriminates against high-rollers as it we would need to implement a cap on the reward size to protect the system from getting fully drained.

While the cost of participation could’ve been brought down by allowing for a variable payoff, simplifying our search algorithm, I have opted-in for a fixed reward.

Example Implementation

This example implementation assumes ETH price of €6k, average transaction size of €20, and complete gas price insensitivity. We have come to these assumptions from researching the type of users we expect to have. This implementation is built so that people could get rid off low amounts of ETH they have in one of their tens or hundreds of wallets, without connecting them to each other. In other words, these are “meaningless” funds where the cost (time required) of recouping them is higher than the cost of just removing the wallet.

The example solution provides another alternative to removing the wallet and losing a few euros. Instead, we allow the wallet owners an easy method to “drain” their wallet by the means of a simple transaction into a smart contract, with a chance to get a meaningful amount of ETH. Given that EV(do nothing)=0, EV(recoup)<0, and cost of using our program is zero, EV(use our app)>0.

Our primary requirements are:

  1. Maximise decentralisation and build an immutable program
  2. Make the program extremely easy to use (no signatures or contract calls)
  3. Get results pretty quickly (within an hour)
  4. Minimum reward must be meaningful for our target audience

From these requirements we have the blueprint ready:

  1. Use PREVRANDAO for randomness
  2. Utilise receive() external payable in our smart contract instead of a more complex interaction
  3. Set lookahead period to 132 blocks, resulting in the worst-case transaction resolving in ~1h (assuming unconstrained demand for our program)
  4. Require a minimum reward of 0.5 ETH

Now that we have the requirements and the blueprint ready, we can start the implementation.

Firstly, we will need to accumulate transactions in two distinct periods as we will need to wait for the lookahead period to pass before rewarding one set of transactions. Therefore, we will have different lists for storing the transactions; one that is “active” to which transactions are accumulated, and one that is “inactive”, waiting for the lookahead period to finish. We’ll track which list we use with the boolean useA which is true if we are accumulating transactions into A and B is waiting for lookahead period to finish.

The data that we need to determine and reward a winner are the address of the winner and the amount of ETH they sent to determine their chance relative to others. For the sake of making effective use of a binary search algorithm, we will store the ETH amount sent in a cumulative format in a struct as follows:

struct Transaction {
  address sender;
  uint256 cumulativeWeight;
}

These transactions will be stored in two mappings (a more gas efficient way of storing lists than an array) as follows:

mapping(uint256 => Transaction) private txA;
mapping(uint256 => Transaction) private txB;

And we will shift between using one list and another by inspecting the current block number against the sum of the block number and lookahead period in which we last changed the list we are using. Therefore, we will include the following in our receive() function, which is the key – and only – function of this program.

if (useA) {
  uint256 newCumulativeWeight = cumWA + msg.value;
  txA[txCountA] = Transaction({
    sender: msg.sender,
    cumulativeWeight: newCumulativeWeight
  });
  cumWA = newCumulativeWeight;
  ++txCountA;
} else {
  // same for B list
}

if (block.number >= targetSlot) {
  // process transactions from the list that is waiting
  // set a new target slot
}
// else do nothing

Processing the transactions means a) ensuring we have the minimum reward available and, if not, moving the target slot forward in time; b) choosing the winner; and c) paying the winner.

We ensure we have the minimum reward by:

if (last cumulative weight recorded < 0.5 ETH) {
  rewardSkip = true;
  return;
}

Where the rewardSkip variable triggers skipping the winner choosing process (and subsequently payout) while moving the targetSlot blocknumber forward.

We choose the winner by:

uint256 winningWeight = block.prevrandao % totalWeight;
if (useA) {
  return binarySearch_8Er(txB, txCountB, winningWeight);
} else {
  return binarySearch_8Er(txA, txCountA, winningWeight);
}

Where the totalWeight is the total cumulative weight of transactions in the list we are processing. The binary search is simply implemented by:

uint256 low;
uint256 high = count - 1;
while (low < high) {
  uint256 mid = (high & low) + (high ^ low) / 2; // Mid strictly less than high
  if (transactions[mid].cumulativeWeight <= target) {
    low = mid + 1;
  } else {
    high = mid;
  }
}
return transactions[low];

Finally, paying the winner is simply:

if (winnerTx.sender != address(this)) {
  (bool sent, ) = winnerTx.sender.call{value: rewardAmount}("");
  require(sent, "FR");
}

I packaged these core pieces into a slightly gas-optimised smart contract that utilises a few uint128 to reduce the amount of gas we are using in addition to optimised function names and minimised variable visibility where possible. That smart contract is available in this repo as an example implementation.

Without forgetting to write a few tests, a gas tracking tool, and a reward simulator, we have a ready to use immutable on-chain programme that works by simply sending ETH into it without additional signatures, smart contract calls, or keeper bots. Furthermore, we got the gas costs down to sub-100k for ordinary transactions and to typically <150k for the transaction that chooses and pays the winner. That’s less than a Uniswap v3 swap.

All in all, this experiment showed how complex a seemingly simple problem was to solve. My first reaction when somebody asked for this was that of course it exists. But I didn’t find one. And now I probably know why. Randomness is hard onchain and gas-efficient search algorithms are still rare.

Edit: somebody implemented this program onchain, available at randomreward.eth