Ethereum LevelDB Explorer

GitHub repository: contributions are welcome

About this tool

The purpose of this tool is to be an educational resource for people trying to learn about Ethereum's data storage. We assume a good understanding of what Ethereum's data looks like: block headers, accounts, storage slots, etc. If you have ever wondered how Ethereum stores all of this data, this tool is for you.

The text here is meant to help guide you through some common use-cases for exploring Ethereum's low-level data structures. It is recommended to follow along with the examples using the query tool. The best way to understand these data structures is with visual, real-world examples.
The sections are ordered in a specific way and may depend on each other. For example, avoid trying to access storage slots before you understand how to traverse a Patricia trie.

This tool is based on the Geth implementation using LevelDB.
Since the Yellow Paper does not prescribe a concrete database implementation, each client is free to implement the Ethereum database differently. In practice, because the underlying data structures are specified very precisely, most clients implement the database very similarly.
This tool works with the Sepolia testnet and is currently updated until block . To confirm the results you get from this tool, you can cross-check them on Etherscan.
With this tool you can manually explore the LevelDB data by querying specific keys. It's as low-level as possible, since its main purpose is educational.
All the data, both keys and values, is saved in a binary format. This tool accepts and responds with hex-encoded binary data.
The encoding scheme of the results that come from the LevelDB is beyond the scope of this text, but this tool provides several decoders that let you decode the results in different ways. Different values require different decoders — make sure you use the appropriate decoder for each piece of data.

Additional Resources

Very few resources exist on this topic, which is the main reason for building this tool. Still, some resources exist and I encourage you to read them as well. This is not an easy topic, but you can learn it.
Here are some resources I found useful:

To go deeper than that, unfortunately, the only resource is the Geth source code.

LevelDB keys

Ethereum stores several top-level keys with mutable values:

LastBlock (in hex: 4c617374426c6f636b)
LastHeader (in hex: 4c617374486561646572)
LastFast (in hex: 4c61737446617374)
DatabaseVersion (in hex: 446174616261736556657273696f6e)
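
The hex strings above are simply the ASCII bytes of each key name. A minimal sketch of that conversion, assuming Node.js (the helper name keyToHex is ours, not something from Geth):

```javascript
// Convert an ASCII key name into the hex string the query tool expects.
// keyToHex is a hypothetical helper name used only for this illustration.
function keyToHex(name) {
  return Buffer.from(name, 'ascii').toString('hex');
}

const topLevelKeys = ['LastBlock', 'LastHeader', 'LastFast', 'DatabaseVersion'];
for (const name of topLevelKeys) {
  console.log(name, '->', keyToHex(name));
}
```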

Keys relating to parsing blocks are constructed using different prefixes & suffixes. These are:

68 ("h")
48 ("H")
6e ("n")
62 ("b")
74 ("t")
72 ("r")
42 ("B")

The rest of the keys that are stored in the DB are hashes that comprise different Merkle tries. More on that below.

Retrieving Block Data

Retrieving block data requires using the block number and/or hash in order to construct the keys that hold the data that we want.

Example - Retrieving block headers

To find a block hash by its number, you would concatenate the "h" prefix (68), the block number (in hex and padded to 16 digits) and the "n" suffix (6e). Once you find the block hash, you can retrieve its header data.
To find the header data, you would concatenate the "h" prefix (68) with the block number (in hex and padded to 16 digits) and the block hash.

Let's take block number 2,505,997 as an example

In hex, the number is 263d0d. Padded to 16 digits: 0000000000263d0d
To find the block hash, we query for 680000000000263d0d6e
Then we take the result (9d32afbe77c7d105253b4ed7750caf23063352936ce357b89a9dd54c9fa24ab1) and use the "h" prefix along with the block number and the hash to find the block header: 680000000000263d0d9d32afbe77c7d105253b4ed7750caf23063352936ce357b89a9dd54c9fa24ab1
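
The two keys above can be built with a few lines of code. This is a sketch for illustration only; the function names are our own and do not come from Geth:

```javascript
// Encode a block number as 16 hex digits (8 bytes, big-endian).
function encodeBlockNumber(blockNumber) {
  return BigInt(blockNumber).toString(16).padStart(16, '0');
}

// "h" + block number + "n"  ->  key holding the block hash
function blockHashKey(blockNumber) {
  return '68' + encodeBlockNumber(blockNumber) + '6e';
}

// "h" + block number + block hash  ->  key holding the block header
function headerKey(blockNumber, blockHash) {
  return '68' + encodeBlockNumber(blockNumber) + blockHash.replace(/^0x/, '');
}

const hash = '9d32afbe77c7d105253b4ed7750caf23063352936ce357b89a9dd54c9fa24ab1';
console.log(blockHashKey(2505997));
console.log(headerKey(2505997, hash));
```

Querying the first key returns the block hash, which you then feed into the second function.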

We can use the "Block Header" decoder to parse the result into the different fields of the header. Of these fields, notable are:

stateRoot
transactionsRoot
receiptsRoot

These values represent root hashes that let you traverse their respective tries. More on that below.

Traversing Modified Merkle-Patricia Tries in LevelDB

Ethereum uses two kinds of tries to save data in a cryptographically secure yet efficient way:

Modified Patricia Trie
Merkle Trie

Explaining these trie structures or why Ethereum utilizes them is beyond the scope of this tool.

These two tries can be constructed from the key/value pairs in the DB. The keys in the DB are part of the Merkle trie, while the values are part of the Patricia trie.
Generally speaking, if you are trying to retrieve raw data, you need to construct the Patricia trie from the values, but if you are trying to validate the data, you need to construct the Merkle trie from the keys.

Example - Retrieving account data

Let's take block 2,500,039 (2625c7 in hex). We can get the hash and header data as explained above. From the header we can extract the stateRoot, which is:
644ae129f630e6c5c864b2dbd634c50fe479d631ef76ae1e9ceb5220bca949c5

The stateRoot is the key for our root nodes. The value stored on this key in the DB is the root Patricia node, while the key itself is the root Merkle node.
Querying this key will give us an RLP-encoded value that represents one of four kinds of Patricia nodes:

Empty Node (0 items)
Branch Node (17 items)
Leaf Node (2 items; the first item starts with 2 or 3)
Extension Node (2 items; the first item starts with 0 or 1)
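
To make the four node kinds concrete, here is a minimal RLP decoder together with a classifier that applies the rules from the list above. It is a sketch written for this text, not the decoder this tool or Geth uses; the names decodeRlp and classifyNode are our own, and strings are returned as hex.

```javascript
// Decode one RLP item starting at `offset`.
// Returns the item (hex string, or array for lists) and the offset that follows it.
function decodeItem(buf, offset) {
  const prefix = buf[offset];
  if (prefix < 0x80) {
    // A single byte below 0x80 encodes itself.
    return { item: buf.subarray(offset, offset + 1).toString('hex'), next: offset + 1 };
  }
  const isList = prefix >= 0xc0;
  let length = prefix - (isList ? 0xc0 : 0x80);
  let start = offset + 1;
  if (length > 55) {
    // Long form: the prefix tells us how many bytes hold the actual length.
    const lengthOfLength = length - 55;
    length = parseInt(buf.subarray(start, start + lengthOfLength).toString('hex'), 16);
    start += lengthOfLength;
  }
  const end = start + length;
  if (end > buf.length) throw new Error('RLP input is truncated');
  if (!isList) {
    return { item: buf.subarray(start, end).toString('hex'), next: end };
  }
  const items = [];
  let cursor = start;
  while (cursor < end) {
    const decoded = decodeItem(buf, cursor);
    items.push(decoded.item);
    cursor = decoded.next;
  }
  return { item: items, next: end };
}

function decodeRlp(hex) {
  const buf = Buffer.from(hex.replace(/^0x/, ''), 'hex');
  return decodeItem(buf, 0).item;
}

// Classify a decoded node using the item count and the first hex character.
function classifyNode(node) {
  if (!Array.isArray(node) || node.length === 0) return 'empty';
  if (node.length === 17) return 'branch';
  if (node.length === 2) {
    const flag = node[0][0];
    if (flag === '2' || flag === '3') return 'leaf';
    if (flag === '0' || flag === '1') return 'extension';
  }
  throw new Error('not a recognizable trie node');
}

// A made-up leaf (not real chain data): path "3abc", value "01".
console.log(classifyNode(decodeRlp('c4823abc01')));
```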

Querying for our block's stateRoot gives us a branch node with 17 items. Each of the first 16 items corresponds to a hex character from 0 to f; the 17th item is reserved for a value.
In order to find the balance of an account in the State Trie, we need to traverse the Patricia trie following the keccak256 hash of the address we want to query. Let's take the following address as an example:
0x3810d4c7eB88dd66ab9bE39A5F567Cf77fF9f8B7
Its keccak256 value (without the 0x part) is:
acf0daf35759515a3118de4ab5ff63ec27518b94b03d601ac7a1e53b3d6603f8
We need to traverse the Patricia trie for every character in the hash. We start from a which is the first character in our keccak hash.
We take the item at index 10 (which is a in hex) of the root Patricia node, which is:
c4ee4cf0cab88b6932d7380a6e0efdc33c1d4f0ffa05207f7a1450b45a97972a
We then query that key to get the next Patricia node. The next Patricia node is also a branch node, and so we follow it, taking the key from the c place in the Patricia branch node, which is:
51f41878a482a7e1a60e91b8e5c66333d119339dc067363b681ad7f7e6581c39
We keep traversing this way through the next three characters: f, 0 and d.
At this point, we get the next node's key:
8b97f78fa20cfba908a4953654b4fcdc55c94a3df3305548e1e16eb549c19672
When we query this key, we get a Leaf node. We can identify it because if we decode it with RLP, we get 2 items and the value of the first item starts with 3. This type of node is built of two items: the rest of our "path" and the final value of our account.
If we take the first item of this node and remove the 3, we get:
af35759515a3118de4ab5ff63ec27518b94b03d601ac7a1e53b3d6603f8
If this string seems familiar, it's because it's part of the hash that we were searching for:
acf0daf35759515a3118de4ab5ff63ec27518b94b03d601ac7a1e53b3d6603f8
We traversed through a, c, f, 0, d, to get to this node, which contains the rest of our hash and our desired value - 4 items, encoded with RLP, that represent (in order) the account data:

Nonce
Balance
storageRoot
codeHash
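
The path check we did by eye on the leaf node can also be done in code. The sketch below splits the first item of a leaf or extension node into its flag and its remaining path, then confirms that the traversed characters plus the leaf's path give back the full hash. The function name decodeCompactPath is ours.

```javascript
// Split the first item of a leaf/extension node into its type and remaining path.
// Flags: 0 = extension (even path), 1 = extension (odd path),
//        2 = leaf (even path),      3 = leaf (odd path).
function decodeCompactPath(encoded) {
  const flag = parseInt(encoded[0], 16);
  if (Number.isNaN(flag) || flag > 3) throw new Error('invalid path flag');
  const isLeaf = flag >= 2;
  const isOdd = flag % 2 === 1;
  // Odd paths start right after the flag; even paths carry one padding character.
  const nibbles = encoded.slice(isOdd ? 1 : 2);
  return { isLeaf, nibbles };
}

const accountHash = 'acf0daf35759515a3118de4ab5ff63ec27518b94b03d601ac7a1e53b3d6603f8';
const traversed = 'acf0d'; // characters consumed by the five branch nodes
const leafPath = '3af35759515a3118de4ab5ff63ec27518b94b03d601ac7a1e53b3d6603f8';

const { isLeaf, nibbles } = decodeCompactPath(leafPath);
console.log(isLeaf);                              // true
console.log(traversed + nibbles === accountHash); // true
```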

Looking at the balance, we get b1a2bc2ec50000, which converted to decimal becomes 50000000000000000. This balance is in wei, so it's 0.05 ETH. We can confirm on Etherscan that at block height 2,500,039, address 0x3810d4c7eB88dd66ab9bE39A5F567Cf77fF9f8B7 had 0.05 ETH in its balance.
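
The conversion from the hex balance to ETH is easy to reproduce. A small sketch using BigInt, so that large balances do not lose precision (weiHexToEth is a name we made up for this example):

```javascript
// Format a hex-encoded wei amount as a decimal ETH string (1 ETH = 10^18 wei).
function weiHexToEth(hex) {
  const wei = BigInt('0x' + (hex || '0')); // an empty value means a zero balance
  const unit = 10n ** 18n;
  const whole = wei / unit;
  const fraction = (wei % unit).toString().padStart(18, '0').replace(/0+$/, '');
  return fraction ? `${whole}.${fraction}` : `${whole}`;
}

console.log(BigInt('0xb1a2bc2ec50000').toString()); // 50000000000000000
console.log(weiHexToEth('b1a2bc2ec50000'));         // 0.05
```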

The same technique of traversing the state trie can be used for traversing the transactions trie or the receipts trie.
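
Putting the steps together, the whole traversal can be sketched as a short loop. To keep the example self-contained, it runs against a toy in-memory Map of already-decoded nodes with made-up keys, not against real LevelDB data. The lookup function and the toy nodes are our own illustration. A real implementation would also have to RLP-decode each value and handle small child nodes that are embedded directly in their parent instead of being referenced by hash.

```javascript
// Follow `path` (a string of hex characters) from the node stored under `rootKey`.
// `db` is a stand-in for LevelDB: a Map from node key to an already-decoded node.
function lookup(db, rootKey, path) {
  let node = db.get(rootKey);
  let remaining = path;
  while (node !== undefined) {
    if (node.length === 17) {
      // Branch node: pick the child at the index of the next hex character.
      if (remaining === '') return node[16] || null;
      const child = node[parseInt(remaining[0], 16)];
      if (!child) return null;
      remaining = remaining.slice(1);
      node = db.get(child);
    } else if (node.length === 2) {
      const flag = parseInt(node[0][0], 16);
      const nibbles = node[0].slice(flag % 2 === 1 ? 1 : 2);
      if (flag >= 2) {
        // Leaf node: the rest of the path must match exactly.
        return nibbles === remaining ? node[1] : null;
      }
      // Extension node: skip the shared part of the path and continue.
      if (!remaining.startsWith(nibbles)) return null;
      remaining = remaining.slice(nibbles.length);
      node = db.get(node[1]);
    } else {
      return null;
    }
  }
  return null;
}

// Toy trie: root --a--> branch --c--> leaf holding the rest of the path ("f0d").
const emptyBranch = () => Array(17).fill('');
const root = emptyBranch();
root[0xa] = 'key-1';
const second = emptyBranch();
second[0xc] = 'key-2';
const db = new Map([
  ['root', root],
  ['key-1', second],
  ['key-2', ['3f0d', 'account-data']],
  ['ext-root', ['00ac', 'key-2']], // extension node covering "ac"
]);

console.log(lookup(db, 'root', 'acf0d')); // account-data
```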

Contract Storage

In order to understand how contract state is stored in the DB, we need to look at how the EVM handles contract state. This text assumes good understanding of how the EVM works and what opcodes are. Explaining these concepts is out-of-scope.
One more important thing to note: different languages (Solidity, Vyper, etc) compile into different bytecode. The process described here is based on how Solidity implements its compiler, but most other EVM languages mimic the same behavior. Some information about Solidity's scheme is presented in their documentation:
https://docs.soliditylang.org/en/v0.8.17/internals/layout_in_storage.html

To store contract state in the EVM, we use an opcode called SSTORE. The SSTORE opcode has two operands: a slot and a 256-bit value to be stored. The slot is a uint256. When a contract wants to store some data, it invokes the SSTORE opcode with the slot number it wants to write into.
When you compile a Solidity contract, Solidity transforms it into bytecode. When you write to a variable, Solidity translates its position in the code into a slot number. So the first variable goes into slot 0, the second goes into slot 1, etc.

Let's look at a simplified example:

contract Example {
    uint256 a = 123;
}

This contract will compile into something roughly equivalent to sstore(0, 123).
It is important to note that the variable name is meaningless when referencing data inside Ethereum's LevelDB storage. The only thing that matters is its position, from which we can derive its slot.

Now let's get back to LevelDB. So how can we access this data from LevelDB?
When the EVM processes SSTORE, it's actually writing into LevelDB in the background. For every SSTORE operation, the EVM translates the slot into a LevelDB key where it will store the value.
Solidity supports 5 types of data structures:

Fixed-size values (numbers, addresses, short strings)
Fixed-size Arrays
Dynamic Arrays
Mappings
Structs

Each of these data structures is saved using a slightly different scheme. We will go over how some of these are saved using examples, but for a comprehensive guide to all data structures, consult the resources linked at the top.

To retrieve a storage slot from LevelDB, we need to use our account / contract address, retrieve the storageRoot of the account and then derive the slot we want to retrieve from its position in the code and the type of data structure it holds.

Example 1: Read the "name" of an ERC20

Let's look at contract 0x5fb282df60a3264c06b2cb36c74d0fd23d727f82. It's an ERC20 contract that follows the OpenZeppelin implementation. Looking at the code we can see the name variable is fourth, meaning it will be stored in slot 3 (0-based index).
We now retrieve the account details for this contract in the manner described previously. We will use block 2,505,997 as our head. Its block header can be found here:
680000000000263d0d9d32afbe77c7d105253b4ed7750caf23063352936ce357b89a9dd54c9fa24ab1
We take the state root and find the account details as we did previously. The keccak256 of the address is:
c6c986aabcc27ea73df5b336048692ab9cab96645861b869da7b6935a1aa29ab
We traverse the stateRoot trie same as before, until we reach the leaf node for this account:
f12c6be1635c47f9a9aaeef51429e19bf43bcde0fe1ee1894b331dd68e7cab74
From that we extract the storageRoot for the account. This is a Merkle-Patricia root, and we can traverse it like any other Merkle-Patricia Trie:
df20e5cf9e6aef54d16c6123d87957fe1c7c591a82cb03073432ec7375c65648

Now we can find our storage slots. To find slots, we take their index and find that index on the Patricia trie, starting from the storageRoot.
We know the "name" variable is stored in slot 3. Slot numbers are always padded to 32 bytes and then hashed, so we take the padded number:
0000000000000000000000000000000000000000000000000000000000000003
Hash it with keccak256:
c2575a0e9e593c00f959f8c92f12db2869c3395a3b0502d05e2516446f71f85b
And traverse the storageRoot trie to find that key. After traversing c, 2, 5, we get to the leaf node that contains our value:
0b49a92e9302e8d45d0ce6acd86eee8ea4a83fc447bc7f9e629febb197ece43d

We can now see the value stored in slot 3, but it's encoded:
a04255534420546f6b656e00000000000000000000000000000000000000000014
Solidity encodes strings that are 31 bytes or shorter directly in a single slot. We ignore the first byte (a0 in this example): it is the RLP length prefix, announcing that 32 bytes follow. The last byte (14 in this example) encodes the length of the string, stored as twice its length in bytes. 14 in hex is 20 in decimal, so the string is 10 bytes long, which is the 20 hex digits following the a0 byte. Those bytes are our ASCII/UTF-8 encoded string: 4255534420546f6b656e
We can use Javascript to decode it (any other language can also work):

'4255534420546f6b656e'.match(/.{1,2}/g).map(v => String.fromCharCode(parseInt(v, 16))).join('')

Or in Node.js:

Buffer.from('4255534420546f6b656e', 'hex').toString('utf8')

And we get: BUSD Token
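
The manual steps above (drop the first byte, read the length from the last byte, decode the text) can be combined into one helper. This is a sketch that only handles strings of 31 bytes or fewer. For longer strings Solidity marks the slot with an odd last byte and keeps the data in other slots, which this example simply rejects. The name decodeShortString is ours.

```javascript
// Decode a short Solidity string from the raw value found in the leaf node.
// `rawHex` = one RLP prefix byte followed by the 32-byte slot content.
function decodeShortString(rawHex) {
  const slot = rawHex.slice(2);                // drop the RLP prefix byte (a0)
  const marker = parseInt(slot.slice(-2), 16); // last byte holds length * 2
  if (marker % 2 !== 0) throw new Error('long string: data is stored in other slots');
  const byteLength = marker / 2;
  return Buffer.from(slot.slice(0, byteLength * 2), 'hex').toString('utf8');
}

const rawName = 'a04255534420546f6b656e00000000000000000000000000000000000000000014';
console.log(decodeShortString(rawName)); // BUSD Token
```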

Example 2: Get the ERC20 balance of an address

Mappings and other dynamic types are a bit more complicated to retrieve from the storage, because of how Solidity allocates slots for dynamic types.
Every slot in the EVM is 256 bits long. This means that if you want to save more than 256 bits, you need to come up with a scheme that would let you save a single variable in multiple slots. For fixed-size large values, a simple scheme / layout would be to stack the slots. Let's take as an example a fixed-size array:

contract Example {
    uint[2] list;
}

While we are defining a single variable, it will actually take up two slots. list[0] would be located in slot 0 while list[1] will be located in slot 1. Simple enough. Solidity uses something similar (remember: storage schemes / layouts are compiler-specific).
Mappings and other dynamic types are more problematic because they could theoretically grow infinitely, so we can't reserve slots for them. For that, Solidity uses a different layout model. Instead of packing the variables one after the other, Solidity uses a pseudo-random slot number that it generates from the position of the variable and the mapping key.

Let's continue with our BUSD Token contract from above. We know that it keeps its _balances variable in slot 0. But if we search for slot 0, which has the keccak hash:
290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563
We can't find it in the Patricia trie. That's because Solidity doesn't save anything in that slot. Instead, we need to look for the slot of a specific key inside the mapping. Solidity generates a different slot for every key in the mapping. To find the slot where the balance of an address is kept, combine the slot of the mapping and the key (i.e. the address) we are looking for. Let's take this address as an example: 0x8ab7b1954fbe6c39b146bffd2fb1e8c38138fb4d.
What Solidity does is it constructs a key from the address and the slot, using the following formula:

storageSlotNumber = keccak256(abi.encode(mappingKey, variableSlotPosition))

Where:

mappingKey is the key we are searching for. In our example it's the address 8ab7b1954fbe6c39b146bffd2fb1e8c38138fb4d
variableSlotPosition is the position of the mapping variable in the code. In our case, 0
abi.encode() is the standard EVM ABI encoding function, which mostly means we need to pad every parameter to 32 bytes.

So, in order to derive the actual slot number for our data, we take the key we are searching for:
8ab7b1954fbe6c39b146bffd2fb1e8c38138fb4d
Pad it to 32 bytes:
0000000000000000000000008ab7b1954fbe6c39b146bffd2fb1e8c38138fb4d
We then take the slot where the mapping variable is positioned, which is 0 and pad it to 32 bytes:
0000000000000000000000000000000000000000000000000000000000000000
We concatenate these two:

0000000000000000000000008ab7b1954fbe6c39b146bffd2fb1e8c38138fb4d0000000000000000000000000000000000000000000000000000000000000000

And we run them through keccak256, to get:
a194304cfaa67b7f4640d773719472e36ea5de258553109420ff3fb659aa3d1c
Now this is our storage slot number for the balance of address 0x8ab7b1954fbe6c39b146bffd2fb1e8c38138fb4d.
NOTE: This is our slot number. To find our slot key, we hash this number like we did with slot number 3. The result will be: ed98752026e9e727d97d787c433a482f543a72cfc1e944ffc2a72e460ebb2c4a
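
The padding and concatenation steps are easy to script. The sketch below only builds the input for keccak256 and does not hash it, because Node.js has no built-in Keccak-256: its built-in 'sha3-256' is the standardized SHA-3, which produces different digests. For the hash itself you would need a third-party Keccak implementation. The helper names are ours.

```javascript
// Left-pad a hex value to 32 bytes (64 hex digits), which is what abi.encode
// does for addresses and unsigned integers.
function padTo32Bytes(hex) {
  const clean = hex.replace(/^0x/, '').toLowerCase();
  if (clean.length > 64) throw new Error('value is longer than 32 bytes');
  return clean.padStart(64, '0');
}

// Build the keccak256 input for one entry of a mapping:
// padded mapping key followed by the padded slot position of the mapping variable.
function mappingSlotPreimage(mappingKey, variableSlotPosition) {
  const slotHex = BigInt(variableSlotPosition).toString(16);
  return padTo32Bytes(mappingKey) + padTo32Bytes(slotHex);
}

const preimage = mappingSlotPreimage('0x8ab7b1954fbe6c39b146bffd2fb1e8c38138fb4d', 0);
console.log(preimage);
// Hashing `preimage` (as bytes) with keccak256 gives the slot number;
// hashing that slot number once more gives the key to look up in the storage trie.
```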
We can now traverse the storageRoot trie to find the balance for the address. After traversing e and d, we reach a leaf node that contains our value:
891b1ae4d6e2ef500000
Removing the first byte (89, the RLP length prefix, which says that 9 bytes follow) gives us: 1b1ae4d6e2ef500000, which comes out to 500*10^18 in decimal, or 500 "BUSD tokens".
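
The same decoding in code, as a sketch: strip the RLP prefix, read the rest as an integer, and divide by 10^18 (the token's 18 decimals are taken from the calculation above). The name decodeStorageInteger is ours, and the helper only covers values of 32 bytes or fewer, which is all a storage slot can hold.

```javascript
// Strip the RLP length prefix from a storage value and read it as an integer.
function decodeStorageInteger(rawHex) {
  const prefix = parseInt(rawHex.slice(0, 2), 16);
  if (prefix < 0x80) return BigInt('0x' + rawHex); // a single small byte encodes itself
  if (prefix > 0xa0) throw new Error('not a storage value of 32 bytes or fewer');
  const length = prefix - 0x80;
  const payload = rawHex.slice(2);
  if (payload.length !== length * 2) throw new Error('length prefix does not match payload');
  return BigInt('0x' + (payload || '0'));
}

const rawBalance = decodeStorageInteger('891b1ae4d6e2ef500000');
console.log(rawBalance.toString());                // 500000000000000000000
console.log((rawBalance / 10n ** 18n).toString()); // 500
```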

Side note: maybe now you understand why it's impossible to iterate over mappings in Solidity — the keys are not saved anywhere in their raw form, only as hashed values which cannot be reverse-engineered into the original keys. To search for a value of a mapping, you must know the original key you are looking for.

Transactions & receipts

Coming soon

Logs

Coming soon