Data availability guide, Part 4: how Lumio on Solana solves the DA problem

L2 on Solana and the DA problem | PDA addresses and commitment schemes | Data storage costs on Solana | How Lumio L2 ensures DA on Solana

Dec 23, 2024

Lumio L2 on Solana uses accounts at program-derived addresses (PDAs) together with efficient data compression and commitment schemes to ensure robust data availability at the lowest possible cost. In this final part of our DA guide, we’ll delve into these advanced concepts to understand how DA on Solana works.

Intro: the structure of Pontem’s guide to DA

If you haven’t read the first three parts of our epic guide to data availability, we suggest that you do that - otherwise it will be difficult to understand what we are talking about in this blog.

In Part 1, we talked about the meaning of data availability (DA) for various types of blockchain players, including full nodes, end users, and rollup sequencers.

In Part 2, we explored the leading DA solutions in the market, including Celestia, EigenDA, NEAR, and Polygon Avail. Though they can support chains built on different blockchain virtual machines and different settlement layers, these DA layers are mostly aimed at Ethereum rollups.

In Part 3, we looked at the L2 landscape on Solana, including Pontem-built Lumio, ZX by Zeta Markets, Eclipse, and ephemeral rollups by Magic Block. This topic is relevant to our discussion, as any potential L2 on Solana will need to solve the data availability problem.

Finally, in Part 4, we turn to how Lumio L2 solves the DA issue on Solana by using a combination of PDA addresses, commitments, and data compression. We’ll see how much it costs to store rollup data on Solana mainnet and how this amount can be minimized. It’s a pretty technical subject, but we will try to explain everything in the clearest way possible.

Reminder: what is Lumio on Solana?

Lumio on Solana is part of the Lumio L2 federation of rollups, which also includes SuperLumio, an EVM implementation based on Optimism’s OP Stack, and the current Lumio testnet that supports both EVM and Move VM.

Lumio on Solana technically supports three VMs: SVM, EVM, and Move VM. Once we have made sure that the devnet is stable, we will make Lumio on Solana the default testnet for the whole Lumio framework.

You can register to test Lumio on Solana in devnet here.

How Lumio L2 on Solana handles data availability (DA)

As we have discussed in Part 3, the maximum size for a Solana transaction is about 1 KB (1,232 bytes, to be precise), compared to 128 KB per blob on Ethereum with the new blob transactions. 1 KB is hardly enough for a whole batch of rollup transactions, which creates a whole new dimension for the DA problem on Solana: you have to use a different way to send and store rollup transactions to the mainnet.

Luckily, an alternative exists: accounts at program-derived addresses, or PDAs. Each one can store up to 10 MB of data (see the “Intro to PDAs” section of this article). Let’s see how they work.

Intro to PDAs

First of all, we need to distinguish between two types of accounts on Solana:

Keypair accounts: regular accounts that have a private key and a public key;
Accounts based on program-derived addresses (PDAs) - special addresses created by a smart contract (program) that don’t have their own private key.

To create a PDA, a program can use a string of numbers or some other predefined input. The address is derived in a deterministic way based on that input, the program’s ID, and an additional value called a bump seed.
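The derivation described above can be sketched in plain Python. This is a simplified re-implementation for illustration only, assuming the scheme used by Solana’s SDK: SHA-256 over the seeds, a bump byte, the program ID, and a fixed marker string, keeping the first result that is not a valid ed25519 point. The program ID and seeds are made-up values, the SDK’s limits on seed count and length are omitted, and real programs should call the SDK’s own helper instead.

```python
import hashlib

P = 2**255 - 19                                  # ed25519 field prime
D = (-121665 * pow(121666, -1, P)) % P           # ed25519 curve constant d


def is_on_curve(candidate: bytes) -> bool:
    """Return True if 32 bytes decode to a point on the ed25519 curve."""
    y = (int.from_bytes(candidate, "little") & ((1 << 255) - 1)) % P
    u = (y * y - 1) % P
    v = (D * y * y + 1) % P
    x2 = u * pow(v, -1, P) % P                   # x^2 = (y^2 - 1) / (d*y^2 + 1)
    # x exists iff x2 is zero or a quadratic residue (Euler's criterion)
    return x2 == 0 or pow(x2, (P - 1) // 2, P) == 1


def find_program_address(seeds, program_id):
    """Try bump seeds 255, 254, ... until the hash falls off the curve."""
    for bump in range(255, -1, -1):
        digest = hashlib.sha256(
            b"".join(seeds) + bytes([bump]) + program_id + b"ProgramDerivedAddress"
        ).digest()
        if not is_on_curve(digest):              # no private key can exist for it
            return digest, bump
    raise ValueError("no valid bump seed found")


# Hypothetical inputs: a 32-byte program ID and two illustrative seeds.
PROGRAM_ID = bytes(range(32))
address, bump = find_program_address([b"da-blob", b"block-42"], PROGRAM_ID)
print(address.hex(), bump)
```

Because the same seeds and program ID always yield the same address, anyone who knows them can recompute where the data lives without a lookup table.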

The next step is to create an actual account with that PDA. The program is the only entity that can generate valid transaction signatures for the PDAs it created, so it maintains full control of them. In other words, there is no other user or program that could generate a valid signature for a specific PDA.

By the way, another name for a program on Solana is an executable account - because it executes code when it interacts with another account. Meanwhile, a regular user account or a PDA are called non-executable accounts, since they simply store data but don’t run any code.

Main reasons why Solana programs need PDAs

Storing user-related variables and data: addresses, funds, NFTs, or even - as in the case of Lumio on Solana - rollup transaction data. Unlike smart contracts on Ethereum, Solana programs don’t have built-in data storage space, so they have to use PDAs instead. The maximum that a single PDA can store is 10 MB.
Allowing programs to sign transactions. A Solana program doesn’t have a private key. And even if it did, it wouldn’t be able to sign transactions with it, because it would expose the key on-chain and allow anyone to exploit the program and steal all the assets that it controls. A PDA is a workaround, because it has the authority to sign on behalf of the program.
Cross-program invocations. When one program has to call a function in another program, a PDA is often involved. A good example is escrow: user A places tokens in escrow and signs with their private key, after which the program needs to transfer the tokens to User B (the recipient). This requires signing a transaction, which can only be done with a PDA.
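Using PDAs instead of hashmaps. Since a PDA is derived deterministically from its seeds, a program can recompute the address from a known key (for example, a user’s public key) and find the matching account, much like reading a value from a hashmap.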

PDA storage costs

A Solana program can create as many PDAs as it needs - as long as it can pay for them. Data storage costs money, as we all know - and while Google Drive may give you 15 GB of space for free, a decentralized network like Solana cannot. Instead, account owners have to lock up funds in order to use storage space, which is provided by validators. This is known as rent.

The rent amount is expressed in lamports - the smallest units of SOL (1 SOL = 1 billion lamports). The unit is named after Leslie Lamport, a famous computer scientist and pioneer in the study of distributed systems.

When you place enough lamports into a PDA account, it becomes rent-exempt: there is no need to pay regular rent, but the amount needs to remain locked there as long as the PDA is needed. The rent exemption threshold depends on how much data you need to store. When creating a new PDA, a developer has to indicate the expected storage amount and deposit enough lamports to ensure rent exemption.

When the developer closes the PDA, the lamports can be reclaimed. This is done by transferring the whole balance to another account (as there is no way to take out only part of the deposit).

Commitment schemes: the basics of KZG

A rollup like Lumio groups transactions into batches, compresses them, and posts them to the L1 (Solana in our case). But how can full nodes on Solana verify that Lumio’s batches are valid? We don’t want to make nodes decompress and read the whole block, as that would be expensive and a waste of resources. Instead, we need to generate a commitment (a kind of a proof) and attach it to a rollup block.

A commitment scheme in cryptography is a mechanism (or, rather, a primitive) that allows a party (a sender) to commit to a message or statement without having to reveal it straight away. In the commit phase of the scheme, the sender commits to the message; the commitment is binding, meaning that the sender cannot change the statement once the commitment has been published.

The second phase of a commitment scheme is the reveal phase, when the sender discloses the statement and the receiver verifies it. It’s very important that a commitment cannot be opened to two different statements, and that there is no way to reverse-engineer the original statement from the commit message.
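To make the two phases concrete, here is a minimal sketch of a hash-based commitment in Python. It is a much simpler primitive than the polynomial schemes discussed in this guide and is not what Lumio uses; the function names and the sample message are made up for illustration.

```python
import hashlib
import secrets


def commit(message: bytes):
    """Commit phase: publish the digest, keep the nonce and message private."""
    nonce = secrets.token_bytes(32)              # random blinding value
    commitment = hashlib.sha256(nonce + message).digest()
    return commitment, nonce


def verify(commitment: bytes, nonce: bytes, message: bytes) -> bool:
    """Reveal phase: the receiver recomputes the digest and compares."""
    return hashlib.sha256(nonce + message).digest() == commitment


commitment, nonce = commit(b"batch #1: 3 transfers")
print(verify(commitment, nonce, b"batch #1: 3 transfers"))   # original message
print(verify(commitment, nonce, b"batch #1: 4 transfers"))   # altered message
```

The random nonce keeps the digest from leaking the message, and the hash makes it impractical to find a second message that matches the same commitment.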

Commitment schemes differ in terms of how the underlying statement is constructed. One type that has emerged as particularly important in the crypto world is the polynomial commitment scheme: the message that a sender commits to is a polynomial - a mathematical expression that features variables (often raised to non-negative integer powers) and coefficients, for example:

p(x) = 7x³ - 3x² + 2x - 5

In a polynomial commitment scheme, the sender needs to be able to show that for a specific value x the polynomial takes the value y = p(x) without having to reveal the whole expression, that is, its coefficients (in our example, they are 7, -3, 2, and -5). The receiver (also called the verifier) can ask the sender to disclose the polynomial’s value at various points and then check that these samples are consistent with the commitment. In the context of data availability, this mechanism is the basis of DAS (data availability sampling).
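As a small illustration, the example polynomial can be evaluated at a few sample points in Python. This sketch contains no cryptography: a real scheme such as KZG works over a finite field and attaches a proof to each evaluation, which is left out here. The helper name is our own.

```python
def evaluate(coefficients, x):
    """Evaluate a polynomial with Horner's rule; coefficients run from the highest power down."""
    result = 0
    for c in coefficients:
        result = result * x + c
    return result


P_COEFFS = [7, -3, 2, -5]        # p(x) = 7x^3 - 3x^2 + 2x - 5

# Sample the polynomial at a few points, as a verifier might request.
samples = {x: evaluate(P_COEFFS, x) for x in (0, 1, 2)}
print(samples)                   # {0: -5, 1: 1, 2: 43}
```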

What makes polynomial commitment schemes so valuable to rollups is that the size of the commitment message can be very small compared to the size of the polynomial. Thus, given that a polynomial is derived from the rollup’s transaction computations:

the commitment acts as sufficient proof that those computations are valid, and
the commitment message is much cheaper to write to the L1 than the original computations or the polynomials based on them.

The most popular model of polynomial commitment scheme in crypto is called KZG, which stands for Kate-Zaverucha-Goldberg. In particular, it’s used in Ethereum’s proto-danksharding framework.

Here we won’t discuss how rollup transactions are transformed into polynomials or all the steps of the commitment and verification process (check out this excellent article for more info). Instead, let’s see how the KZG commitment scheme is used in Lumio on Solana in combination with PDA accounts to achieve secure data availability.

How Lumio ensures data availability on Solana mainnet

Lumio block data is compressed to form blobs;
A KZG commitment is generated for compressed data;
Lumio’s DA program derives a PDA from the data ID and creates a PDA account for it;
The commitment is written to the PDA;
Lumio sends a series of transactions with chunks of the data (apart from the ones stored in the PDA) for the commitment to be verified on the mainnet;
Once the commitment is proven correct, the data chunk is stored on the L1 with respect to its offset in the compressed blob of data;
Lumio block data and the commitment remain available in the PDA account for the time necessary for an interested party to generate a fraud proof, should they want to challenge a transaction (around 24-48 hours);
Lumio executes instructions to finalize the block, while the DA program on Solana marks the data in the block as available;
Once the fraud proof window is over, the commitment in the first PDA is overwritten with a new one, or the old PDA is deleted and a new one is created, so that the cycle can be repeated for a new block.
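The steps above can be sketched as a toy simulation in Python. Everything here is an assumption made for illustration: a SHA-256 hash stands in for the KZG commitment, so chunks are checked only once, against the whole blob at finalization; the PDA is an in-memory object; and the chunk size, class, and function names are hypothetical rather than taken from Lumio’s DA program.

```python
import hashlib
import zlib

CHUNK_SIZE = 1_000   # assumed payload per transaction, below the 1,232-byte limit


def make_blob(block_data: bytes):
    """Compress block data and commit to it (SHA-256 stands in for KZG here)."""
    blob = zlib.compress(block_data)
    return blob, hashlib.sha256(blob).digest()


def split_into_chunks(blob: bytes):
    """Yield (offset, chunk) pairs, one per simulated transaction."""
    for offset in range(0, len(blob), CHUNK_SIZE):
        yield offset, blob[offset:offset + CHUNK_SIZE]


class SimulatedPda:
    """In-memory stand-in for a PDA account holding a commitment and blob data."""

    def __init__(self, commitment: bytes, blob_len: int):
        self.commitment = commitment
        self.data = bytearray(blob_len)
        self.available = False

    def write_chunk(self, offset: int, chunk: bytes):
        self.data[offset:offset + len(chunk)] = chunk   # store chunk at its offset

    def finalize(self) -> bool:
        """Mark data available only if the stored bytes match the commitment."""
        self.available = hashlib.sha256(bytes(self.data)).digest() == self.commitment
        return self.available


# Made-up block data: short labels plus hashes, so it stays several chunks long.
block = b"".join(b"tx %d: " % i + hashlib.sha256(str(i).encode()).digest()
                 for i in range(200))
blob, commitment = make_blob(block)
pda = SimulatedPda(commitment, len(blob))
for offset, chunk in split_into_chunks(blob):
    pda.write_chunk(offset, chunk)
print(pda.finalize(), zlib.decompress(bytes(pda.data)) == block)
```

Writing each chunk at its offset means the transactions can arrive in any order, and the block is only marked available once the reassembled bytes match the commitment.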

It’s important to stress that the number of PDA accounts used by Lumio is elastic: it will fluctuate depending on the network load and the amount of L1 storage that Lumio requires. This, together with the high level of compression, ensures that the costs of running dApps on Lumio are minimized.

Data from old Lumio blocks is stored on L2 archival nodes, so that data is always available. We are also considering integrating Lumio on Solana with one of the major DA layer providers in the future, just like we are going to do with SuperLumio (for which we have chosen EigenDA as our DA partner).

Lumio DA security

By using the commitment scheme and the PDA account framework, we make sure each of the accounts that we have constructed respects the commitments generated by Lumio L2. A chunk of data won’t be stored in a PDA account unless it is included in the commitment. This means that a Lumio block cannot be finalized on Solana unless all of its data chunks match the commitment.

This is a secure way to ensure data availability, demonstrate the validity of the rollup block data, and prove that the rollup doesn’t withhold any data.

Forecasting Lumio costs: PDA storage and running costs

The costs of running a rollup depend on several factors:

Network load (the number of transactions that the rollup generates in 24 hours);
Compression (how much the rollup manages to compress the data that it writes to the L1);
Duration of storage (for how long the rollup wants the data to remain available on the L1).

Calculating PDA storage fees

How much does it cost to run a rollup like Lumio using PDAs? Let’s first assume that we start with a single PDA (10 MB), though it will take time to fill it to capacity (depending on how much the rollup is used). Solana has a handy function for calculating the rent exemption deposit, and for 10 MB of storage space it returns about 72.98 SOL.
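On a live cluster this figure is normally queried from the network (for example, through the getMinimumBalanceForRentExemption RPC method). The arithmetic behind it can also be reproduced offline, as in this Python sketch, which assumes Solana’s default rent parameters: 3,480 lamports per byte-year, a two-year exemption threshold, and 128 bytes of per-account overhead. These constants are assumptions here and could differ on a given cluster.

```python
LAMPORTS_PER_SOL = 1_000_000_000
LAMPORTS_PER_BYTE_YEAR = 3_480     # assumed default rent rate
EXEMPTION_YEARS = 2                # assumed default exemption threshold
ACCOUNT_OVERHEAD_BYTES = 128       # assumed per-account metadata overhead


def rent_exempt_minimum(data_len: int) -> int:
    """Lamports that must stay in an account of data_len bytes to be rent-exempt."""
    return (ACCOUNT_OVERHEAD_BYTES + data_len) * LAMPORTS_PER_BYTE_YEAR * EXEMPTION_YEARS


TEN_MB = 10 * 1024 * 1024
deposit = rent_exempt_minimum(TEN_MB)
print(deposit / LAMPORTS_PER_SOL)   # about 72.98 SOL
```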

72.98 SOL is around $11,400 at the time of writing. That’s quite a lot of money, but remember that the money is a one-time deposit and that it will be eventually covered through the fees that dApps pay to use the rollup.

How much storage space will Lumio need once it’s scaled to capacity and has a big ecosystem of dApps? To calculate this, we need to evaluate the rollup’s load, or throughput.

On average, Solana processes 2,800 transactions per second, though only 13-15% of that is user-generated transactions; the rest are validator votes. That would make around 420 user transactions per second, though actually it’s more correct to calculate network load per slot - a fixed amount of time that a validator has to generate a block. Currently the slot time is set to 0.4 seconds.

A recent average throughput per slot on Solana is 600 KB, which comes down to 300 KB after compression. Lumio’s load will be much lower, however, and its compression much better. According to Pontem’s development team, we can estimate the average throughput of Lumio at 50 KB per slot (0.4 s), or 125 KB per second (50 / 0.4).

Now we can calculate the total network load of Lumio in 24 hours:

125 KB/s × 60 s × 60 × 24 = 10,800,000 KB ≈ 10.3 GB.

Around 10.3 GB in 24 hours - that’s how much Lumio will need to handle at full capacity. That’s a bit over 1,000 fully filled PDA accounts. We have already seen that the rent exemption threshold for one 10 MB account is around 73 SOL, so for 1,000 accounts we get… that’s right, 73,000 SOL, or $11.3 million.
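For readers who want to re-run the estimate with their own inputs, here is a short Python sketch of the same arithmetic. The throughput and deposit figures come from the text; the variable names are ours. Without rounding, the account count comes out at 1,055, which the text rounds to about 1,000 accounts and 73,000 SOL.

```python
import math

KB_PER_SLOT = 50             # estimated Lumio throughput per slot, from the text
SLOT_MS = 400                # slot time of 0.4 seconds
PDA_CAPACITY_KB = 10 * 1024  # one 10 MB account
SOL_PER_PDA = 73             # rounded rent-exempt deposit per account

kb_per_second = KB_PER_SLOT * 1000 / SLOT_MS
kb_per_day = kb_per_second * 60 * 60 * 24
gb_per_day = kb_per_day / 1024 / 1024
pdas_needed = math.ceil(kb_per_day / PDA_CAPACITY_KB)

print(kb_per_second, kb_per_day, round(gb_per_day, 1), pdas_needed)
print(pdas_needed * SOL_PER_PDA)   # total deposit in SOL before rounding down
```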

Once again, this number represents a theoretical case when Lumio has a throughput just 2 or 3 times lower than Solana itself, which is an ambitious goal (though we have always been ambitious!). At the beginning, we will need only a minimal amount of storage space, so that Lumio on Solana will always be cost-efficient to run.

Meanwhile, Lumio’s PDA storage will grow as needed, and if network activity temporarily slows down, excess lamport deposits can always be reclaimed and some PDA accounts closed. We are working on a fee model for the L2 that will make sure that the fixed costs of PDA storage are gradually recovered while at the same time keeping Lumio attractive fee-wise for dApps and end users.

Lumio running costs

A typical transaction on Solana requires 1 signature, and each signature costs 5,000 lamports (remember, a lamport is one-billionth of 1 SOL). Many consider this base fee structure inefficient, as it doesn’t change depending on fluctuating demand for blockspace, but we just have to assume that the fee will remain fixed at 5,000 lamports.

For this fee, you get 5,000 compute units, or CU (used to measure the amount of computational effort needed to complete an operation, similar to gas in Ethereum). That’s plenty, because it takes just 1 CU to hash 1 byte of transaction data. Now, as you remember, one transaction in Solana cannot exceed 1,232 bytes, so with 5,000 CU we could hash it about 4 times over.

As we have calculated before, at full capacity Lumio on Solana would handle around 10.3 GB (10,800,000 KB) of data every 24 hours. So if 1 KB of data costs us 5,000 lamports, we would need:

(10,800,000 KB / 1 KB) × 5,000 lamports = 54,000,000,000 lamports, or 54 SOL (around $8,400 at the same SOL price).
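A small Python helper makes it easy to re-run this estimate with different inputs. It is only a sketch under the assumptions stated in the text: each transaction carries 1 KB of data, needs a single signature, and pays the fixed base fee of 5,000 lamports. The function name and parameters are our own.

```python
LAMPORTS_PER_SOL = 1_000_000_000


def daily_fee_sol(kb_per_day: int, kb_per_tx: int = 1,
                  lamports_per_signature: int = 5_000) -> float:
    """Base fees for posting kb_per_day of data, one signature per transaction."""
    transactions = -(-kb_per_day // kb_per_tx)      # ceiling division
    return transactions * lamports_per_signature / LAMPORTS_PER_SOL


# Daily load taken from the estimate in the text; adjust it to model other loads.
print(daily_fee_sol(10_800_000))
```

Halving the daily load or doubling the data carried per transaction halves the fee, since the cost scales with the number of signatures rather than with bytes.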

This is actually much lower than the transaction fees for the same amount of transaction data on Solana L1 (that is, regular transactions not using Lumio’s compression).

What’s next for Lumio on Solana?

We firmly believe that dApp developers on Solana will gain a lot from the opportunity to access other ecosystems that Lumio L2 opens for them - and vice versa, that projects in other ecosystems will benefit from this unique chance to connect to the Solana ecosystem.

At Pontem, we take data availability and rollup security very seriously. We have constructed Lumio’s DA scheme very carefully, taking into account the potential future scaling of the ecosystem. No matter how many dApps run on Lumio, block data will always be made available on the L1 for the time required and at the lowest possible cost.

Want to be among the first to test Lumio on Solana? Then apply to get whitelisted for public devnet access here. And of course, keep following us on X and in Telegram so that you don’t miss any Lumio news!
