Introduction
Solidity is a programming language for writing smart contracts on the Ethereum blockchain. It is an object-oriented, high-level language that is similar to JavaScript. However, Solidity also has low-level features that allow developers to interact with the Ethereum Virtual Machine (EVM) at a lower level. Two of these features are ABI encoding and opcodes.
Prerequisites
This tutorial is for already inclined and experienced developers with prior knowledge of smart contracts and the EVM, but it can also serve as a refresher for experienced developers.
ABI Encoding
The Application Binary Interface (ABI) is a standardized way to encode data for system communication. In the case of Solidity, ABI encoding helps to encode data for communication between the Ethereum blockchain and external systems.
When you define a function in Solidity, you can specify its input and output parameters. These parameters have types, such as uint256, string, or address. When you call a function, the input parameters must be encoded so the EVM can understand. Similarly, when a function returns a value, that value needs to get encoded in a way external systems can understand.
ABI encoding takes care of this encoding and decoding process automatically. Solidity provides a built-in ABI encoder that you can use to encode data. Here are three types of encoding Solidity provides:
1. abi.encodeWithSignature
This function takes the function signature and its parameters as arguments and returns a byte array containing the encoded data. For example:
function add(uint256 a, uint256 b) public pure returns (uint256) {
return a + b;
}
function test() public pure {
uint256 a = 123;
uint256 b = 456;
bytes memory data = abi.encodeWithSignature("add(uint256,uint256)", a, b);
}
In this example, the add
function takes two uint256
input parameters and returns a uint256
value. The test
function calls the add
function and encodes the input parameters using the abi.encodeWithSignature
function.
2. abi.encode
The abi.encode function takes the parameters as arguments and returns a byte array containing the encoded data. This is useful when you don’t know the function signature at compile time. For example:
function test() public pure {
uint256 a = 123;
uint256 b = 456;
bytes memory data = abi.encode(a, b);
}
In this example, the test
function encodes the input parameters using the abi.encode
function.
3. abi.encodePacked
The abi.encodePacked function takes the parameters as arguments and returns a byte array containing the tightly packed encoded data. This means that the data is not padded to fit into 32-byte chunks. This is useful for saving gas costs by reducing the data size. For example:
function test() public pure {
uint256 a = 123;
uint256 b = 456;
bytes memory data = abi.encodePacked(a, b);
}
In this example, the test
function encodes the input parameters using the abi.encodePacked
function.
Comparing abi.encode and abi.encodedPacked
Optimization
Gas is a measure of computational effort required to execute an operation on the network. When it comes to abi.encode
and abi.encodePacked
in Solidity, the gas cost depends on the number of arguments getting encoded, their types, and the encoding method used.
Generally speaking, using abi.encodePacked
will result in a lower gas cost than abi.encode
. This is because abi.encode
adds padding to ensure the encoded arguments are correctly aligned, which can result in additional gas costs. On the other hand, abi.encodePacked
produces a tightly packed byte array, which can reduce the amount of gas required to store and transmit the data.
However, it’s important to note that gas cost can vary depending on the specific arguments getting encoded, as well as other factors such as the complexity of the contract and the current state of the network. Therefore, it’s important always to test and optimize your smart contract code to ensure that it’s as efficient as possible in terms of gas consumption.
Unoptimized Contract
pragma solidity ^0.8.0;
contract UnoptimizedContract {
struct Person {
string name;
uint age;
}
Person[] public people;
function addPerson(string memory _name, uint _age) public {
people.push(Person(_name, _age));
}
function getPerson(uint index) public view returns (bytes memory) {
Person memory p = people[index];
return abi.encode(p.name, p.age);
}
}
In this contract, we define a struct Person
representing a person’s name and age. We also define an array of Person
structs called people
and two functions, addPerson
and getPerson.
The addPerson
function takes a name
and age
parameter and adds a new Person
struct to the people
array.
The getPerson
function takes an index
parameter and returns the encoded value of the name
and age
of the Person
at the specified index in the people
array. We use the abi.encode
function to produce an array containing the name
and age
values. However, this creates unnecessary overhead because the function pads the values to ensure they are of a certain length.
Optimized Contract:
pragma solidity ^0.8.0;
contract OptimizedContract {
struct Person {
string name;
uint age;
}
Person[] public people;
function addPerson(string memory _name, uint _age) public {
people.push(Person(_name, _age));
}
function getPerson(uint index) public view returns (bytes memory) {
Person memory p = people[index];
return abi.encodePacked(bytes(p.name), bytes32(p.age));
}
}
In the optimized contract, we use abi.encodePacked
instead of abi.encode
in the getPerson
function to produce a tightly packed byte array that contains the name
and age
values, reducing the amount of gas required to store and transmit the data.
By using abi.encodePacked
instead of abi.encode
, we can optimize the gas consumption of the getPerson
function. Additionally, since we’re storing the data as a tightly packed byte array, we can use this data more efficiently in other parts of our smart contract, further reducing gas costs.
Hash Collision Problem
Collisions can occur with abi.encode
and abi.encodePacked
. However, the likelihood of a collision occurring with abi.encodePacked
is slightly higher because it produces a tightly packed byte array, potentially resulting in more collisions than abi.encode
. This can be a concern when using hashes to represent data in smart contracts, as it could potentially allow an attacker to manipulate the contract by providing a different input that produces the same hash as the original input.
That said, the probability of a hash collision occurring is still relatively low. Proper testing and validation of the inputs used to generate the hash can mitigate this possibility. Additionally, you can also use larger hash sizes to reduce the likelihood of a collision occurring.
Opcodes
Opcodes are the low-level instructions that the EVM understands. Solidity provides a way to access these opcodes directly through the assembly language. Assembly language is a low-level programming language that uses mnemonics to represent opcodes.
Here are some of the commonly used opcodes in Solidity:
mstore
This opcode stores a value in memory at a specified address. The first argument is the memory address, and the second argument is the value to be stored. For example:
function test() public pure {
assembly {
mstore(0, 42)
}
}
In this example, the test
function uses assembly language to set the first value.
calldatacopy
This opcode copies data from the calldata (the input data to a function call) to memory. The first argument is the memory address to copy the data to, the second argument is the calldata offset to start copying from, and the third argument is the number of bytes to copy.
For example:
function test() public pure {
assembly {
calldatacopy(0, 0, calldatasize())
}
}
In this example, the test
function uses assembly language to copy the entire calldata into memory.
add
The add opcode adds two values together. The first argument is the value to add, and the second is the value to add. For example:
function test() public pure {
uint256 a = 123;
uint256 b = 456;
assembly {
add(a, b)
}
}
In this example, the test
function uses assembly language to add the values a
and b
together.
sstore
The sstore opcode stores a value in storage at a specified key. The first argument is the storage key, and the second argument is the value to be stored. For example:
function test() public {
uint256 key = 123;
uint256 value = 456;
assembly {
sstore(key, value)
}
}
In this example, the test
function uses assembly language to store the value value
at the storage key key.
Optimization using Opcode
Solidity provides built-in functions for arithmetic operations like addition, subtraction, multiplication, and division. However, using opcodes directly can be more efficient regarding gas usage. For example, instead of using Solidity’s add
function, you can use the ADD
opcode directly to add two numbers.
Look at the example below:
Unoptimized code
pragma solidity ^0.8.0;
contract OpcodeExample {
uint256 public result;
function add(uint256 a, uint256 b) public {
result = a + b;
}
function sub(uint256 a, uint256 b) public {
result = a - b;
}
}
In the above code, we define a simple contract, OpcodeExample,
with two functions, add
and sub,
that performs basic arithmetic operations and stores the result in a state variable result.
Optimized code:
pragma solidity ^0.8.0;
contract OpcodeExample {
uint256 public result;
function add(uint256 a, uint256 b) public {
assembly {
result := add(a, b)
}
}
function sub(uint256 a, uint256 b) public {
assembly {
result := sub(a, b)
}
}
}
In the optimized code, we use the assembly
keyword to directly write low-level EVM code to perform the arithmetic operations instead of using Solidity’s built-in functions. In this case, we use the add
and sub
opcodes to add and subtract the input values. By using opcodes directly, we can avoid the overhead of calling Solidity’s built-in functions, resulting in a more gas-efficient contract.
Using opcodes directly can be more complex and error-prone than using Solidity’s built-in functions, so it’s essential to thoroughly test and verify the optimized code to ensure it’s functioning correctly.
It’s important to note that while using opcodes directly can result in gas savings, it’s not always the best approach for optimization. In some cases, Solidity’s built-in functions may be more gas-efficient or easier to read and maintain. It’s essential to carefully consider the specific use case and performance requirements when optimizing Solidity code.
Conclusion
ABI encoding and opcodes are low-level features that allow developers to interact with the EVM at a lower level than Solidity’s high-level syntax. ABI encoding encodes and decodes data for communication between the Ethereum blockchain and external systems. Solidity provides a built-in ABI encoder, as well as functions for encoding data tightly packed or without the function signature. Opcodes are the low-level instructions that the EVM understands, and Solidity provides a way to access these opcodes directly through the assembly language. Understanding these low-level features can help optimize gas usage and write more efficient smart contracts on the Ethereum blockchain.
References
- Solidity documentation on ABI encoding
- Solidity documentation on opcodes
- “Understanding Solidity Assembly using Simple Examples” by Patrick McCorry
- “Using Assembly in Solidity” by ConsenSys
- “Gas-Efficient Solidity Smart Contracts” by OpenZeppelin
Author
Oyeniyi Abiola Peace is a seasoned software and blockchain developer. With a degree in Telecommunication Science from the University of Ilorin and over five years experience in JavaScript, Python, PHP, and Solidity, he is no stranger to the tech industry. Peace currently works as the CTO at DFMLab. When he’s not coding or teaching, he loves to read and spend time with family and friends.