Functions To Transform Base Data
hex() & unhex()
All columns in the all_chains
table that can contain a hexadecimal string are stored in byte code. For example:
The columns stored in byte code include:
block_hash
block_parent_hash
tx_hash
tx_sender
tx_recipient
log_emitter
topic0
-topic3
data0
-data3
data_rest
The hex()
function is used to transform byte code (e.g 4�f�P��Fq
) into it's hexadecimal representation (e.g 0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2
) whenever you want to display addresses or join onto another table.
The unhex()
function is used to transform hexadecimal strings (e.g 0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2
) into byte code (e.g 4�f�P��Fq
) whenever you want to parse an address to the database through a WHERE
statement.
Example
unhex()
Lets assume you want to find the top 10 addresses on Avalanche by transaction count. Your query would look something like this:
SELECT uniq(tx_hash) as count , tx_sender FROM blockchains.all_chains WHERE chain_name = 'avalanche_mainnet' GROUP BY tx_sender ORDER BY count DESC LIMIT 10
The only problem is your result table would be displayed like this on your dashboard:
count | tx_sender |
---|---|
1791218 | �<�r�:�t�l<�ȹ |
648201 | IY�Uǐ�<�&2R@� |
534656 | P�;'�p�z�e=h� |
433241 | �ķ�3I:�ہ�/ |
373256 | {-Wk�!A�n� $�k |
355467 | *��!�A!��i |
314395 | �{�}]i�D��u�g� |
302360 | �{�}]i}-{P�u�g� |
297510 | ,�(WIs�o�-�K�} |
268612 | �{�*&i}-{P�u'g� |
Displaying addresses in byte code stops the viewer from investigating further in a query or block explorer. Therefore, you can use the hex()
function to make these addresses readable.
SELECT uniq(tx_hash) as count , hex(tx_sender) as tx_sender FROM blockchains.all_chains WHERE chain_name = 'avalanche_mainnet' GROUP BY tx_sender ORDER BY count DESC LIMIT 10
The result:
count | tx_sender |
---|---|
1791218 | 9F8C163CBA728E99993ABE7495F06C0A3C8AC8B9 |
648201 | 491E59C255C790D4E3A53CEC2632524088F1AAA4 |
534656 | 50FF3B278FCC70EC7A9465063D68029AB460EA04 |
433241 | B9F79FC4B7A2F5FB33493AB5D018DB811C9C2F02 |
373256 | 187B2D576BA7EC2141C180A96EDD0F202492F36B |
355467 | 2A038E100F8B85DF21E4D44121BDBFE0C288A869 |
314395 | A67BEB7D5D69ECFF44891690D5E175A3DEF367AE |
302360 | 616FBE93B86A49A5718B5ADBE07C837E2FAE4CC1 |
297510 | 2CBC862857490C73D46F9F142DFD4B14DF027D16 |
268612 | 88888846B627C2405C4B8963E45D731B7CDDA406 |
unhex()
Lets assume you want to find the daily transaction count for Aave's V2 Lending Pool whose contract address is 0x7d2768dE32b0b80b7a3454c06BdAc94A69DDc7A9
. Your query would something like this:
SELECT uniq(tx_hash) as count , date_trunc('day', signed_at) as date FROM blockchains.all_chains WHERE chain_name = 'eth_mainnet' AND tx_recipient = unhex('7d2768dE32b0b80b7a3454c06BdAc94A69DDc7A9') -- Aave's V2 Lending Pool Contract. AND signed_at > '2021-01-01' AND signed_at < '2021-01-10' -- Using this date range as an example. GROUP BY date
The tx_recipient
column in the all_chain
table is stored as byte code. Therefore, to filter results by the tx_recipient
(Aave's V2 Lending Pool) it must be passed to the database in byte code. The unhex()
function enables you to transform 7d2768dE32b0b80b7a3454c06BdAc94A69DDc7A9
into it's byte code representation and filter results by the tx_recipient
.
extract_address()
Contract and wallet addresses are stored differently depending on the columns they're in. For example, lets compare how the WETH contract is stored in the log_emitter
and topic1
column.
SELECT hex(log_emitter) as log_emitter FROM blockchains.all_chains WHERE topic0 = unhex('7FCF532C15F0A6DB0BD6D0E038BEA71D30D808C7D98CB3BF7268A95BF5081B65') AND tx_hash = unhex('D33F0D86C1EC1C5B7C5A5A068189C9189FB549292759DFA3EFAE73DD0C65B6D4') UNION ALL SELECT hex(topic1) as topic1 FROM blockchains.all_chains WHERE topic0 = unhex('8BC254D6F262E53C6F080B869501CA562EE1252C6EF58D59C4758BB14260A095') AND tx_hash = unhex('0EB1A4415CE7E2C23B053232FD686C4007C027EA0F65A832DEC0D7845CC24FBC') ---------------- The Result --------------------- -- log_emitter = C02AAA39B223FE8D0A0E5C4F27EAD9083C756CC2 -- topic1 = 000000000000000000000000C02AAA39B223FE8D0A0E5C4F27EAD9083C756CC2
As you can see, hex(log_emitter)
returns a version of the WETH token contract that's often displayed on CoinGecko or Metamask. However, hex(topic1)
returns a version of the WETH token contract with 24 leading zeros. To remove these leading zeros from the data or topic field, use the extract_address()
function which selects the right most 40 characters in a string.
Example
Using the code example above, you can transform the topic1
field into a normal address using the extract_address()
function.
SELECT hex(log_emitter) as log_emitter FROM blockchains.all_chains WHERE topic0 = unhex('7FCF532C15F0A6DB0BD6D0E038BEA71D30D808C7D98CB3BF7268A95BF5081B65') AND tx_hash = unhex('D33F0D86C1EC1C5B7C5A5A068189C9189FB549292759DFA3EFAE73DD0C65B6D4') UNION ALL SELECT extract_address(hex(topic1)) as topic1 FROM blockchains.all_chains WHERE topic0 = unhex('8BC254D6F262E53C6F080B869501CA562EE1252C6EF58D59C4758BB14260A095') AND tx_hash = unhex('0EB1A4415CE7E2C23B053232FD686C4007C027EA0F65A832DEC0D7845CC24FBC') ---------------- The Result --------------------- --C02AAA39B223FE8D0A0E5C4F27EAD9083C756CC2 log_emitter --C02AAA39B223FE8D0A0E5C4F27EAD9083C756CC2 topic1
extract_address()
is used when extracting an address from topic1-3
and all data fields (e.g data0
, data1
... dataN
).
data_field()
The following image matches the different parts of a transaction log to columns in the all_chains
table.
As you can see, data0 - data3
hold the information from individual rows in the data field. The data_rest
column holds all the information in the data fields outside data0 - data3
. To query individual rows from this blob of data, use data_field()
.
Example
The data_field()
is very rarely used in analysis. Lets assume you are trying to finding the token amount bridged through LI.FI - for simplicity reasons we won't query the token address nor the USD amount bridged. To start, let's look at the bridge's on-chain event.
As you can see the amount field is in the seventh position; however, the block explorer actually displays this amount in the incorrect position. The block explorer should be displaying this information in the fifth data field. The data_field()
function decodes data field 5 using the following SQL syntax:
SELECT to_u256_raw(abi_field(data_rest, 5)) FROM blockchains.all_chains WHERE tx_hash = unhex('98c43948bca05770d8a5d6e5ba8e2c4822dffd9d936b097667da113316f157f6') -- Li.Fi Bridge Transactions AND topic0 = unhex('438f81f3fe94456cd9d98e9073524f1c2bafb3ce75def8ced69f708061ddd5c4') AND chain_name = 'eth_mainnet' -- Result: 44000000000000000
The to_u256_raw() function is explained in a different section of this page.
to_float64_raw() & to_float64_adj()
Numbers stored on the blockchain are formatted as hexadecimal strings - as seen below.
To query a number from the topic or data fields, use the to_float64_raw()
function. The code sample below is querying a number from the topic3
field:
SELECT to_float64_raw(data0) FROM blockchains.all_chains WHERE chain_name = 'eth_mainnet' AND tx_hash = unhex('E207FF3ED73F1BBEA87B64B3D41025EADCC867A5ADECE7FB3CBC0731F7F7DA2F') -- Transfer Event
When doing math in your SQL query, it's crucial that your numbers are in the same data type. Using different data types will return an error.
For example, multiplying token amounts by token prices is a very common practice when doing analysis. However, prices in the token_prices
table are stored as a floating point number while numbers stored on the blockchain are always integers. To multiply these two numbers the token amount needs to be transformed from an integer to a floating point number.
In addition, token amounts will often need to be divided by 10^(number decimal)
. To shorthand this equation, use the to_float64_adj(x,y)
function..
In the to_float64_adj(x,y)
function, x is the data or topic field which holds the token amount (integer) and y is the number decimal for the given token. As seen in the code example below, the to_float64_adj(x,y)
function is the same as to_float64_raw(x)/10^y
.
Example
Lets assume you need to find the total amount deposited into the Aave V2 Lending Pool.
- The pool's contract address is
0x7d2768dE32b0b80b7a3454c06BdAc94A69DDc7A9
- The deposit event has the following topic event hash
0xde6857219544bb5b7746f48ed30be6386fefc61b2f864cacf559893bf50fd951
In the following code example, you are summing the token amount and then multiplying it by the token's price. The second and third SELECT
statements showcase how to use the to_float64_raw()
and to_float64_adj()
functions. These functions are transforming the data1
field (stored as byte code) into a floating point number.
SELECT [signed_at:aggregation] as date , sum(prices.price_in_usd * to_float64_raw(data1)/pow(10, prices.num_decimals)) , sum(prices.price_in_usd * to_float64_adj(data1, prices.num_decimals)) -- These two SELECT statements return the same result FROM blockchains.all_chains e LEFT JOIN ( SELECT contract_address, dt, price_in_usd, num_decimals FROM reports.token_prices prices WHERE [signed_at:daterange] AND chain_name = 'eth_mainnet' ) prices ON prices.contract_address = lower(extract_address(hex(topic1))) AND date_trunc('day', e.signed_at) = prices.dt WHERE log_emitter = unhex('7d2768dE32b0b80b7a3454c06BdAc94A69DDc7A9') AND topic0 = unhex('de6857219544bb5b7746f48ed30be6386fefc61b2f864cacf559893bf50fd951') AND chain_name = 'eth_mainnet' AND [signed_at:daterange] GROUP BY date
to_u256_raw()
Numbers stored on the blockchain are often formatted as hexadecimal strings - as seen below.
Typically when trying to transform a data and topic field into a number you want to use the to_float64_raw()
function. Reason being is this number can then be multiplied by other floating points numbers such as token prices. However, some numbers on the blockchain are too large to fit into a floating point number format. When that's the case, use the to_u256_raw()
function. The example below will discuss a scenario where to_u256_raw()
has to be used instead of to_float64_raw()
.
SELECT to_u256_raw(data0) FROM blockchains.all_chains WHERE chain_name = 'eth_mainnet' AND tx_hash = unhex('E207FF3ED73F1BBEA87B64B3D41025EADCC867A5ADECE7FB3CBC0731F7F7DA2F') -- Transfer Event
Example
As stated above, in most circumstances the to_float64_raw()
function is the best function for transforming data and topic fields into numbers. The to_u256_raw()
function is required when dealing with numbers that are to big to fit in a float64 data type. An example of such a number is the Token ID of NFTs minted through Opensea's OPENSTORE contract.
Lets assume you want to build a non-custodial wallet that displays the historical transfers of the NFTs in a user's wallet. The following example is going to be looking at the transfer history for the NFT displayed below. For reference, the NFT has been transferred once.
The image below is an example of a transfer event for this NFT.
The id row in the event's data field displays the Token ID of the NFT getting transferred; however, it's far too large to be in a float64 data type. To demonstrate this, the query below is filtering the data0
column by the Token ID (76538441175747...
) in the WHERE
statement using to_float64_raw()
.
-- Using to_float64_raw() SELECT signed_at , hex(tx_hash) FROM blockchains.all_chains WHERE topic0 = unhex('c3d58168c5ae7397731d063d5bbf3d657854427343f4c083240f7aacaa2d0f62') AND to_float64_raw(data0) = '76538441175747641219265072182160044491633700610231652884771900196464411803649' -- Token ID
The result is 22 transactions even though this NFT has only be transferred once. The table below displays the first 10 rows:
signed_at | tx_hash |
---|---|
2022-12-05 08:13:38 | B6B41F56A7F3BD50853AD0334497C33FBF4A87FE74135E23B3E7682036D7047B |
2022-12-05 08:16:20 | A5613AC17D1BBEBC367089CD2550B0FA06D76E5F273DB6551E93A78DBBF2320F |
2022-12-05 08:19:12 | AAB71C80DD9151F0E92B59AD37C3EE32714AEC9BDC75EE9A7C31D58039072793 |
2022-12-05 08:22:18 | EC4911C11421DD86146E70A1CCC38498DEA3B715363D95ADAD6C2A9E6A506F07 |
2022-12-05 09:56:04 | 142338BE1745E19AD0092904344AE68B5D74406DED3282DA4D99189C8AE7834E |
2022-12-05 10:00:14 | 4678A9BFE168B77929AD238CBE5D2A51CB57E54C799562BAC98FAE7D0280AE5E |
2022-12-05 10:44:55 | 48FFE29C748A91B95CDA9286ACABC403F255EF4E15F65D98D3E1CA1DF701BEB2 |
2022-12-05 22:16:30 | 93152B8C49967B0BDA2E4310B7CF185B2E6998ADE4322E496E73D50241D7B6B0 |
2022-12-05 22:18:36 | CEF8EF3378B3449DF8F0899C2F4B99F80360EF022F1BB000C900DCD201CEED8B |
2022-12-05 22:20:44 | 3FB1488B0E01CCC1F407BE4B7AA8F07B696B40A270CDC9C6D2D44178FC6DB123 |
Next lets use the to_u256_raw()
function to see if we get the same results:
-- Using `to_u256_raw()` SELECT signed_at , hex(tx_hash) FROM blockchains.all_chains WHERE topic0 = unhex('c3d58168c5ae7397731d063d5bbf3d657854427343f4c083240f7aacaa2d0f62') AND to_u256_raw(data0) = '76538441175747641219265072182160044491633700610231652884771900196464411803649'
The result is a single row that returns all the information for the correct transfer event.
signed_at | tx_hash |
---|---|
2022-12-05 08:22:18 | EC4911C11421DD86146E70A1CCC38498DEA3B715363D95ADAD6C2A9E6A506F07 |
Now you can turn this query into a Class C endpoint and start building your wallet application.
toFloat64()
When doing math in your SQL query, it's crucial that your numbers are the same data type. Using different data types in a calculation will return an error (e.g string * integer). Numbers are represented in two main data types:
- Integer - whole numbers.
- Floating Point - decimal numbers.
A floating point number - often referred to as a float - is a data type for numbers with a decimal point (e.g 4.2 and 0.69). Floating point numbers get their name from the way the decimal point can float to any position necessary.
To transform values into a floating point number, use the toFloat64()
function.
Example
Lets assume you are trying to calculate the average transaction fee on Avalanche mainnet over the last 7 days. In the code example below, the integer value tx_gas_price
is getting transformed into a floating point number so it can be multiplied by price_in_usd
and tx_gas_spent/pow(10, 18)
- both of which are floating point numbers.
SELECT avg((tx.tx_gas_spent/pow(10, 18))* toFloat64(tx.tx_gas_price)* prices.price_in_usd) as average_gas_cost FROM ( SELECT any(tx_hash), tx_gas_spent, tx_gas_price, signed_at FROM blockchains.all_chains WHERE chain_name = 'avalanche_mainnet' AND [signed_at:daterange] GROUP BY tx_gas_spent, tx_gas_price, signed_at ) tx LEFT JOIN reports.token_prices prices ON date_trunc('day', tx.signed_at) = prices.dt AND prices.contract_address = 'b31f66aa3c1e785363f0875a1b74e27b85fd66c7' GROUP BY date
If you were to remove the toFloat64()
function you would receive the following error:
Illegal types Float64 and Int256 of arguments of function multiply: While processing (tx_gas_spent / pow(10, 18) * tx_gas_price) * price_in_usd
. (ILLEGAL_TYPE_OF_ARGUMENT))