Description
This dataset contains the PageRank values and rankings of Bitcoin addresses and transaction IDs (TXIDs). It covers a total of 1,608,748,675 addresses and TXIDs.
Part 1 is available at https://zenodo.org/record/6052811
File format
The dataset is compressed with bzip2 and can be decompressed with the bunzip2 command. Because of its size, it is split into multiple files. Each file is a space-delimited plain-text file with the following five fields:
Label: An alphanumeric Bitcoin address (e.g. 1DzTCMmWABEDM1rYFL1RgdLyE59jXMzEHV) or a 64-character hexadecimal transaction ID (e.g. 000000000fdf0c619cd8e0d512c7e2c0da5a5808e60f12f1e0d01522d2986a51). Type: String
Label type: Its value is 0 if the label is a transaction ID and 1 if the label is a Bitcoin address. Type: Integer
Rank: Unique PageRank rank, where ties (addresses having the same PageRank value) are resolved by sorting the addresses. Type: Integer
Rank with ties: PageRank rank, where ties (addresses having the same PageRank value) share the same rank. Type: Integer
PageRank value: The PageRank of the address or transaction ID, as calculated by the PageRank algorithm. Type: Floating-point number
Sample lines:
000000000fdf0c619cd8e0d512c7e2c0da5a5808e60f12f1e0d01522d2986a51 0 427225664 266976712 0.979246
1DzTCMmWABEDM1rYFL1RgdLyE59jXMzEHV 1 1114666798 508037940 0.877961
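As an illustration of the five-field layout, the following minimal Python sketch parses one line into a typed record and streams records straight from a compressed part using only the standard library. The names Record, parse_line and read_records are hypothetical helpers introduced here, not part of the dataset, and the ASCII encoding is an assumption based on the sample lines.

```python
import bz2
from typing import Iterator, NamedTuple


class Record(NamedTuple):
    label: str            # Bitcoin address or 64-character hexadecimal TXID
    is_address: bool      # label type field: 0 = transaction ID, 1 = address
    rank: int             # unique rank (ties resolved by sorting)
    rank_with_ties: int   # rank shared by equal PageRank values
    pagerank: float       # PageRank value


def parse_line(line: str) -> Record:
    """Parse one space-delimited line into a Record."""
    label, label_type, rank, rank_ties, value = line.split()
    return Record(label, label_type == "1", int(rank), int(rank_ties), float(value))


def read_records(path: str) -> Iterator[Record]:
    """Stream records from a bzip2-compressed part without unpacking it to disk."""
    # Assumption: the files are plain ASCII, as the sample lines suggest.
    with bz2.open(path, mode="rt", encoding="ascii") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)


sample = "1DzTCMmWABEDM1rYFL1RgdLyE59jXMzEHV 1 1114666798 508037940 0.877961"
record = parse_line(sample)
```

Streaming with bz2.open avoids holding a decompressed part in memory or on disk, which may matter given the number of entries.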
Dataset Generation
The Bitcoin transactions in blocks 0 (mined on 3 January 2009) through 713,999 (mined on 13 December 2021) were extracted. A transaction graph was then constructed in which Bitcoin addresses and transaction IDs are the nodes and transaction inputs and outputs are the edges. PageRank was computed on this transaction graph using the system presented in the paper 'Parallel analysis of Ethereum blockchain transaction data using cluster computing'.
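To make the graph model concrete, here is a small Python sketch that builds such a graph from toy transactions and runs a plain power-iteration PageRank on it. It is not the cluster-computing system used to produce the dataset. The edge directions (input address to TXID, TXID to output address), the damping factor of 0.85, the uniform handling of dangling nodes, the normalization to a sum of 1, and the toy data are all assumptions made for illustration, so the resulting values are not comparable to those in the files.

```python
from collections import defaultdict


def build_graph(transactions):
    """Return (nodes, out_edges) for a list of (txid, inputs, outputs) tuples."""
    out_edges = defaultdict(list)
    nodes = set()
    for txid, inputs, outputs in transactions:
        nodes.add(txid)
        for address in inputs:          # assumed direction: address -> TXID
            nodes.add(address)
            out_edges[address].append(txid)
        for address in outputs:         # assumed direction: TXID -> address
            nodes.add(address)
            out_edges[txid].append(address)
    return sorted(nodes), out_edges


def pagerank(nodes, out_edges, damping=0.85, iterations=50):
    """Plain power iteration; dangling nodes spread their score uniformly."""
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing edges is shared among all nodes.
        dangling = sum(score[node] for node in nodes if not out_edges.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        updated = {node: base for node in nodes}
        for node in nodes:
            targets = out_edges.get(node)
            if targets:
                share = damping * score[node] / len(targets)
                for target in targets:
                    updated[target] += share
        score = updated
    return score


# Hypothetical toy data: addrA pays addrB in tx1, addrB pays addrC in tx2.
toy = [("tx1", ["addrA"], ["addrB"]), ("tx2", ["addrB"], ["addrC"])]
nodes, edges = build_graph(toy)
scores = pagerank(nodes, edges)
```

In this toy chain, addrC receives score through tx2 while addrA has no incoming edges, so addrC ends up with the higher score.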
Note
If you use our dataset in your research, please cite our paper: https://link.springer.com/article/10.1007/s10586-021-03511-0
@article{kilic2022parallel,
  title={Parallel Analysis of Ethereum Blockchain Transaction Data using Cluster Computing},
  journal={Cluster Computing},
  author={K{\i}l{\i}{\c{c}}, Baran and {\"O}zturan, Can and Sen, Alper},
  year={2022},
  month={Jan}
}
Other Datasets
If you are interested, please also check out our Pagerank Dataset for Ethereum Blockchain.