The dataset comprises blockchain execution data collected from Ethereum-based Decentralized Applications (DApps). It includes transactions, transaction receipts, and detailed transaction traces that document the execution steps performed by the Ethereum Virtual Machine (EVM). These traces are essential for understanding how accounts interact, both Contract Accounts (CAs) and Externally Owned Accounts (EOAs).
A blockchain is an append-only ledger that chronologically records data in blocks. Each block contains transactions, which trigger state transitions, and commits to the corresponding transaction receipts, which record the outcome of each transaction. A hash over the receipts is stored in the block so that every node executing the same transactions can verify that it obtained the same results. The dataset also classifies Ethereum accounts and details the interactions between EOAs and CAs: EOAs are controlled by private keys and initiate transactions, whereas CAs hold smart contract code that the EVM executes when the account is called.
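The distinction between the two account types can be sketched in a few lines. This is a minimal illustration, not the dataset's own classification procedure: it assumes the classic rule that an account with empty code is an EOA, and the function name `classify_account` is hypothetical. The input is the hex string a node returns for an account's code, for example via the JSON-RPC method `eth_getCode`. The rule is a simplification, since a contract address can also report empty code, for instance while its constructor is still running.

```python
def classify_account(code_hex: str) -> str:
    """Return "EOA" for empty code and "CA" otherwise.

    `code_hex` is the account's code as a hex string, e.g. "0x" for an
    account without code. The empty-code rule is an assumption made for
    illustration; it does not cover every edge case.
    """
    normalized = code_hex.lower()
    if normalized.startswith("0x"):
        normalized = normalized[2:]
    return "CA" if normalized else "EOA"


# Hypothetical inputs: an account without code and one with bytecode.
print(classify_account("0x"))            # account without code
print(classify_account("0x6080604052"))  # account holding bytecode
```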
The dataset captures the granular operational data of blockchain transactions, such as function calls, contract creations, and log entries generated by smart contracts. These details are the raw material for object-centric event logs, which process mining uses to compare theoretical process models with actual execution.
Contract creations and function calls are fundamental components of the dataset. Contract creations document the deployment of smart contracts, including how contracts are updated or added through various design patterns. Function calls between accounts are also logged extensively, providing insight into the flow of Ether, Ethereum's native token, and of other data passed between accounts.
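As a rough sketch of how such records could be derived from a transaction trace, the snippet below walks a nested call tree and emits one flat record per call frame. The input layout (`type`, `from`, `to`, `value`, `calls`) follows the shape produced by Geth's `callTracer`; the category labels, the function name `flatten_trace`, and the example addresses are assumptions made for illustration and are not taken from the dataset.

```python
from typing import Any, Dict, Iterator

# Mapping from EVM call-frame types to illustrative event categories.
# The category names are hypothetical, not the dataset's actual labels.
CATEGORY_BY_TYPE = {
    "CREATE": "contract_creation",
    "CREATE2": "contract_creation",
    "CALL": "function_call",
    "STATICCALL": "function_call",
    "DELEGATECALL": "delegated_call",
}


def flatten_trace(frame: Dict[str, Any], depth: int = 0) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per call frame, in the order the calls begin."""
    yield {
        "category": CATEGORY_BY_TYPE.get(frame["type"], "other"),
        "from": frame.get("from"),
        "to": frame.get("to"),
        # Values are hex-encoded amounts of wei; a missing value counts as 0.
        "value_wei": int(frame.get("value", "0x0"), 16),
        "depth": depth,
    }
    for child in frame.get("calls", []):
        yield from flatten_trace(child, depth + 1)


# Hypothetical trace: an EOA sends 1 Ether to a contract, which makes a
# delegated call, during which a new contract is created.
example_trace = {
    "type": "CALL", "from": "0xaaa", "to": "0xbbb",
    "value": "0xde0b6b3a7640000",
    "calls": [
        {"type": "DELEGATECALL", "from": "0xbbb", "to": "0xccc",
         "calls": [
             {"type": "CREATE", "from": "0xbbb", "to": "0xddd", "value": "0x0"},
         ]},
    ],
}

events = list(flatten_trace(example_trace))
for event in events:
    print(event)
```

Keeping the nesting depth on each record preserves the call hierarchy after flattening, which helps when the records are later grouped into events.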
Delegated calls and log entries are more specialized interactions within Ethereum. A delegated call lets a contract execute code from another contract against its own state, which supports upgradeable contract designs. Log entries, emitted at points specified in the smart contract code, communicate contract execution details to external systems.
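The difference between a regular call and a delegated call can be illustrated with a toy model. This is a deliberately simplified sketch, not EVM code: the class `ToyContract`, the function `set_value`, and the event name `ValueChanged` are all hypothetical. It shows the one property that matters here, namely that delegated code writes to the caller's storage and that its log entries are attributed to the caller.

```python
class ToyContract:
    """A drastically simplified contract: storage, logs, and named code."""

    def __init__(self, name, code=None):
        self.name = name
        self.storage = {}
        self.logs = []
        self.code = code or {}  # function name -> callable(context, *args)

    def call(self, target, fn, *args):
        """Regular call: the target's code runs in the target's context."""
        return target.code[fn](target, *args)

    def delegatecall(self, target, fn, *args):
        """Delegated call: the target's code runs in the caller's context."""
        return target.code[fn](self, *args)


def set_value(context, new_value):
    # The code only sees the context it was given, so the same function
    # body affects different contracts depending on how it was invoked.
    context.storage["value"] = new_value
    context.logs.append(
        {"event": "ValueChanged", "emitter": context.name, "value": new_value}
    )


logic = ToyContract("logic", code={"set_value": set_value})
proxy = ToyContract("proxy")

proxy.delegatecall(logic, "set_value", 42)  # changes the proxy's state
proxy.call(logic, "set_value", 7)           # changes the logic contract's state

print(proxy.storage, logic.storage)
```

Because the proxy keeps the state while the logic contract supplies the code, pointing the proxy at a different logic contract changes behaviour without moving the data, which is the idea behind upgradeable designs.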
To handle the diverse and dynamic nature of blockchain data, the dataset employs the Object-Centric Event Log (OCEL) format. This format accommodates multiple object types in a single log, addressing the convergence and divergence problems that arise when events involving several objects are forced into traditional single-case logs. The latest version, OCEL 2.0, supports documenting dynamic object roles and relationships, improving the fidelity with which logs capture blockchain operations.
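To make the object-centric idea concrete, the sketch below builds a tiny log in which a single event refers to three objects at once, each with a qualifier describing its role. The layout is loosely modelled on the OCEL 2.0 JSON serialization; the exact key names should be checked against the specification, and the object identifiers, qualifiers, timestamp, and the helper `make_event` are hypothetical.

```python
import json


def make_event(event_id, event_type, timestamp, related_objects):
    """Build one event that references several objects at once.

    `related_objects` is a list of (object_id, qualifier) pairs.
    """
    return {
        "id": event_id,
        "type": event_type,
        "time": timestamp,
        "attributes": [],
        "relationships": [
            {"objectId": object_id, "qualifier": qualifier}
            for object_id, qualifier in related_objects
        ],
    }


# Hypothetical log: one function call linked to a transaction, the
# calling EOA, and the called contract account.
log = {
    "objectTypes": [
        {"name": "EOA", "attributes": []},
        {"name": "CA", "attributes": []},
        {"name": "transaction", "attributes": []},
    ],
    "eventTypes": [{"name": "function_call", "attributes": []}],
    "objects": [
        {"id": "0xaaa", "type": "EOA", "attributes": [], "relationships": []},
        {"id": "0xbbb", "type": "CA", "attributes": [], "relationships": []},
        {"id": "tx-1", "type": "transaction", "attributes": [], "relationships": []},
    ],
    "events": [
        make_event(
            "e1", "function_call", "2024-01-01T00:00:00Z",
            [("tx-1", "part of"), ("0xaaa", "sender"), ("0xbbb", "receiver")],
        ),
    ],
}

serialized = json.dumps(log, indent=2)
print(serialized)
```

In a single-case log the same event would have to be copied once per object or assigned to just one of them; here it is stored once and linked to all three.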
In summary, the dataset is structured to support a comprehensive analysis of blockchain behaviors, particularly focusing on Ethereum DApps. It is tailored to assist researchers and practitioners in understanding and analyzing the decentralized execution of smart contracts and the associated data flows within the blockchain environment.