Carved in Stone: Malware [ab]use of the Bitcoin Blockchain
Bitcoin's properties make it a resilient mechanism for establishing decentralized and unregulated communication channels between parties that need to share small amounts of data.
To our knowledge, at least five malware families have abused the Bitcoin blockchain as a communication channel for two tasks:
(1) signaling the location of botnet C&C centers and
(2) automatically delivering the decryption key for data held hostage from ransomware attacks.
In this talk, we will show how we build signatures from the Bitcoin transactions produced by addresses of these malware families, use them to identify more of their previously unknown addresses, and search for their economic relations with other entities, which aids their attribution. We will also show how this technique allowed us to measure, for the first time, the impact of coverage on the revenue estimation of a ransomware operation: the coverage of our approach is 39 times larger than that of another commonly used technique.
We will present a set of techniques developed in [1] and [2] for building blockchain signatures from malware-related Bitcoin addresses, tracing the transactions produced by such addresses to find previously unknown addresses of the same family and their economic relationships with other entities, and uncovering a (near) complete ransomware operation. As the starting point, we focus our analysis on sets of Bitcoin addresses confirmed to be related to a malware family, which we call the seeds. Among the seeds, we identify five malware families that use four different transaction patterns, from which we build transaction signatures.
To produce a signature, we extract information from the deposit and withdrawal transactions generated by our seeds, such as the BTC amounts sent or received, the sender or receiver address scripts, and the number of transaction input and output slots. We build signatures for five malware families: Cerber (ransomware), Pony (data stealer), Skidmap (cryptojacking), Glupteba (trojan), and DeadBolt (ransomware). Using these transaction signatures, we identify other addresses that behave the same as the seeds. We then trace all the blockchain transactions related to these addresses using a novel back-and-forth exploration [1]. This exploration produces a graph containing the Bitcoin addresses that transact with the seeds, thus establishing economic relationships with them.
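The signature idea described above can be sketched as follows. This is a minimal illustration, not the signatures used in [1] and [2]: the `Tx` schema, the `Signature` fields, and the `build_signature` helper are all hypothetical names, and the matching rule (same slot counts, known output script types, amounts within the range seen in the seeds) is an assumption chosen for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """Simplified view of a Bitcoin transaction (hypothetical schema)."""
    txid: str
    input_scripts: tuple   # script type of each input slot, e.g. "p2pkh"
    output_scripts: tuple  # script type of each output slot
    output_values: tuple   # value of each output slot, in satoshis


@dataclass(frozen=True)
class Signature:
    """Hypothetical signature: constraints on slot counts, scripts, amounts."""
    n_inputs: int
    n_outputs: int
    output_scripts: frozenset
    min_value: int
    max_value: int

    def matches(self, tx: Tx) -> bool:
        # A transaction matches when every constraint holds at once.
        return (
            len(tx.input_scripts) == self.n_inputs
            and len(tx.output_scripts) == self.n_outputs
            and set(tx.output_scripts) <= self.output_scripts
            and all(self.min_value <= v <= self.max_value
                    for v in tx.output_values)
        )


def build_signature(seed_txs):
    """Derive a signature from transactions of confirmed seed addresses."""
    n_in = {len(t.input_scripts) for t in seed_txs}
    n_out = {len(t.output_scripts) for t in seed_txs}
    if len(n_in) != 1 or len(n_out) != 1:
        raise ValueError("seed transactions do not share one slot pattern")
    values = [v for t in seed_txs for v in t.output_values]
    scripts = frozenset(s for t in seed_txs for s in t.output_scripts)
    return Signature(n_in.pop(), n_out.pop(), scripts, min(values), max(values))
```

A signature built this way can then be applied to any candidate transaction with `sig.matches(tx)`; a real signature would likely need family-specific constraints beyond these.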
In addition, by applying the signatures to the newly found addresses, we can discover Bitcoin addresses that behave the same as the seeds. Both facts (money flow and same behavior) allow us to conclude that they belong to the same family. We further confirm this conclusion by analyzing the IOCs produced within the different operations. During the exploration, we face one main issue: detecting a change of ownership, which is an open research problem and may cause the exploration graph to explode if not handled correctly. We address it by identifying entities during the exploration, using two widely adopted techniques. First, we expand seeds using the multi-input clustering heuristic [3, 4, 5], which allows us to find more Bitcoin addresses controlled by the same entity. Second, to identify the real-world entity that handles a group of addresses (e.g., services like exchanges), we use a database of tags [6, 7].
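The multi-input clustering heuristic mentioned above assumes that all addresses spent together as inputs of one transaction are controlled by the same entity. A minimal sketch using a union-find structure follows; the function name and the input format (one list of input addresses per transaction) are assumptions made for illustration.

```python
def multi_input_clusters(transactions):
    """Group addresses that co-occur as inputs of the same transaction.

    transactions: iterable of lists, each holding the input addresses
    of one transaction (hypothetical input format).
    """
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # register single-input transactions too
        for addr in inputs[1:]:
            union(first, addr)

    # Collect every address under the root of its set.
    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())
```

In practice the heuristic needs safeguards, since transactions such as CoinJoin deliberately combine inputs from different owners; this sketch does not filter them.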
However, tag databases are incomplete and offer limited coverage, so we complement the tag database with an exchange classifier able to identify addresses that belong to crypto exchanges (the most common service in Bitcoin). Additionally, we use a worklist that prioritizes exploring Bitcoin addresses with a low explosion rank, a value derived from the number of input and output slots of all the transactions of a Bitcoin address. Setting a maximum value for this rank effectively excludes addresses that likely belong to unidentified services. As the output of the exploration, we produce a graph with all the addresses found to be transacting with the seeds until a change of ownership is detected.
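The worklist-driven exploration described above can be sketched as a best-first traversal. The exact definition of the explosion rank is not given here, so the formula below (total input plus output slots over all transactions of an address) is an assumption, as are the callback names `txs_of`, `neighbors_of`, and `is_service`. To model the back-and-forth exploration, `neighbors_of` would return both the senders and the receivers of an address.

```python
import heapq


def explosion_rank(txs):
    """Assumed rank: total input plus output slots over an address's txs.

    txs: iterable of (n_input_slots, n_output_slots) pairs.
    """
    return sum(n_in + n_out for n_in, n_out in txs)


def explore(seeds, txs_of, neighbors_of, is_service, max_rank):
    """Best-first exploration, lowest explosion rank first.

    Services (tagged or classified) and addresses above max_rank stay in
    the graph as nodes but are never expanded.
    """
    heap = [(explosion_rank(txs_of(a)), a) for a in seeds]
    heapq.heapify(heap)
    visited, edges = set(seeds), []
    while heap:
        _, addr = heapq.heappop(heap)
        if is_service(addr):
            continue  # change of ownership: do not expand further
        for nxt in neighbors_of(addr):
            edges.append((addr, nxt))
            if nxt in visited:
                continue
            visited.add(nxt)
            rank = explosion_rank(txs_of(nxt))
            if rank <= max_rank:
                heapq.heappush(heap, (rank, nxt))
    return visited, edges
```

The design choice here is that stopping conditions prune expansion rather than membership, so cashout endpoints such as exchanges still appear in the resulting graph.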
Finally, we perform a directed path search to find attribution points in the graph, i.e., economic relationships between all identified entities, such as withdrawals from malware-controlled addresses to services (cashouts) or deposits from services to malware-controlled addresses (funding deposits).
Additionally, due to the specific properties of DeadBolt transactions, the DeadBolt signature allows us to find more Bitcoin addresses that have received ransom payments from victims of this ransomware family. We use it to traverse the Bitcoin blockchain from two months before the date of the first victim report up to mid-2023, which lets us approximate with high precision the economic impact of DeadBolt on its victims. Because DeadBolt infects NAS servers exposed to the Internet, some ransom notes are publicly available and can be collected by Internet scanners. We use two popular Internet scanners to collect DeadBolt ransom notes as a baseline for comparison. We measure a 39-fold increase in the coverage provided by our signature-based approach compared to that obtained using Internet scanners.
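The directed path search for attribution points described above can be illustrated with a breadth-first search over the exploration graph. This is a sketch under assumptions: the graph is a plain adjacency dictionary following the direction of money flow, and `find_paths` is a hypothetical helper, not the search used in [1].

```python
from collections import deque


def find_paths(graph, sources, targets):
    """Shortest directed path from any source to each reachable target.

    graph: dict mapping an address to the addresses it sends funds to.
    Returns a dict {target: [source, ..., target]}.
    """
    prev = {s: None for s in sources}
    queue = deque(sources)
    paths = {}
    while queue:
        node = queue.popleft()
        if node in targets and node not in sources:
            # Walk predecessor links back to the originating source.
            path, cur = [], node
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            paths[node] = path[::-1]
            continue  # do not search beyond an attribution point
        for nxt in graph.get(node, ()):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return paths
```

Calling `find_paths(graph, malware_addresses, service_addresses)` would surface cashouts, while swapping the two arguments would surface funding deposits from services to malware-controlled addresses.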
About the Speaker
Gibran Alberto Gómez Montez is a Ph.D. student at the Technical University of Madrid (UPM) and a research assistant at the IMDEA Software Institute, working under the supervision of Dr. Juan Caballero. During his Ph.D., he developed novel methodologies for identifying malware families that misuse the TLS security protocol and threat intelligence tools for extracting, collecting, and evaluating IOCs. He has also developed tracing techniques and classifiers for blockchain analysis. On the one hand, these techniques enable tracing malicious operations that abuse the Bitcoin blockchain, allowing the identification of economic relations between them and other entities, which aids their attribution. On the other hand, they lead to better revenue estimations. He obtained a Computer Engineering degree from the Technological University of the Mixteca (UTM) and an M.Sc. in Software and Systems from UPM. His research interests encompass computer security, malware analysis, blockchain technologies, and machine learning.