CAN TRANSFORMER MODELS DETECT SCAM TOKENS? A STUDY ON ERC-20 SMART CONTRACTS
DOI: https://doi.org/10.31539/h46hgh88
Abstract
The rapid growth of the blockchain ecosystem has produced thousands of tokens built on the ERC-20 standard. This growth drives financial innovation, but it also raises the risk of fraud through smart contracts that hide harmful mechanisms such as backdoors, bot blacklisting, and fee manipulation. This study proposes an end-to-end Transformer-based classification approach for detecting ERC-20 scam tokens that uses Solidity source code as the only input feature. Three models are evaluated: CodeBERT (microsoft/codebert-base), RoBERTa (roberta-base), and GraphCodeBERT (microsoft/graphcodebert-base). The dataset comprises 60,000 ERC-20 contracts drawn from the ASSERT-KTH/DISL repository, of which 30,000 were labeled semi-automatically using rule-based static code analysis for seven scam types: Honeypot, High Tax, Balance Manipulation, Blacklist, Hidden Owner, Rug Pull, and Unlimited Mint. In binary classification, GraphCodeBERT performs best, with an F1-score of 0.9295 and an AUC of 0.9808. In multilabel classification, RoBERTa leads on F1-score (0.8681), while GraphCodeBERT leads on AUC (0.9672). The Blacklist label remains a particular challenge, with an F1-score of only 0.61 to 0.64 caused by extreme class imbalance. The results demonstrate that Transformer representations of Solidity source code are informative enough to automatically distinguish scam contracts from legitimate ones.
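The semi-automatic, rule-based labeling step mentioned above can be pictured with a minimal sketch. The abstract does not list the actual rules, so the regular expressions below are hypothetical placeholders chosen for illustration, and only three of the seven labels have a rule here. The helper names `label_contract` and `is_scam` are likewise assumptions, not part of the study.

```python
import re
from typing import List

# Label order follows the seven scam types named in the abstract.
LABELS = ["Honeypot", "High Tax", "Balance Manipulation", "Blacklist",
          "Hidden Owner", "Rug Pull", "Unlimited Mint"]

# Hypothetical patterns for illustration only; NOT the rules used in the study.
RULES = {
    "Blacklist": re.compile(r"blacklist", re.IGNORECASE),
    "High Tax": re.compile(r"function\s+set\w*(Fee|Tax)\w*\s*\(", re.IGNORECASE),
    "Unlimited Mint": re.compile(
        r"function\s+mint\s*\([^)]*\)[^{]*\bonlyOwner\b"),
}

def label_contract(source: str) -> List[int]:
    """Return a multi-hot vector over LABELS; all zeros means no rule fired."""
    return [
        1 if label in RULES and RULES[label].search(source) else 0
        for label in LABELS
    ]

def is_scam(vector: List[int]) -> int:
    """Collapse the multilabel vector into the binary target (1 = scam)."""
    return int(any(vector))

# Toy Solidity fragment, written for this example.
sample = """
contract Token {
    mapping(address => bool) private _blacklist;
    function mint(address to, uint256 amount) external onlyOwner { }
}
"""

vector = label_contract(sample)
print(dict(zip(LABELS, vector)), "binary:", is_scam(vector))
```

Under this sketch, the same vector serves both tasks reported in the abstract: the full vector is the multilabel target, and `is_scam` derives the binary target from it.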
License
Copyright (c) 2026 Andhi Saputro, Makhsun Makhsun, Ahmad Musyafa

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

