A curated list of malware and benign datasets for security researchers.

Dataset | Description | Link | Public/Private |
---|---|---|---|
MALNET-IMAGE | A large-scale dataset of 1,262,024 malware images across 696 families for research in malware classification. | Link | Public |
Virus-MNIST | A dataset of 51,880 grayscale images of malware, designed for malware classification tasks, with 10 classes. | Link | Public |
Malimg | A dataset of 9,458 images of PE malware, categorized into 25 different families. | Link | Public |
Stamina | A dataset containing 782,224 binary sequences converted to images, designed for malware classification. | Link | Public |
McAfee | A dataset of 367,183 malware samples analyzed by McAfee, categorized into two main types. | Link | Private |
Kancherla | A dataset of 27,000 samples for binary classification of malware versus benign files. | Link | Private |
Choi | A dataset of 12,000 samples, split evenly between malware and benign, for binary classification tasks. | Link | Private |
Fu | A dataset of 7,087 samples from 15 different malware families, designed for multi-class classification. | Link | Private |
Han | A dataset of 1,000 samples across 50 malware families, intended for fine-grained malware classification. | Link | Private |
IoT DDoS | A small dataset containing 365 samples for IoT Distributed Denial of Service (DDoS) attack detection, with 3 distinct attack types. | Link | Public |
DikeDataset | Binaries of PE malware and benign samples. | Link | Public |
Benign-NET | Binaries of PE benign samples. | Link | Public |
Ember | Features of PE malware. | Link | Public |
VirusShare | Binaries of PE malware samples (requires permission for access). | Link | Private |
Microsoft Malware Prediction | PE malware features in CSV format. | Link | Public |
Microsoft Malware Classification Challenge (BIG 2015) | Binaries of PE malware. | Link | Public |
malware_benign_file | Binaries of PE malware and benign samples. | Link | Public |
Dumpware10 | 4,294 RGB images from 3,686 malware samples and 608 benign samples, with images rendered in various width schemes. | Link | Public |
CICIDS 2017 Dataset | Contains network traffic data including benign and malicious samples, with detailed labels for various types of attacks. | Link | Public |
Kaspersky Malware Dataset | A collection of malware samples collected and analyzed by Kaspersky, useful for classification and behavioral analysis. | Link | Private |
CICIDS 2018 Dataset | Network traffic data including benign and malicious samples with detailed attack labels and features. | Link | Public |
AILab Malware Dataset | Provides malware samples for various research purposes, including behavioral analysis and classification. | Link | Private |
MalNet Dataset | A dataset of malware samples collected from various sources, useful for malware detection and analysis. | Link | Public |
Contagio Malware Dump | Contains a variety of malware samples used for malware research and analysis. | Link | Public |
The Microsoft Malware Classification Challenge (BIG 2018) | Contains malware samples and features with labels for various malware types. | Link | Public |
MalMem2021 Dataset | A dataset of memory dumps containing both benign and malicious processes, useful for memory forensics. | Link | Public |
CICIDS 2019 Dataset | Network traffic data including benign and malicious samples with comprehensive attack labels. | Link | Public |
MalwareBazaar | A collection of malware samples shared by the community for research purposes. | Link | Public |
BODMAS | Contains 57,293 malware and 77,142 benign Windows PE files, including binaries (disarmed malware only), feature vectors, and metadata. | Link | Public |
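
Several of the image datasets in the table are described as binaries converted to images. The snippet below is a minimal sketch of that general idea, assuming one byte per grayscale pixel and a fixed row width. The function name `bytes_to_grayscale_rows`, the default width, and the zero padding are illustrative assumptions, not the procedure used by any dataset listed here.

```python
def bytes_to_grayscale_rows(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte to one pixel (0-255) and wrap the pixels into rows of `width`."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Zero-pad the tail so the final row is complete (an assumption of this sketch).
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Harmless stand-in bytes; a real pipeline would read a sample from disk
# inside an isolated analysis environment.
sample = bytes(range(10))
image = bytes_to_grayscale_rows(sample, width=4)
print(len(image), "rows x", len(image[0]), "columns")
```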

Contributions are welcome! Please follow the contribution guidelines for submitting new datasets or updates.

This repository is licensed under a Creative Commons Attribution 4.0 International License.