streamlit run app.py
Mohamed Benchikh
- Static: Features are extracted from PE file headers (mainly the Optional Header), YARA rules, and digital signatures.
- Dynamic: Features are the API calls traced using Cuckoo Sandbox.
- Static
Malware samples were acquired from MalwareBazaar, while benign samples were acquired from multiple online hosting websites (e.g. CNET). We then used the pefile module in Python to parse the PE headers and extract relevant features (chosen using benchmarks). We also used YARA capabilities, digital signatures, and packing as features.
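To illustrate the kind of header fields involved, here is a minimal sketch that reads a few COFF and Optional Header values straight from raw bytes. It uses only the standard library `struct` module as a stand-in, not pefile, and it is not the project's actual extraction code. The function name `extract_header_features` and the choice of fields are assumptions made for this example.

```python
import struct


def extract_header_features(data: bytes) -> dict:
    """Read a few COFF / Optional Header fields from raw PE bytes (illustrative only)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew, the offset of the PE signature, is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    # The first Optional Header fields share one layout in PE32 and PE32+.
    (magic, major_linker, minor_linker, size_of_code,
     size_init_data, size_uninit_data, entry_point) = struct.unpack_from(
        "<HBBIIII", data, pe_offset + 24)
    return {
        "Machine": machine,
        "NumberOfSections": n_sections,
        "Characteristics": characteristics,
        "SizeOfOptionalHeader": opt_size,
        "Is64Bit": 1 if magic == 0x20B else 0,  # 0x10B = PE32, 0x20B = PE32+
        "MajorLinkerVersion": major_linker,
        "SizeOfCode": size_of_code,
        "SizeOfInitializedData": size_init_data,
        "AddressOfEntryPoint": entry_point,
    }
```

A call such as `extract_header_features(open(path, "rb").read())` returns a flat dictionary that can serve as one row of a feature table. pefile exposes the same fields through attributes such as `pe.OPTIONAL_HEADER.SizeOfCode`.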
- Dynamic
We adapted the APIMDS dataset from hksecurity, converting it from a dataset of API call sequences into a dataset of binary values over a predetermined set of features.
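The conversion from call sequences to binary values can be sketched as follows. This is a hedged illustration: it assumes each feature simply records whether a given API appears in the trace. The feature list `FEATURE_APIS` and the function name `to_binary_features` are hypothetical and are not taken from the project.

```python
# Hypothetical, fixed feature set; the real list of APIs is not given here.
FEATURE_APIS = ["CreateFileW", "RegSetValueExW", "WriteFile", "VirtualAlloc"]


def to_binary_features(api_sequence, feature_apis=FEATURE_APIS):
    """Map a sequence of API call names to a 0/1 vector over feature_apis."""
    seen = set(api_sequence)  # order and repetition are deliberately discarded
    return [1 if api in seen else 0 for api in feature_apis]


trace = ["CreateFileW", "WriteFile", "WriteFile", "CloseHandle"]
vector = to_binary_features(trace)
```

Calls outside the predetermined list (`CloseHandle` in this trace) are ignored, so every sample yields a vector of the same length regardless of how long its trace is.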
We compared multiple algorithms using stratified 10-fold cross-validation and settled on the Extreme Gradient Boosting (XGBoost) classification algorithm, as it had the highest F1 score.
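For readers unfamiliar with the evaluation procedure, here is a minimal standard-library sketch of stratified k-fold splitting and F1 scoring. It is not the project's evaluation code. The helper names (`stratified_folds`, `f1_score`, `cross_validated_f1`) and the `fit_predict` callback are assumptions for this example, and the split is not shuffled. In practice a library such as scikit-learn (`StratifiedKFold`) would typically handle this.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Split sample indices into k folds with roughly equal class ratios."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_validated_f1(fit_predict, X, y, k=10):
    """Average F1 over k stratified folds.

    fit_predict(X_train, y_train, X_test) must return predictions for X_test.
    """
    scores = []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in test_idx])
        scores.append(f1_score([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)
```

Each candidate algorithm would be wrapped as a `fit_predict` callable and scored on the same folds, so the mean F1 values are directly comparable.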