INTELLIGENT MALWARE DETECTION USING NEURAL ARCHITECTURES: A COMPARATIVE STUDY OF CNN, LSTM, FNN, AND BI-LSTM

Nabia Shaheen; Sagar Lohana; Muhammad Ramzan

Keywords:

Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), Feedforward Neural Networks (FNN), Bidirectional LSTM (Bi-LSTM), machine learning, deep learning

Abstract

Identifying and classifying malware remains a major problem in cyber security because new variants appear constantly. This paper discusses our approach to developing an accurate malware detection method using the complementary strengths of artificial intelligence techniques. Detection has become harder with the rise of sophisticated malicious programs, such as mutating malware, that evade signature-based antivirus tools. The paper applies four deep learning models, Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), Feedforward Neural Networks (FNN), and Bidirectional LSTM (Bi-LSTM), to malware detection through the analysis of API call sequences from Windows executables. Pre-processing involved two steps: tokenizing the API calls and separating the malware samples from the encoded binary values. The models were trained with suitable loss functions, optimizers, and validation methods so that they could identify patterns in the sequence data. The CNN model was the most accurate, at almost 92%, owing to its Conv1D and MaxPooling layers, which capture spatial patterns well. The LSTM and Bi-LSTM models also performed well, with 90% accuracy. The FNN model, the simplest architecture, had the lowest accuracy, though still above 80%. The results indicate that CNN is the best model for this malware detection task, since it combines high accuracy with fast training. The LSTM and Bi-LSTM models perform strongly but take longer to train. The study suggests that deep learning networks, when trained well, can discern and classify malicious API call patterns. Future work should explore ensemble models, data augmentation, and hyperparameter tuning to improve accuracy and generalization. Such research will make it feasible to design and implement a strong and reliable cyber security architecture that can stand up to today's most powerful, cutting-edge malware threats.
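
The tokenization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `build_vocab` and `encode`, the reserved ids, the fixed length of 5, and the two example traces are all assumptions made for illustration. It only shows how API call names might be mapped to integer ids and padded to a fixed length before being fed to a sequence model.

```python
from collections import Counter

PAD_ID = 0  # assumed id reserved for padding
OOV_ID = 1  # assumed id for API calls absent from the vocabulary


def build_vocab(sequences):
    """Map each API call name to an integer id, most frequent first."""
    counts = Counter(call for seq in sequences for call in seq)
    # Sort by descending frequency, then by name, so ids are deterministic.
    ordered = sorted(counts, key=lambda call: (-counts[call], call))
    return {call: idx for idx, call in enumerate(ordered, start=2)}


def encode(sequence, vocab, max_len):
    """Convert API call names to ids, then truncate or right-pad to max_len."""
    ids = [vocab.get(call, OOV_ID) for call in sequence[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical traces: the API names are real Windows calls,
# but the sequences themselves are invented for this example.
traces = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "ReadFile", "ReadFile", "CloseHandle"],
]

vocab = build_vocab(traces)
encoded = [encode(trace, vocab, max_len=5) for trace in traces]
print(encoded)
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer followed by Conv1D or LSTM layers. The padding length and vocabulary size would in practice be chosen from the dataset.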