  • Essay / Feature Selection Technique in Network Traffic Dataset

    Security threats are now a major concern in the digital world. The Internet, computers, mobile phones and tablets have become ubiquitous, and cyberattacks have grown rapidly alongside them. Cyberattacks take many forms, including spoofing, sniffing, denial of service, phishing, evil twins, pharming, click fraud and malware. Malware is harmful to both the computer and the network. These attacks have compromised systems, stolen valuable information and destroyed important structures, producing huge losses, with each incident costing an average of $345.

    The growing use of the Internet is not the only reason for this threat; the number of new malware samples is another. More than 317 million new pieces of malware were created in 2014. Conventional antivirus and intrusion detection systems cannot detect zero-day attacks. According to the Symantec Internet Security Threat Report 2010, there are more than 5 million pieces of malware circulating on the Internet. As a result, security specialists are dedicated to developing effective malware detection methods.

    In this work, we describe several feature selection techniques for detecting malware in a network traffic dataset with a machine learning algorithm, because feature selection is a very important task in malware detection. Malware can be detected using static and dynamic features. Antivirus software is built on malware signatures, so it fails when a zero-day malware attack occurs. A malware detection system captures the entire network traffic dataset to distinguish between benign and malicious software (normal and suspicious activities). The network traffic dataset contains many packets with a very large number of features. Some features may be very important, but others may not be relevant for decision making.
    Irrelevant features, however, increase processing time and decrease the efficiency of the malware detection system. Therefore, the main objective of a feature selection technique is to reduce the dimensionality of the feature space by removing redundant and irrelevant features from the network traffic dataset.

    Many approaches have been developed to cope with the growing number of malware outbreaks every day. Hansen et al. introduced an approach based on a Random Forests classifier to detect and classify large amounts of malware from known or unknown malware families. This approach significantly reduces the feature space. Cuckoo sandbox is also used to obtain behavioral traces of the analyzed samples, and the approach achieves a high malware detection and family classification rate. Tian et al. used API call logs to distinguish malware from cleanware by examining behavioral characteristics. This work also proposed both the.
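
To make the goal of feature selection stated earlier more concrete (reducing dimensionality by removing redundant and irrelevant features), here is a minimal sketch of a simple filter-style selector. It is not the method of any work cited in this essay. It assumes that a constant feature can be treated as irrelevant and that a feature almost perfectly correlated with one already kept can be treated as redundant. The feature names, the toy flow records and the thresholds are all hypothetical, chosen only for illustration.

```python
import math


def variance(column):
    """Population variance of a list of numbers."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return cov / norm if norm else 0.0


def select_features(rows, names, min_variance=1e-9, max_corr=0.95):
    """Return the feature names kept after two simple filters.

    Assumption: a (near-)constant feature is irrelevant, and a feature whose
    absolute correlation with an already kept feature reaches max_corr is
    redundant. Both thresholds are illustrative defaults.
    """
    columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}
    kept = []
    for name in names:
        column = columns[name]
        if variance(column) <= min_variance:
            continue  # constant across all flows: carries no information
        if any(abs(pearson(column, columns[k])) >= max_corr for k in kept):
            continue  # duplicates information in a feature we already kept
        kept.append(name)
    return kept


# Hypothetical toy flow records: one row per flow, one column per feature.
FEATURES = ["duration", "bytes_sent", "packets_sent", "protocol_version"]
FLOWS = [
    [1.2, 1000.0, 10.0, 4.0],
    [0.4, 2000.0, 20.0, 4.0],
    [3.1, 1500.0, 15.0, 4.0],
    [0.9, 4000.0, 40.0, 4.0],
    [2.2, 500.0, 5.0, 4.0],
]

selected = select_features(FLOWS, FEATURES)
print(selected)  # prints ['duration', 'bytes_sent']
```

In this toy data, `protocol_version` is dropped because it never changes, and `packets_sent` is dropped because it is exactly proportional to `bytes_sent`. A filter like this ignores the class labels. Techniques that score features against the malware/benign label would need labeled traffic and are outside this sketch.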