Project 5: ML For Security Constructing Evading Network Traf
The goal of this project is to introduce students to machine learning techniques and methodologies that help to differentiate between malicious and legitimate network traffic. Specifically, students will use a machine learning approach to create a model that learns normal network traffic, and then learn how to blend attack traffic to resemble normal traffic in order to bypass the learned model.
Students are provided with resources including the PAYL model implementation, datasets, and code to train and test models. The project involves training a payload-based intrusion detection system (IDS), tuning parameters to achieve a high true positive rate, and then attempting to evade detection through polymorphic blending techniques. The tasks include training the model, validating attack detection, creating evasive payloads, and testing how well the attack can bypass the model.
This assignment explores the application of machine learning techniques, specifically payload-based anomaly detection models like PAYL, in network security. It emphasizes the importance of developing robust models capable of accurately distinguishing normal from malicious traffic, and also examines adversarial techniques that malicious actors may use to evade such detection systems.
Payload-based intrusion detection systems (IDS) analyze the byte frequency distribution of packet payloads to identify anomalies that indicate malicious activity. The PAYL model, described by Wang and Stolfo (2004), builds a profile of byte frequency statistics from normal traffic and uses a simplified Mahalanobis distance to score how far a new payload deviates from that profile. Payloads whose distance exceeds a threshold are classified as anomalous. Training therefore means fitting the profile to normal traffic and then tuning parameters such as the threshold and the smoothing factor to optimize detection. A high true positive rate (TPR), often above 96%, indicates that the model catches most attacks. The false alarm rate has to be checked separately, because a threshold low enough to flag every attack can also flag legitimate traffic.
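The scoring step can be sketched in a few lines of Python. This is a minimal illustration, not the project's PAYL implementation: the function names, the default smoothing value, and the single global profile are assumptions made here, and the original PAYL additionally keeps separate profiles per port and payload length. The distance is the simplified Mahalanobis form, the sum over all 256 byte values of |x_i - mean_i| / (std_i + smoothing).

```python
import math
from collections import Counter


def byte_frequencies(payload: bytes) -> list[float]:
    """Return the relative frequency of each of the 256 byte values."""
    counts = Counter(payload)
    total = len(payload) or 1  # avoid division by zero on an empty payload
    return [counts.get(b, 0) / total for b in range(256)]


def train(payloads: list[bytes]) -> tuple[list[float], list[float]]:
    """Compute per-byte mean and standard deviation over normal payloads."""
    vectors = [byte_frequencies(p) for p in payloads]
    n = len(vectors)
    mean = [sum(v[i] for v in vectors) / n for i in range(256)]
    std = [
        math.sqrt(sum((v[i] - mean[i]) ** 2 for v in vectors) / n)
        for i in range(256)
    ]
    return mean, std


def distance(payload: bytes, mean: list[float], std: list[float],
             smoothing: float = 0.001) -> float:
    """Simplified Mahalanobis distance between a payload and the profile."""
    freq = byte_frequencies(payload)
    return sum(abs(freq[i] - mean[i]) / (std[i] + smoothing) for i in range(256))


def is_anomalous(payload: bytes, mean: list[float], std: list[float],
                 threshold: float, smoothing: float = 0.001) -> bool:
    """Flag the payload when its distance exceeds the threshold."""
    return distance(payload, mean, std, smoothing) > threshold
```

The smoothing factor keeps the denominator non-zero for byte values that never vary in the training data, which is why it interacts with the threshold during tuning: a smaller smoothing factor inflates the distance of any payload that contains bytes unseen in training.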
However, attackers continuously develop techniques to bypass detection, including polymorphic blending attacks. These attacks transform a payload so that its byte distribution resembles that of normal traffic, which lets it slip past the model. Fogla et al. (2006) describe such attacks and show how byte frequency profiles can be manipulated using substitution tables, encryption, XOR operations, and padding to generate stealthy attack payloads.
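A substitution table can be illustrated with a simple rank-matching scheme: the most frequent attack byte is mapped to the most frequent normal byte, the second to the second, and so on. The sketch below is a simplified one-to-one variant written for illustration. The function names are hypothetical, and it assumes the attack uses no more distinct byte values than the normal sample, so it should not be read as the exact algorithm from Fogla et al. (2006).

```python
from collections import Counter


def build_substitution_table(attack: bytes, normal: bytes) -> dict[int, int]:
    """Map each attack byte to the normal byte with the same frequency rank."""
    attack_ranked = [b for b, _ in Counter(attack).most_common()]
    normal_ranked = [b for b, _ in Counter(normal).most_common()]
    if len(attack_ranked) > len(normal_ranked):
        raise ValueError("attack uses more distinct bytes than the normal sample")
    return dict(zip(attack_ranked, normal_ranked))


def substitute(data: bytes, table: dict[int, int]) -> bytes:
    """Apply a byte-to-byte table to every byte of the data."""
    return bytes(table[b] for b in data)


def invert(table: dict[int, int]) -> dict[int, int]:
    """Build the reverse table that a decoder would use to recover the attack."""
    return {v: k for k, v in table.items()}


attack = b"\x90\x90\x90\x90\xcc\xcc\x31"
normal = b"GET /index.html HTTP/1.1 Host: example.com"
table = build_substitution_table(attack, normal)
blended = substitute(attack, table)
recovered = substitute(blended, invert(table))
```

Because the mapping is one-to-one, the blended payload contains only byte values that occur in normal traffic, and a small decoder carrying the inverse table can restore the original bytes on the target.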
The project proceeds in three phases: training, validation, and attack evasion. Students first run PAYL in training mode on a dataset of normal traffic to calibrate the model. The two key parameters, the threshold and the smoothing factor, are adjusted to maximize the true positive rate. In the testing phase, attack payloads are run against the trained model to verify that it detects them and to refine the parameters further. Once a robust model is established, the focus shifts to generating polymorphic attack payloads using substitution tables and padding, following the techniques described in the literature. These payloads aim to replicate the byte frequency distribution of normal traffic and thereby bypass the model.
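Parameter tuning of this kind amounts to a small grid search. The sketch below assumes that the true positive rate means the fraction of attack payloads flagged and that a scoring function taking a payload and a smoothing factor is available. The name `score_fn` is a placeholder for whatever the trained model exposes, not part of the project code. It also reports the fraction of normal payloads flagged, so that a threshold is not chosen on TPR alone.

```python
from typing import Callable, Sequence

# Hypothetical signature: (payload, smoothing) -> anomaly score
ScoreFn = Callable[[bytes, float], float]


def rate_flagged(scores: Sequence[float], threshold: float) -> float:
    """Fraction of scores strictly above the threshold."""
    return sum(1 for s in scores if s > threshold) / len(scores)


def sweep(score_fn: ScoreFn,
          attacks: Sequence[bytes],
          normals: Sequence[bytes],
          smoothing_values: Sequence[float],
          thresholds: Sequence[float]) -> list[tuple[float, float, float, float]]:
    """Return (smoothing, threshold, TPR, FPR) for every parameter combination."""
    results = []
    for smoothing in smoothing_values:
        # Scores depend on smoothing only, so compute them once per value.
        attack_scores = [score_fn(p, smoothing) for p in attacks]
        normal_scores = [score_fn(p, smoothing) for p in normals]
        for threshold in thresholds:
            tpr = rate_flagged(attack_scores, threshold)
            fpr = rate_flagged(normal_scores, threshold)
            results.append((smoothing, threshold, tpr, fpr))
    return results
```

Recording every combination, rather than only the best one, also produces the table of parameter choices and performance metrics that the project asks students to document.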
The challenge is to implement substitution and padding algorithms that modify attack payloads until their byte frequency distribution fits the normal traffic profile, meaning that the model's distance score stays below the tuned threshold. Successful evasion not only demonstrates the vulnerabilities of payload-based detection but also underscores the importance of developing more sophisticated models that consider temporal and protocol-specific features.
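The padding step can be sketched as follows. The idea is to find the shortest total length at which no byte of the substituted payload is over-represented relative to the normal profile, then append enough copies of every under-represented byte to reach the target frequencies. The function name and the dictionary representation of the profile are assumptions made for this example, and the rounding means the final distribution only approximates the target.

```python
from collections import Counter


def compute_padding(payload: bytes, normal_freq: dict[int, float]) -> bytes:
    """Return padding bytes that pull the payload toward the normal profile."""
    if not payload:
        return b""
    counts = Counter(payload)
    if any(normal_freq.get(b, 0.0) <= 0.0 for b in counts):
        raise ValueError("payload contains bytes absent from the normal profile")
    # Smallest total length at which no payload byte exceeds its target share.
    target_len = max(counts[b] / normal_freq[b] for b in counts)
    padding = bytearray()
    for b, f in sorted(normal_freq.items()):
        needed = round(target_len * f) - counts.get(b, 0)
        if needed > 0:
            padding.extend([b] * needed)
    return bytes(padding)


# Toy profile: 'a' should make up half the bytes, 'b' and 'c' a quarter each.
profile = {ord("a"): 0.5, ord("b"): 0.25, ord("c"): 0.25}
pad = compute_padding(b"bb", profile)
padded = b"bb" + pad
```

In this toy case the payload `b"bb"` needs a total length of 8 to bring `b` down to a quarter of the bytes, so the padding consists of four `a` bytes and two `c` bytes. A model that looks only at single-byte frequencies does not see where the padding sits, which is why appending it in one block is enough for this kind of detector.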
Throughout the project, students must document their parameter selection, model performance metrics, and the effectiveness of evasive payloads. Proper validation involves confirming that attack payloads are rejected by the model, indicating detection, while evasive payloads successfully bypass detection and are accepted. The culmination of the project includes analyzing the implications of these techniques for real-world network security, emphasizing the need for layered defenses and adaptive machine learning systems to combat evolving threats.
References
- Fogla, P., Sharif, M., Perdisci, R., Kolesnikov, O., & Lee, W. (2006). Polymorphic Blending Attacks. USENIX Security Symposium.
- Wang, K., & Stolfo, S. J. (2004). Anomalous Payload-based Network Intrusion Detection. RAID.
- Denning, D. E. (1987). An intrusion-detection model. IEEE Transactions on Software Engineering, 13(2), 222–232.
- Barford, P., & Yegneswaran, R. (2010). An overview of recent advances in intrusion detection. Computer, 43(4), 36–43.
- Lunt, T. F. (1993). Detecting intruders in computer systems. Technical Report, MITRE Corporation.
- Axelsson, S. (2000). Intrusion detection systems: A survey and taxonomy. Technical report, Department of Computer Engineering, Chalmers University of Technology.
- Song, S., & Paxon, V. (2005). Hidden Markov models for network intrusion detection. Proceedings of IEEE Symposium on Security and Privacy.
- Ning, P., & Kearns, M. (2007). Modeling malware evolution with Markov Chains. USENIX Security Symposium.
- Liao, Y., et al. (2013). Intrusion detection techniques in wireless networks: A review. Journal of Network and Computer Applications, 36(1), 16–24.
- Sommer, R., & Paxson, V. (2010). Outside the Closed World: On Using Machine Learning for Network Intrusion Detection. IEEE Symposium on Security and Privacy.