Project Title: Classification of Firewall Log Files using Supervised Machine Learning Methods
Team Members
Dr. G. Padmavathi, Dean - PSCS, Professor, Department of Computer Science
Dr. S. N. Geethalakshmi, Professor, Department of Computer Science
Ms. A. Roshni, Research Assistant, Centre for Cyber Intelligence, DST - CURIE - AI
Ms. P. Sri Dhanalakshmi, M.Sc Computer Science
Project Summary
A firewall monitors traffic entering and leaving the domain it is meant to protect. It inspects each data packet arriving at either side of the firewall and then decides whether or not to pass it. Some firewalls also provide information about the source and type of traffic entering the environment. Firewalls can improve security further by allowing fine-grained control over which system functions and processes have access to networking resources.
A firewall policy is effective only when it is backed by a reliable logging capability. The logging feature keeps track of how the firewall handles different sorts of traffic. Organizations can use the logs to find details such as source and destination IP addresses, protocols, and port numbers. Monitoring and analyzing log files can help IT organizations improve the reliability of their systems for end users. Log files may also contain malicious text or strings intended to trick users and compromise information. Because firewalls generate a large number of log entries every day, classifying the log files makes them easier to monitor. Classification also helps reduce the number of unnecessary attributes, which results in more efficient performance.
The main intent of this project is to analyze and classify firewall logs, which may contain attributes such as source port, destination port, and bytes sent and received. The process starts with data collection, followed by pre-processing and selection of the main features, which are then used to build a framework based on supervised machine learning algorithms. In classification problems, the selection of appropriate and relevant dataset features plays a critical role. Feature selection approaches are applied using the Weka tool to improve the accuracy of the classification system.
Different classification techniques, namely Support Vector Machine, Naïve Bayes, Logistic Regression, and K-Nearest Neighbor, were adopted, and their performance was analyzed.
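To make the classification step concrete, the following is a minimal sketch of one of the four techniques named above, K-Nearest Neighbor, written in plain Python rather than Weka. It is an illustration under stated assumptions, not the project's implementation: the sample records, the "allow" and "deny" labels, the choice of k = 3, and the function names (fit_min_max, scale, knn_predict, accuracy) are all hypothetical. Only the attribute names (source port, destination port, bytes sent, bytes received) come from the summary.

```python
import math
from collections import Counter

# Hypothetical labelled log records (not project data):
# (source_port, destination_port, bytes_sent, bytes_received), action
TRAIN = [
    ((50321, 443, 1200, 5400), "allow"),
    ((50322, 80, 900, 4100), "allow"),
    ((50400, 443, 1500, 6000), "allow"),
    ((40211, 23, 60, 0), "deny"),
    ((40212, 445, 70, 0), "deny"),
    ((40300, 3389, 66, 0), "deny"),
]


def fit_min_max(rows):
    """Return per-feature minimum and maximum values from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale(row, mins, maxs):
    """Min-max scale one row so that no single attribute dominates the distance."""
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, mins, maxs)
    ]


def knn_predict(train, query, k=3):
    """Label a query record by majority vote among its k nearest training records."""
    mins, maxs = fit_min_max([features for features, _ in train])
    scaled_query = scale(query, mins, maxs)
    distances = sorted(
        (math.dist(scale(features, mins, maxs), scaled_query), label)
        for features, label in train
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of held-out records whose predicted label matches the true label."""
    correct = sum(knn_predict(train, features, k) == label for features, label in test)
    return correct / len(test)


if __name__ == "__main__":
    # Hypothetical held-out records used only to exercise the functions.
    held_out = [
        ((50500, 443, 1100, 5000), "allow"),
        ((40250, 23, 62, 0), "deny"),
    ]
    for features, label in held_out:
        print(features, "predicted:", knn_predict(TRAIN, features), "actual:", label)
    print("accuracy:", accuracy(TRAIN, held_out))
```

The scaling step matters here because port numbers and byte counts live on very different ranges; without it, the largest-valued attribute would dominate the Euclidean distance. Comparing several classifiers, as the project does, would amount to computing the same accuracy measure for each technique on the same held-out records.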