A Hybrid Approach for Feature Extraction of Multi-Class Dataset in IDS

Jhansi Rani Mettu

doi:10.52710/cfs.84

Published: Dec 21, 2024

Keywords:

Intrusion Detection System (IDS), Hybrid Feature Extraction, K-Best, Random Forest Importance, Multi-Class Classification

Jhansi Rani Mettu, Dhanpratap Singh

Abstract

Intrusion Detection Systems (IDS) are essential for protecting networks against cyber threats, particularly when the data spans many classes and contains redundant features that can slow detection down. This work presents a hybrid feature extraction approach that combines the K-Best and Random Forest Importance methods to improve IDS performance. Features of the IoT-23 dataset were first normalized and labelled; Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), and the proposed hybrid method were then applied. The hybrid method selected important features such as flow duration, packet length, protocol type, and response counts, which led to better classification results. Three models (XGBoost, Random Forest, and Naive Bayes) were evaluated using an 80-20 train-test split. The comparison showed that the hybrid method performed best, achieving 99% accuracy and substantial gains in precision, recall, and F1-score. In particular, XGBoost proved to be the best model, showing impressive speed with the hybrid feature set. PCA, LDA, and ICA, by contrast, gave only average results, which underlines the value of combining different feature selection methods. The results indicate that the hybrid method can handle feature redundancy and improve IDS performance, making it a good candidate for real-world use. Future work could examine how the method performs with larger datasets and additional models.
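
As an illustration only, the hybrid idea described above might be sketched with scikit-learn as follows. The abstract does not say how the K-Best and Random Forest Importance rankings are merged, so the union of the two top-k sets used here is an assumption, as are the helper name `hybrid_select`, the ANOVA F-score as the K-Best scoring function, the value `k=8`, and the synthetic data standing in for IoT-23. XGBoost is left out to keep the sketch dependent on scikit-learn alone.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split


def hybrid_select(X, y, k=8, seed=0):
    """Return sorted indices of features ranked in the top k by either method.

    Assumption: the two rankings are merged by union; the paper's actual
    combination rule is not stated in the abstract.
    """
    # Filter ranking: univariate ANOVA F-score per feature (K-Best).
    kbest = SelectKBest(score_func=f_classif, k=k).fit(X, y)
    kbest_idx = set(np.flatnonzero(kbest.get_support()))

    # Embedded ranking: impurity-based importances from a random forest.
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    rf_idx = set(np.argsort(forest.feature_importances_)[-k:])

    return sorted(int(i) for i in kbest_idx | rf_idx)


# Synthetic multi-class data with redundant features; NOT the IoT-23 dataset.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, n_redundant=4,
    n_classes=3, random_state=0,
)

# 80-20 split, as in the abstract. Selection is fitted on the training
# part only so the test part does not influence which features are kept.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
selected = hybrid_select(X_train, y_train, k=8)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train[:, selected], y_train)
pred = clf.predict(X_test[:, selected])

print("features kept:", selected)
print("accuracy:", accuracy_score(y_test, pred))
print("macro F1:", f1_score(y_test, pred, average="macro"))
```

A union keeps any feature that either ranking considers important, so the result holds between `k` and `2k` features; an intersection or a weighted rank average would be equally plausible readings of "hybrid" and would yield a smaller set.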

Issue

Volume 2024, Issue 8

Section

Articles
