Filter-Based Feature Selection Techniques
When a dataset contains many predictors (or features), it is often best to keep the most informative ones and discard the rest. Most approaches to this fall into two groups: feature selection techniques and dimensionality reduction techniques. Dimensionality reduction techniques project the data onto a lower-dimensional space, aiming to retain only the information needed for the analysis. Feature selection techniques, on the other hand, provide a statistical justification for preferring one predictor over another while keeping the original features intact, as the short sketch below illustrates.
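The following is a minimal illustrative sketch of that distinction on a toy matrix, not part of the article's own workflow. The choice of PCA as the dimensionality reduction method and of SelectKBest with the ANOVA F-test (f_classif) as the selector is my own assumption here, made purely to contrast the two families.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                      # 100 samples, 5 original features
y = (X[:, 0] + 0.1 * rng.normal(size=100) > 0).astype(int)

# Dimensionality reduction: builds 2 new composite axes from all 5 features.
X_pca = PCA(n_components=2).fit_transform(X)

# Feature selection: keeps 2 of the original 5 columns, unchanged.
X_sel = SelectKBest(score_func=f_classif, k=2).fit_transform(X, y)

print(X_pca.shape, X_sel.shape)                    # both (100, 2), but with different meanings
```

Both outputs have the same shape, yet only the selected columns remain directly interpretable as the original measurements.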

This article focuses exclusively on filter-based feature selection techniques. Feature selection is a broad topic, so to keep the article to a manageable length we will cover only the four most widely used filter-based techniques. As a running example, we will evaluate these methods on the Breast Cancer Wisconsin (Diagnostic) dataset available from the UCI Machine Learning Repository.
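As a convenience, the same Breast Cancer Wisconsin (Diagnostic) data hosted at UCI also ships with scikit-learn, so one way to load the running example looks roughly like this (a sketch, assuming scikit-learn is installed):

```python
from sklearn.datasets import load_breast_cancer

# Load the Wisconsin Diagnostic data as a pandas DataFrame / Series pair.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target                      # 569 samples, 30 numeric predictors

print(X.shape)                                     # (569, 30)
print(y.value_counts().to_dict())                  # class counts: malignant vs benign
```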
Feature Selection
Feature selection is an important phase in the machine learning workflow because it improves model performance and reduces model complexity. Its primary goal is to identify a subset of relevant features that contributes the most to prediction accuracy…
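To make that goal concrete, here is a minimal filter-style sketch on the running example. The scorer used (the ANOVA F-test via f_classif) and the choice of k=10 are assumptions for illustration only, not necessarily one of the four techniques discussed later.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

# Load the Wisconsin Diagnostic data with named columns.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Score each feature against the target and keep the 10 highest-scoring ones.
selector = SelectKBest(score_func=f_classif, k=10).fit(X, y)
kept = X.columns[selector.get_support()]           # names of the 10 retained predictors

print(kept.tolist())
print(sorted(zip(selector.scores_, X.columns), reverse=True)[:5])  # top-5 by score
```

The key property of a filter method is visible here: each feature is scored against the target independently of any downstream model, and the lowest-scoring features are simply dropped.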