Feature Selection

Feature selection is the process of selecting a subset of relevant features (variables) from the full set of features in a dataset for use in model building. Removing irrelevant features from the data can improve the accuracy of many models. Feature selection is therefore especially important in predictive linear regression.

Benefits of feature selection:

  • Improved accuracy
  • Reduced model overfitting
  • Reduced model size and computation costs
  • Improved interpretability and understandability of the model

Types of Feature Selection Techniques:

Measure of Impurity

For classification models, especially Decision Trees, the more a feature decreases the impurity (for example, Gini impurity or entropy) at the splits where it is used, the more important the feature is.
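As a minimal sketch of this idea, the snippet below computes the Gini impurity of a set of labels and the impurity decrease obtained by splitting on one feature at a fixed threshold. The helper names (`gini`, `impurity_decrease`), the toy data, and the threshold are illustrative assumptions, not part of any particular library; real decision tree implementations also search over thresholds and sum the decreases across all splits that use a feature.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(values, labels, threshold):
    """Drop in Gini impurity when splitting the samples on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Impurity after the split is the size-weighted average of the children.
    after = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - after


# Hypothetical toy data: feature_a separates the two classes, feature_b does not.
labels = [0, 0, 0, 1, 1, 1]
feature_a = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]
feature_b = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]

gain_a = impurity_decrease(feature_a, labels, threshold=2.0)
gain_b = impurity_decrease(feature_b, labels, threshold=2.0)
print(gain_a, gain_b)  # gain_a is larger, so feature_a ranks as more important
```

In this toy case, splitting on `feature_a` produces two pure groups and removes all of the impurity, while splitting on `feature_b` barely changes it.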

Feature Selection Techniques

Feature Selection is the process of choosing or rejecting features for Machine Learning training by testing feature attributes to determine their value. These techniques are used to perform Dimensionality Reduction and to deal with the Curse of Dimensionality.

  • Low Variance Filter: Compare the variance of every feature across the dataset and discard features whose variance is very low, since near-constant features carry little information.
  • High Correlation Filter: Compute the pair-wise correlation between features; when two features are highly correlated, remove one of them as redundant.
  • Feature Ranking: First rank features according to their significance or contribution to the model’s predictive power using Decision Trees, such as Classification and Regression Trees (CART), then remove the lower-ranked features.
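  • Multicollinearity solutions: Detect predictors that are strongly linearly related to other predictors (for example, by checking the variance inflation factor) and remove or combine them.
  • Recursive Feature Elimination (RFE): Repeatedly fit a model, remove the least important feature or features, and refit until the desired number of features remains.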
  • Factor Analysis: Describes the variability among observed, correlated variables in terms of a smaller number of unobserved, latent variables known as factors.
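  • Independent Component Analysis (ICA): Separates the data into additive components that are as statistically independent of each other as possible.
  • Principal Components Analysis (PCA): Projects the data onto a smaller number of orthogonal directions that capture the most variance.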
  • L1 Regularization: The L1 penalty drives some coefficients to exactly zero, so keeping only the features with non-zero coefficients acts as Feature Selection. L2 regularization is not suited to feature selection, because it shrinks the weights close to zero without making them exactly zero.
  • Neighborhood Component Analysis: A Supervised Learning, non-parametric method for selecting features with the goal of maximizing the prediction accuracy of Classification and Regression tasks.
  • Relief Algorithm: Calculates a score for each feature, then ranks the features by score so that the top-scoring ones can be selected. The scores are sensitive to feature interactions, so features that are informative only in combination with others can still rank highly.

Feature Selection vs. Dimensionality Reduction:

  • Feature selection includes or excludes the given features without changing them. It is primarily focused on removing non-informative or redundant predictors from the model.
  • Dimensionality reduction transforms the features into a lower-dimensional space. It reduces the number of attributes by creating new combinations of the original attributes.
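
To illustrate why L1 regularization can select features while L2 cannot, the sketch below uses a simplified special case: when the predictors are orthonormal, the L1-penalized solution is the soft-thresholded least-squares coefficient, and the L2-penalized solution is the same coefficient divided by (1 + lambda). This assumes the objectives (1/2)||y - Xb||^2 + lambda * ||b||_1 and (1/2)||y - Xb||^2 + (lambda/2) * ||b||^2. The coefficient values and the penalty strength are made up for illustration; general designs need an iterative solver.

```python
def soft_threshold(coef, lam):
    """L1 (lasso) shrinkage in the orthonormal case: move the coefficient
    toward zero by lam, and set it to exactly zero if it is within lam of zero."""
    if coef > lam:
        return coef - lam
    if coef < -lam:
        return coef + lam
    return 0.0


def ridge_shrink(coef, lam):
    """L2 (ridge) shrinkage in the orthonormal case: rescale the coefficient.
    A non-zero coefficient stays non-zero for any finite lam."""
    return coef / (1.0 + lam)


# Hypothetical least-squares coefficients for four features.
ols = {"x1": 2.5, "x2": -0.3, "x3": 0.1, "x4": -1.8}
lam = 0.5  # assumed penalty strength

lasso = {name: soft_threshold(c, lam) for name, c in ols.items()}
ridge = {name: ridge_shrink(c, lam) for name, c in ols.items()}

# Feature selection with L1: keep the features whose coefficient is non-zero.
selected = [name for name, c in lasso.items() if c != 0.0]
print(selected)  # ['x1', 'x4']
print(ridge)     # every coefficient is smaller in magnitude but still non-zero
```

The two weak coefficients fall below the penalty and are zeroed out by the L1 rule, while the L2 rule only scales them down, which matches the distinction drawn in the list above.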