Feature Selection Techniques

Feature Selection is the process of choosing or rejecting features for Machine Learning training by testing feature attributes to determine their value to the model. These techniques are used to perform Dimensionality Reduction and to cope with the Curse of Dimensionality. Common techniques are listed below, with short code sketches for several of them after the list.

  • Low Variance Filter: compare the variance of each feature's distribution across the dataset and discard features whose variance falls below a threshold, since near-constant features carry little information.
  • High Correlation Filter: compute pair-wise correlations between features and, for each highly correlated pair, drop one of the two redundant features.
  • Feature Ranking: first rank features by their significance or contribution to the model's predictive power, for example with Decision Trees such as Classification and Regression Trees (CART), then remove the lowest-ranked features.
  • Multicollinearity solutions: detect features that are (near-)linear combinations of other features, for example via the Variance Inflation Factor (VIF), and remove or combine them.
  • Recursive Feature Elimination (RFE): repeatedly fit a model, rank the features by importance, and eliminate the weakest until the desired number of features remains.
  • Factor Analysis: describes variability among correlated, observed variables in terms of a smaller number of unobserved latent variables known as factors.
  • Independent Component Analysis (ICA): decomposes the data into additive components that are statistically independent of one another.
  • Principal Component Analysis (PCA): projects the data onto the orthogonal directions that capture the most variance.
  • L1 Regularization: because the L1 penalty drives some coefficients exactly to zero, keeping only the features with non-zero coefficients performs Feature Selection. L2 regularization, however, can't be used for Feature Selection because it only shrinks weights close to zero without making them exactly zero.
  • Neighborhood Component Analysis (NCA): a supervised, non-parametric method for selecting features with the goal of maximizing the prediction accuracy of Classification and Regression tasks.
  • Relief Algorithm: calculates a score for each feature, then ranks the features by score and keeps the top-scoring ones. Because scores are computed from differences between neighboring instances, Relief is sensitive to feature interactions that univariate filters miss.
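
A minimal sketch of the Low Variance Filter using scikit-learn's VarianceThreshold; the toy data and the 0.1 threshold are illustrative assumptions, not values from any particular dataset.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Toy data: the third column is nearly constant and carries little information.
X = np.array([[1.0, 10.0, 0.0],
              [2.0, 12.0, 0.0],
              [3.0,  9.0, 0.1],
              [4.0, 11.0, 0.0]])

# Drop every feature whose variance falls below the (arbitrary) threshold.
selector = VarianceThreshold(threshold=0.1)
X_reduced = selector.fit_transform(X)

print(selector.get_support())  # mask of kept features -> [True, True, False]
print(X_reduced.shape)         # (4, 2)
```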
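
For the High Correlation Filter, a sketch with pandas; the synthetic data and the 0.9 cutoff are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.normal(size=200)
df = pd.DataFrame({
    "a": a,
    "b": 2.0 * a + rng.normal(scale=0.01, size=200),  # near-duplicate of "a"
    "c": rng.normal(size=200),
})

# Upper triangle of the absolute correlation matrix, so each pair is checked once.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Remove one feature from every pair correlated above the cutoff.
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
df_reduced = df.drop(columns=to_drop)

print(to_drop)                      # ['b']
print(df_reduced.columns.tolist())  # ['a', 'c']
```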
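
A sketch of Feature Ranking with a CART-style tree, using scikit-learn's DecisionTreeClassifier on the built-in iris data; keeping the top k = 2 features is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Fit a CART-style decision tree and read off its impurity-based importances.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
ranking = np.argsort(tree.feature_importances_)[::-1]  # most important first

k = 2  # arbitrary: keep the two highest-ranked features
X_reduced = X[:, ranking[:k]]

print(tree.feature_importances_)
print(ranking[:k], X_reduced.shape)
```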
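
One standard multicollinearity diagnostic is the Variance Inflation Factor; this sketch uses statsmodels' variance_inflation_factor, and the cutoff of 10 is a common rule of thumb rather than anything prescribed above.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
df = pd.DataFrame({
    "x1": x1,
    "x2": x2,
    "x3": x1 + x2 + rng.normal(scale=0.05, size=100),  # near-linear combination
})

# VIF of a feature: how much its variance is inflated by the other features.
vifs = pd.Series(
    [variance_inflation_factor(df.values, i) for i in range(df.shape[1])],
    index=df.columns,
)
print(vifs)

# A VIF above ~10 is a common heuristic signal to drop or combine the feature.
print(vifs[vifs > 10].index.tolist())
```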
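
A sketch of Recursive Feature Elimination with scikit-learn's RFE wrapped around a logistic-regression estimator; selecting two features is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# RFE refits the estimator, removes the weakest feature, and repeats
# until only n_features_to_select features remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=2)
rfe.fit(X, y)

print(rfe.support_)   # mask of surviving features
print(rfe.ranking_)   # 1 = selected; larger = eliminated earlier
```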
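
Factor Analysis, ICA, and PCA extract new variables rather than selecting a subset of the original ones; a combined sketch using scikit-learn's decomposition module, with two components as an arbitrary target dimensionality.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import FactorAnalysis, FastICA, PCA

X, _ = load_iris(return_X_y=True)

# Each transformer maps the four original features onto two derived variables:
# latent factors, independent components, or principal components respectively.
for Transform in (FactorAnalysis, FastICA, PCA):
    model = Transform(n_components=2, random_state=0)
    X_2d = model.fit_transform(X)
    print(Transform.__name__, X_2d.shape)
```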
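
For L1-based selection, a sketch with scikit-learn's SelectFromModel wrapping an L1-penalized logistic regression; the regularization strength C=0.05 is an assumption chosen to zero out many coefficients.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# The L1 penalty drives many coefficients exactly to zero; SelectFromModel
# keeps only the features whose coefficients remain non-zero.
l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
selector = SelectFromModel(l1_model).fit(X, y)

X_reduced = selector.transform(X)
print(X.shape, "->", X_reduced.shape)
```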
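
scikit-learn's NeighborhoodComponentsAnalysis covers the classification case; note that it learns a linear transformation of the features rather than picking a strict subset, so this sketch shows NCA as supervised dimensionality reduction feeding a k-NN classifier. The two-component setting is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)

# NCA learns the linear transformation that maximizes leave-one-out
# k-nearest-neighbor accuracy in the transformed space.
pipeline = make_pipeline(
    NeighborhoodComponentsAnalysis(n_components=2, random_state=0),
    KNeighborsClassifier(n_neighbors=3),
)
print(cross_val_score(pipeline, X, y, cv=5).mean())
```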
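
Finally, a minimal from-scratch sketch of the basic Relief scoring rule for binary labels; the synthetic data, L1 distance, and iteration count are all illustrative assumptions, and a library such as skrebate would be the more usual choice in practice.

```python
import numpy as np

def relief_scores(X, y, n_iter=100, seed=0):
    """Basic binary Relief: reward features that differ across classes
    (nearest miss) and agree within a class (nearest hit)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dist = np.abs(X - X[i]).sum(axis=1)  # L1 distance to every instance
        dist[i] = np.inf                     # exclude the instance itself
        same, diff = (y == y[i]), (y != y[i])
        hit = np.where(same)[0][np.argmin(dist[same])]   # nearest same-class
        miss = np.where(diff)[0][np.argmin(dist[diff])]  # nearest other-class
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)   # only feature 0 is informative
print(relief_scores(X, y))      # feature 0 should score highest
```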