Variance Thresholding
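Variance thresholding assumes that features with low variance carry little information, so it drops every feature whose variance does not exceed a chosen threshold. By default, the threshold is 0, which removes only zero-variance features, i.e. features that have the same value in every sample.
Here is a practical implementation in Python with the scikit-learn library.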
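To make the default behaviour concrete, here is a minimal sketch on a small made-up array. The toy values and the names `X_toy`, `selector`, and `X_reduced` are assumptions chosen for illustration. It computes the per-feature variance with NumPy and then applies `VarianceThreshold` with its default threshold, which should drop only the constant column.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Hypothetical toy data: the second column is constant (zero variance)
X_toy = np.array([
    [1.0, 7.0, 0.1],
    [2.0, 7.0, 0.2],
    [3.0, 7.0, 0.1],
    [4.0, 7.0, 0.2],
])

# Population variance of each column (ddof=0)
feature_variances = X_toy.var(axis=0)
print(feature_variances)

# Default threshold is 0.0, so only zero-variance features are removed
selector = VarianceThreshold()
X_reduced = selector.fit_transform(X_toy)
print(X_reduced.shape)
```

Note that the third column has a very small variance but is still kept, because the default only removes features that do not vary at all.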
#Import libraries
from sklearn import datasets
from sklearn.feature_selection import VarianceThreshold
#Load iris data
df = datasets.load_iris()
#Create features and target
X = df.data
y = df.target
#Create VarianceThreshold object with threshold of 0.6
threshold = VarianceThreshold(threshold=.6)
#Conduct variance thresholding
x_high_var = threshold.fit_transform(X)
#View the first 10 rows; only features with variance above the threshold remain
x_high_var[0:10]
#Output
array([[5.1, 1.4],
       [4.9, 1.4],
       [4.7, 1.3],
       [4.6, 1.5],
       [5. , 1.4],
       [5.4, 1.7],
       [4.6, 1.4],
       [5. , 1.5],
       [4.4, 1.4],
       [4.9, 1.5]])
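To see why only two of the four iris columns remain, one can inspect the fitted selector. The sketch below repeats the steps above and then reads the `variances_` attribute and calls the `get_support()` method of `VarianceThreshold`. The variable names `iris`, `selector`, `mask`, and `kept_features` are my own choices, not part of the original example.

```python
from sklearn import datasets
from sklearn.feature_selection import VarianceThreshold

# Load iris and fit the selector with the same threshold as above
iris = datasets.load_iris()
X = iris.data

selector = VarianceThreshold(threshold=0.6)
selector.fit(X)

# Per-feature variances computed during fit
for name, var in zip(iris.feature_names, selector.variances_):
    print(f"{name}: {var:.3f}")

# Boolean mask of retained features, mapped back to feature names
mask = selector.get_support()
kept_features = [name for name, keep in zip(iris.feature_names, mask) if keep]
print(kept_features)
```

With this threshold, the two retained columns are sepal length and petal length, which matches the two-column output shown above.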
For more details, see the official scikit-learn documentation.