The SelectPercentile class in scikit-learn's feature_selection module keeps a specified percentage of the highest-scoring features in a data set, where each feature is scored by a univariate statistical test such as chi-square. It is commonly used in machine learning pipelines to reduce feature dimensionality, which can improve model performance.
Example usage of SelectPercentile:
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectPercentile, chi2

# load iris dataset
X, y = load_iris(return_X_y=True)

# select top 20% features based on chi-square test
selector = SelectPercentile(score_func=chi2, percentile=20).fit(X, y)

# transform original feature matrix to selected feature matrix
X_selected = selector.transform(X)
The code above shows how SelectPercentile can be used to select the top 20% of features from the iris dataset based on the chi-square test. The resulting feature matrix X_selected contains only the selected features; because iris has only four features, the top 20% amounts to a single column.
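To see which features were kept and why, the fitted selector can be inspected. The following is a minimal sketch that repeats the fit from the example above and then reads the scores_ attribute and the get_support() method; the variable names mask and kept_indices are chosen here for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectPercentile, chi2

X, y = load_iris(return_X_y=True)
selector = SelectPercentile(score_func=chi2, percentile=20).fit(X, y)

# Boolean mask over the original columns: True where a feature was kept
mask = selector.get_support()
# Integer column indices of the kept features
kept_indices = selector.get_support(indices=True)

X_selected = selector.transform(X)

print("chi2 score per feature:", selector.scores_)
print("kept column indices:", kept_indices)
print("shape before:", X.shape, "shape after:", X_selected.shape)
```

Comparing the printed scores with the kept indices shows that the selector retains the columns with the highest chi-square scores.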
SelectPercentile is part of the scikit-learn package.
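Because SelectPercentile is a scikit-learn transformer, it can be chained with an estimator so that feature selection is refit on each training fold. The sketch below is one possible arrangement, not a recommended configuration: the choice of LogisticRegression, percentile=50, max_iter=1000, and 5-fold cross-validation are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)

# Step 1 keeps the top half of the features by chi-square score;
# step 2 fits a classifier on the reduced feature matrix.
pipeline = make_pipeline(
    SelectPercentile(score_func=chi2, percentile=50),
    LogisticRegression(max_iter=1000),
)

# Selection happens inside each fold, so held-out data never
# influences which features are chosen.
scores = cross_val_score(pipeline, X, y, cv=5)
print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())
```

Placing the selector inside the pipeline, rather than transforming X once up front, avoids leaking information from the validation folds into the feature scores.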
Python SelectPercentile.fit - 60 examples found. These are the top rated real world Python examples of sklearn.feature_selection.SelectPercentile.fit extracted from open source projects.