# COMMAND ----------

# Per-iteration objective (loss) values from a previously fitted model summary.
summary.objectiveHistory

# COMMAND ----------

# Decision tree classifier: list all params (with docs/defaults), then fit.
from pyspark.ml.classification import DecisionTreeClassifier
dt = DecisionTreeClassifier()
print(dt.explainParams())  # fixed: was a Python 2 print statement
dtModel = dt.fit(bInput)

# COMMAND ----------

# Random forest classifier: same pattern — inspect params, then fit.
from pyspark.ml.classification import RandomForestClassifier
rfClassifier = RandomForestClassifier()
print(rfClassifier.explainParams())
trainedModel = rfClassifier.fit(bInput)

# COMMAND ----------

# Gradient-boosted trees: inspect params, then fit (rebinds `trainedModel`).
from pyspark.ml.classification import GBTClassifier
gbtClassifier = GBTClassifier()
print(gbtClassifier.explainParams())
trainedModel = gbtClassifier.fit(bInput)

# COMMAND ----------

# Naive Bayes: fit only on rows where label != 0.
# NOTE(review): the filter drops label-0 rows — confirm this is the intent
# (NaiveBayes itself only requires non-negative labels/features).
from pyspark.ml.classification import NaiveBayes
nb = NaiveBayes()
print(nb.explainParams())
trainedModel = nb.fit(bInput.where("label != 0"))
# Do some checking on the new DataFrame, see if they look ok. df_train.select("V1","V2","Features","Class").show(10) df_test.select("V1","V2","Features","Class").show(10) # Get some stats on the datasets df_train.describe("V1","Class").show() df_test.describe("V1","Class").show() # ## Specify Random Forest model from pyspark.ml.classification import RandomForestClassifier rf = RandomForestClassifier(featuresCol="Features", labelCol="Class", numTrees=10) # Use the `explainParams` method to get a full list of parameters: print(rf.explainParams()) # ## Fit the Random Forest model # Use the `fit` method to fit the linear regression model on the train DataFrame: %time rf_model = rf.fit(df_train) # The result is an instance of the # [LogisticRegressionModel](http://spark.apache.org/docs/latest/api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegressionModel) # class: type(rf_model) # ## Evaluate model performance on the test dataset. # Use the `evaluate` method of the
# Feature-engineering stages plus the tuning setup for a random forest
# model that predicts `star_rating`.

from pyspark.ml.feature import OneHotEncoder, StringIndexer, VectorAssembler
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.tuning import ParamGridBuilder
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

# Map the string column `vehicle_color` to numeric category indices.
indexer = StringIndexer(inputCol="vehicle_color", outputCol="vehicle_color_indexed")

# One-hot encode the indexed color into dummy-variable form.
encoder = OneHotEncoder(inputCol="vehicle_color_indexed", outputCol="vehicle_color_encoded")

# Assemble the selected columns into a single feature vector.
features = ["reviewed", "vehicle_year", "vehicle_color_encoded", "CloudCover"]
assembler = VectorAssembler(inputCols=features, outputCol="features")

# The estimator (classification algorithm) to be tuned.
classifier = RandomForestClassifier(featuresCol="features", labelCol="star_rating")
print(classifier.explainParams())

# Candidate hyperparameter values.
maxDepthList = [5, 10, 20]
numTreesList = [20, 50, 100]
subsamplingRateList = [0.5, 1.0]

# Cross the candidate lists into the full search grid.
paramGrid = (
    ParamGridBuilder()
    .addGrid(classifier.maxDepth, maxDepthList)
    .addGrid(classifier.numTrees, numTreesList)
    .addGrid(classifier.subsamplingRate, subsamplingRateList)
    .build()
)

# Score candidate models by plain accuracy.
evaluator = MulticlassClassificationEvaluator(labelCol="star_rating", metricName="accuracy")
# COMMAND ---------- summary.objectiveHistory # COMMAND ---------- from pyspark.ml.classification import DecisionTreeClassifier dt = DecisionTreeClassifier() print(dt.explainParams()) dtModel = dt.fit(bInput) # COMMAND ---------- from pyspark.ml.classification import RandomForestClassifier rfClassifier = RandomForestClassifier() print(rfClassifier.explainParams()) trainedModel = rfClassifier.fit(bInput) # COMMAND ---------- from pyspark.ml.classification import GBTClassifier gbtClassifier = GBTClassifier() print(gbtClassifier.explainParams()) trainedModel = gbtClassifier.fit(bInput) # COMMAND ---------- from pyspark.ml.classification import NaiveBayes nb = NaiveBayes() print(nb.explainParams()) trainedModel = nb.fit(bInput.where("label != 0"))