Example #1
0
# Persist the best decision-tree results as Parquet, replacing any prior run.
# NOTE: dropped header=True — `header` is a CSV-only option and is silently
# ignored by the Parquet writer, so keeping it only misleads readers.
resultsBestDtDf.write.save('/mnt/data/resultsBestDtDf.parquet',
                           format='parquet',
                           mode="overwrite")

# COMMAND ----------

# COMMAND ----------

from pyspark.ml.regression import RandomForestRegressor

# Create a RandomForestRegressor
rf = RandomForestRegressor()

# Configure the forest: predictions go to "Prediction_cuisine", the label is
# the column named "6714", and inputs come from the assembled "features"
# vector. Seed fixed for reproducibility.
# FIX: the seed was written as 190088121L — the Python 2 long-literal suffix
# is a SyntaxError in Python 3, where pyspark runs; a plain int is correct.
rf.setPredictionCol("Prediction_cuisine")\
  .setLabelCol("6714")\
  .setFeaturesCol("features")\
  .setSeed(190088121)\
  .setMaxDepth(8)\
  .setNumTrees(25)

# Create a Pipeline
rfPipeline = Pipeline()

# Set the stages of the Pipeline: vectorize features, then fit the forest
rfPipeline.setStages([vectorizer, rf])

# Let's first train on the entire dataset to see what we get
rfModel = rfPipeline.fit(trainingSetDF)

# COMMAND ----------
Example #2
0
# Keep the 80/20 splits under their working names.
trainingSetDF = split80DF
testSetDF = split20DF

# Cache both sets so repeated passes over the data are fast.
for _df in (trainingSetDF, testSetDF):
    _df.cache()

# Random-forest regression

rf = RandomForestRegressor()

# For details on the available parameters: print(rf.explainParams())

rf.setPredictionCol('Predicted_PE') \
  .setNumTrees(20) \
  .setMaxDepth(5) \
  .setLabelCol('PE')

# Forest pipeline: vectorize the features, then fit the forest.

pipeline = Pipeline(stages=[vectorizer, rf])

# Train the model on the training split.

model = pipeline.fit(trainingSetDF)

# Podemos ver los detalles del árbol creado:

"""
    print("Nodos: " + str(model.stages[-1]._java_obj.parent().getNumTrees()))
    print("Profundidad: "+ str(model.stages[-1]._java_obj.parent().getMaxDepth()))