# getting only the useful information
    # Kafka delivers (key, value) pairs; keep only the raw message payload
    raw_messages = kafkaStream.map(lambda pair: pair[1].encode('utf-8'))

    # parse each payload into a dict (literal_eval, not eval — safe on
    # untrusted data), then keep only English tweets that actually carry
    # text; the text check guards against empty records
    tweets = (raw_messages
              .map(ast.literal_eval)
              .filter(lambda tweet: tweet.get('lang', '') == 'en')
              .filter(lambda tweet: tweet.get('text', '') != ''))

    # turn each tweet's text into tokens, then into a term-frequency
    # sparse vector suitable for the model
    features = (tweets
                .map(lambda tweet: tokenize(text=tweet.get('text', ''),
                                            common_words=common_words))
                .map(lambda tokens: compute_tf(tokens,
                                               reference_table=reference_table)))

    # score each feature vector with the streaming logistic regression,
    # then count how many predictions fall into each class per batch
    prediction_counts = (lr.predictOn(features)
                         .map(lambda label: (label, 1))
                         .reduceByKey(lambda a, b: a + b))

    # show the per-batch class counts in the console
    prediction_counts.pprint()

    # persist each batch of counts into HBase
    prediction_counts.foreachRDD(put_data_into_hbase)

    # launch the streaming context and block until it is stopped
    ssc.start()
    ssc.awaitTermination()