Example #1
    def main(self):

        # load configuration parameters (in a real project these would come from a config file)
        zk, topic, app_name, batch_duration, master = self.setConfiguration()

        # initialize the SparkContext and StreamingContext
        conf = (SparkConf().setMaster(master))
        sc = SparkContext(appName=app_name, conf=conf)
        ssc = StreamingContext(sc, batch_duration)

        # read data from Kafka (receiver-based stream)
        kvs = KafkaUtils.createStream(ssc, zk, "spark-streaming-consumer",
                                      {topic: 1})
        lines = kvs.map(lambda x: x[1])  # keep only the message value from each (key, message) pair

        lines.pprint()

        ssc.start()  # Start the computation
        ssc.awaitTermination()  # Wait for the computation to terminate
        sc.stop()  # release the SparkContext once streaming has terminated
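
The setConfiguration helper called at the top of main is not shown on this page; a minimal sketch of what it could return is given below. All concrete values are placeholders rather than settings from the original project, and the example assumes the usual imports (SparkConf and SparkContext from pyspark, StreamingContext from pyspark.streaming, KafkaUtils from pyspark.streaming.kafka).

    def setConfiguration(self):
        # placeholder values standing in for a real config file; none of these
        # come from the original project
        zk = "localhost:2181"             # ZooKeeper quorum used by the Kafka consumer
        topic = "test"                    # Kafka topic to subscribe to
        app_name = "kafka-streaming-app"  # Spark application name
        batch_duration = 10               # streaming batch interval in seconds
        master = "local[2]"               # at least 2 threads: one receiver, one for processing
        return zk, topic, app_name, batch_duration, master
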
Example #2

from pyspark import SparkContext, SparkConf
import operator

# read in a local file
sc = SparkContext(conf=SparkConf().setAppName('App').setMaster('local'))
raw_data = sc.textFile('/data/twitter/twitter_sample_small.txt')


# parse one edge per line: split on tab and return (user_id, follower_id) as ints
def parse_edge(s):
    user, follower = s.split('\t')
    return (int(user), int(follower))


# cache the intermediate rdd after parsing it
edges = raw_data.map(parse_edge).cache()

# apply aggregateByKey to count followers per user (see the walkthrough below the code):
# the zero value 0 starts each user's count, the per-partition function adds 1 for
# every follower record, and operator.add merges the per-partition counts
fol_agg = edges.aggregateByKey(0,
                               lambda acc, _: acc + 1,
                               operator.add)

# take the user (key) with the most followers;
# operator.itemgetter(1) makes top() compare the aggregated counts (values)
# rather than the user ids (keys)
top_user = fol_agg.top(1, key=operator.itemgetter(1))
print('%d %d' % (top_user[0][0], top_user[0][1]))
sc.stop()
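
As promised above, a short walkthrough of the aggregateByKey call: the first argument (0) is the zero value that starts every user's count, the second is the sequence function applied within each partition (it adds 1 for every follower record and ignores the follower id itself), and the third (operator.add) merges the partial counts coming from different partitions. The self-contained sketch below repeats the same pattern on made-up data; the numbers and names are illustrative only.

from pyspark import SparkContext, SparkConf
import operator

sc = SparkContext(conf=SparkConf().setAppName('AggDemo').setMaster('local'))

# toy edge list: (user, follower) pairs
edges = sc.parallelize([(1, 10), (1, 11), (1, 12), (2, 10), (3, 11), (3, 12)])

# count followers per user: start at 0, add 1 per record within a partition,
# then sum the per-partition counts
counts = edges.aggregateByKey(0, lambda acc, _: acc + 1, operator.add)
print(sorted(counts.collect()))                    # [(1, 3), (2, 1), (3, 2)]

# same top-1 lookup as in the example above
print(counts.top(1, key=operator.itemgetter(1)))   # [(1, 3)]

sc.stop()

Using aggregateByKey here lets the result's value type (an integer count) differ from the input's value type (a follower id), so no preliminary map of every edge to (user, 1) is needed before reducing.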