Python Context.union Examples

Programming language: Python

Namespace/Package Name: pysparkling

Class / Type: Context

Method / Function: union

Examples at hotexamples.com: 4

Python Context.union - 4 examples found. These are the top-rated real-world Python examples of pysparkling.Context.union, extracted from open source projects.

Frequently used methods

Context(30)

collect(23)

saveAsTextFile(10)

count(10)

parallelize(9)

map(6)

textFile(3)

mean(2)

foreach(2)

lookup(2)

startswith(2)

takeSample(2)

take(2)

union(2)

first(2)

filter(2)

toLocalIterator(1)

top(1)

pipe(1)

sum(1)

subtract(1)

zip(1)

cartesian(1)

sample(1)

rightOuterJoin(1)

reduceByKey(1)

reduce(1)

countByValue(1)

persist(1)

groupBy(1)

flatMap(1)

flatMapValues(1)

fold(1)

foldByKey(1)

foreachPartition(1)

getNumPartitions(1)

histogram(1)

countByKey(1)

intersection(1)

join(1)

keyBy(1)

leftOuterJoin(1)

cache(1)

mapPartitions(1)

max(1)

zipWithUniqueId(1)

Example #1

File: apartado2.py Project: SGDI-ucm/Practica1

import re
import sys

from pysparkling import Context

sc = Context()

# Count the words in the first file and keep those seen at least 20 times.
file1 = sys.argv[1]
lines = sc.textFile(file1)

rdd_part_1 = (lines.flatMap(lambda x: re.sub(r"[^\w]", " ", x).split()).map(
    lambda x: (x.lower(), 1)).reduceByKey(lambda x, y: x + y).filter(
        lambda x: x[1] >= 20).map(lambda x: (x[0], (x[1], file1))))

# Same pipeline for the second file.
file2 = sys.argv[2]
lines = sc.textFile(file2)

rdd_part_2 = (lines.flatMap(lambda x: re.sub(r"[^\w]", " ", x).split()).map(
    lambda x: (x.lower(), 1)).reduceByKey(lambda x, y: x + y).filter(
        lambda x: x[1] >= 20).map(lambda x: (x[0], (x[1], file2))))

# Same pipeline for the third file.
file3 = sys.argv[3]
lines = sc.textFile(file3)

rdd_part_3 = (lines.flatMap(lambda x: re.sub(r"[^\w]", " ", x).split()).map(
    lambda x: (x.lower(), 1)).reduceByKey(lambda x, y: x + y).filter(
        lambda x: x[1] >= 20).map(lambda x: (x[0], (x[1], file3))))

# Merge the three RDDs, group the (count, file) pairs by word, sort by word.
rdd_max = sc.union([rdd_part_1, rdd_part_2,
                    rdd_part_3]).groupByKey().sortByKey()

vals = rdd_max.collect()

for item in vals:
    print(item)
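
For readers who want to follow the data flow without a Spark-style context, the same computation can be approximated in plain Python. This is only a sketch on in-memory strings, not code from the project: the function names `frequent_words` and `merge_by_word`, the `min_count` parameter, and the sample data are all hypothetical.

```python
import re
from collections import Counter, defaultdict


def frequent_words(text, source, min_count=20):
    """Return (word, (count, source)) pairs for words seen at least min_count times."""
    # Same tokenization as the example: non-word characters become spaces.
    words = re.sub(r"[^\w]", " ", text).lower().split()
    counts = Counter(words)
    return [(w, (c, source)) for w, c in counts.items() if c >= min_count]


def merge_by_word(*pair_lists):
    """Concatenate the lists (the union step), then group by key and sort by key."""
    grouped = defaultdict(list)
    for pairs in pair_lists:
        for word, value in pairs:
            grouped[word].append(value)
    return sorted(grouped.items())


# Hypothetical sample data standing in for the three input files.
docs = {"a.txt": "spark spark rdd rdd", "b.txt": "Spark spark union"}
parts = [frequent_words(text, name, min_count=2) for name, text in docs.items()]
result = merge_by_word(*parts)
for item in result:
    print(item)
```

With this sample data, `spark` passes the threshold in both inputs, so its group holds one `(count, source)` pair per file, while `union` occurs only once and is filtered out before the merge.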

Example #2

File: test_context_unit.py Project: nicoheidtke/pysparkling

from pysparkling import Context


def test_union():
    sc = Context()
    rdd1 = sc.parallelize(["Hello"])
    rdd2 = sc.parallelize(["World"])
    # Context.union takes a list of RDDs and returns a single RDD.
    union = sc.union([rdd1, rdd2]).collect()
    assert len(union) == 2 and "Hello" in union and "World" in union

Example #3

File: test_rdd_unit.py Project: gitter-badger/pysparkling

from pysparkling import Context


def test_union():
    my_rdd = Context().parallelize([4, 9, 7, 3, 2, 5], 3)
    assert my_rdd.union(my_rdd).count() == 12
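
The assertion above expects 12 elements when a 6-element RDD is unioned with itself, which indicates that union concatenates its inputs rather than removing duplicates. A plain-Python sketch of that behaviour, using lists as stand-ins for RDDs (the helper `union_all` is hypothetical and is not part of pysparkling):

```python
from itertools import chain


def union_all(collections):
    """Concatenate several collections into one list, keeping duplicates."""
    return list(chain.from_iterable(collections))


data = [4, 9, 7, 3, 2, 5]
combined = union_all([data, data])
print(len(combined))       # every element appears twice
print(len(set(combined)))  # number of distinct values
```

If distinct values are wanted, deduplication has to be a separate step after the union, as the `set` call illustrates here.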

Example #4

File: test_context_unit.py Project: telamonian/pysparkling

from pysparkling import Context


def test_union():
    sc = Context()
    rdd1 = sc.parallelize(['Hello'])
    rdd2 = sc.parallelize(['World'])
    union = sc.union([rdd1, rdd2]).collect()
    assert len(union) == 2 and 'Hello' in union and 'World' in union