Example 1
    def to_spark(self):
        """Pass URL to spark to load as a DataFrame

        Note that this requires ``org.apache.spark.sql.avro.AvroFileFormat``
        to be installed in your spark classes.

        This feature is experimental.
        """
        from intake_spark.base import SparkHolder
        sh = SparkHolder(True,
                         [['read'], ['format', ["com.databricks.spark.avro"]],
                          ['load', [self._urlpath]]], {})
        return sh.setup()
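
The nested ``[name, args]`` list passed to SparkHolder encodes the chain of attribute accesses and method calls to replay on the Spark session when ``setup()`` runs. A minimal sketch of the equivalent direct PySpark chain, assuming a session named ``spark`` and a placeholder path:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Equivalent chain: .read, then .format(...), then .load(...).
# "data/records.avro" is a placeholder, not taken from the source.
sdf = (spark.read
            .format("com.databricks.spark.avro")
            .load("data/records.avro"))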
Example 2
    def to_spark(self):
        """Produce Spark DataFrame equivalent

        This will ignore all arguments except the urlpath, which will be
        interpreted directly by Spark. If you need to configure the storage,
        that must be done on the Spark side.

        This method requires intake-spark. See its documentation for how to
        set up a Spark session.
        """
        from intake_spark.base import SparkHolder
        args = [['read'], ['parquet', [self._urlpath]]]
        sh = SparkHolder(True, args, {})
        return sh.setup()
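
Here the encoded chain amounts to ``spark.read.parquet(urlpath)``. A minimal equivalent in plain PySpark, with a placeholder path:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder path; any Parquet dataset readable by Spark works here.
sdf = spark.read.parquet("data/table.parquet")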
Example 3
def test_cat():
    import pyspark

    # Build a Spark session through SparkHolder; the ('catalog', ) stage
    # points the holder at the session's catalog of tables.
    h = SparkHolder(True, [('catalog', )], {})
    h.setup()  # create spark session early
    session = h.session[0]

    # Register a pandas DataFrame (``df``, defined at module level in the
    # test file) as a temporary Spark SQL table.
    d = session.createDataFrame(df)
    sql = pyspark.HiveContext(session.sparkContext)
    sql.registerDataFrameAsTable(d, 'temp')

    # The intake catalog of Spark tables should now list the new table.
    cat = SparkTablesCatalog()
    assert 'temp' in list(cat)
    s = cat.temp()
    assert isinstance(s, SparkDataFrame)

    # Reading the source back should round-trip the original data.
    out = s.read()
    assert out.astype(df.dtypes).equals(df)
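
The test depends on names imported or defined at the top of its module (``df``, ``SparkHolder``, ``SparkTablesCatalog``, ``SparkDataFrame``). A hypothetical reconstruction of that module-level context; the import paths and fixture contents are assumptions, not taken from the source:

import pandas as pd

from intake_spark.base import SparkHolder
# Assumed module paths for the catalog and source classes:
from intake_spark.spark_cat import SparkTablesCatalog
from intake_spark.spark_sources import SparkDataFrame

# Hypothetical fixture the test round-trips through Spark.
df = pd.DataFrame({'a': [1, 2, 3], 'b': ['x', 'y', 'z']})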
Example 4
    def to_spark(self):
        from intake_spark.base import SparkHolder
        h = SparkHolder(False, [('textFile', (self._urlpath, ))], {})
        return h.setup()
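
With ``False`` as the first argument, SparkHolder works against the bare SparkContext rather than a SQL session, so the result is an RDD of text lines. The equivalent direct call, assuming a context named ``sc`` and a placeholder path:

from pyspark import SparkContext

sc = SparkContext.getOrCreate()

# Equivalent direct call; the glob is a placeholder path.
rdd = sc.textFile("data/*.txt")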
Example 5
    def to_spark(self):
        from intake_spark.base import SparkHolder
        h = SparkHolder(True, [('read', ), ('format', ("csv", )),
                               ('option', ("header", "true")),
                               ('load', (self.urlpath, ))], {})
        return h.setup()
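
This chain configures Spark's CSV reader with a header row before loading. A minimal direct equivalent, again with a placeholder path:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Equivalent chain with the header option set; placeholder path.
sdf = (spark.read
            .format("csv")
            .option("header", "true")
            .load("data/table.csv"))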