def _use_aggregation(self, agg, columns=None):
    """Compute the result using the aggregation function provided.

    Applies ``agg`` to each column and aliases the result back to the
    plain column name, stripping off the extra name Spark SQL adds
    (e.g. ``AVG(x)`` -> ``x``).

    :param agg: aggregation callable, e.g. ``pyspark.sql.functions.avg``.
    :param columns: columns to aggregate; defaults to all of this
        frame's columns (``self._columns``) when falsy.
    :return: a ``Dataframe`` built from the aggregated schema RDD.
    """
    if not columns:
        columns = self._columns
    # BUG FIX: previously this iterated self._columns unconditionally,
    # silently ignoring an explicitly supplied `columns` argument.
    aggs = [agg(column).alias(column) for column in columns]
    agg_rdd = self._grouped_spark_sql.agg(*aggs)
    return Dataframe.from_schema_rdd(agg_rdd, self._by)
def from_pd_data_frame(self, local_df):
    """Make a distributed dataframe from a local pandas DataFrame.

    The intended use is for testing. Note: dtypes are re-inferred, so
    they may not match the originals.
    """
    def frame_to_rows(frame):
        """Convert a pandas DataFrame into Spark SQL rows."""
        # TODO: Convert to row objects directly?
        return [record.tolist() for record in frame.to_records()]

    # Index columns come first in the Spark schema, followed by the
    # regular data columns.
    index_names = _normalize_index_names(list(local_df.index.names))
    schema = index_names + list(local_df.columns)

    row_rdd = self.spark_ctx.parallelize(frame_to_rows(local_df))
    # samplingRatio=1: inspect every row when inferring types, which
    # should be fine since the data originated from a local dataset.
    schema_rdd = self.sql_ctx.createDataFrame(
        row_rdd, schema=schema, samplingRatio=1)

    result = Dataframe.from_schema_rdd(schema_rdd)
    result._index_names = index_names
    return result