Python Babe.filterColumns Examples

Programming language: Python

Namespace/package name: pybabe

Class/type: Babe

Method/function: filterColumns

Code examples on hotexamples.com: 7

Python Babe.filterColumns: 7 code examples found. These are real-world Python examples of pybabe.Babe.filterColumns, extracted from open source projects and selected from the top-rated ones.

Frequently used methods

Babe(30)

push(30)

to_string(30)

pull(24)

typedetect(9)

mapTo(6)

head(6)

push_sql(4)

join(4)

primary_key_detect(4)

dedup(4)

partition(4)

push_mongo(3)

filterColumns(3)

maxN(3)

rename(2)

push_bigquery(2)

replace_in_string(2)

sort(2)

user_agent(2)

minN(2)

parse_time(2)

filter(2)

filter_values(2)

flatMap(2)

geoip_country_code(2)

groupBy(2)

group(1)

to_list(1)

tee(1)

get_config_with_env(1)

groupAll(1)

bulkMapTo(1)

group_all(1)

has_config(1)

keynormalize(1)

pull_kontagent(1)

pull_command(1)

mail(1)

merge_substreams(1)

windowMap(1)

Example #1

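 def test_filter2(self):
     # Babe and StringIO are assumed to be imported at module level (not shown).
     # Read a two-column CSV from an in-memory stream and detect column types.
     a = Babe().pull(stream=StringIO('a,b\n1,2\n3,4\n1,4\n'),
                     format="csv").typedetect()
     # Drop column 'a'; only column 'b' should remain.
     a = a.filterColumns(remove_fields=['a'])
     buf = StringIO()
     a.push(stream=buf, format="csv")
     self.assertEqual(buf.getvalue(), "b\n2\n4\n4\n")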
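
The filterColumns calls on this page take either remove_fields or keep_fields. As a rough model of what the remove_fields=['a'] test above expects, the sketch below does the same column drop using only the standard csv module. It is not PyBabe's implementation: filter_columns is a hypothetical helper, it is written for Python 3, and keeping the original column order when keep_fields is used is an assumption rather than documented PyBabe behaviour.

```python
import csv
import io


def filter_columns(text, keep_fields=None, remove_fields=None):
    """Return CSV text containing only the selected columns.

    Hypothetical helper for illustration; not part of PyBabe.
    """
    if (keep_fields is None) == (remove_fields is None):
        raise ValueError("pass exactly one of keep_fields or remove_fields")

    reader = csv.reader(io.StringIO(text))
    header = next(reader)

    # Work out which column positions survive the filter.
    if keep_fields is not None:
        wanted = set(keep_fields)
        indexes = [i for i, name in enumerate(header) if name in wanted]
    else:
        dropped = set(remove_fields)
        indexes = [i for i, name in enumerate(header) if name not in dropped]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([header[i] for i in indexes])
    for row in reader:
        writer.writerow([row[i] for i in indexes])
    return out.getvalue()


# Same input as test_filter2: dropping 'a' leaves only column 'b'.
result = filter_columns("a,b\n1,2\n3,4\n1,4\n", remove_fields=["a"])
```

Requiring exactly one of the two arguments keeps the sketch unambiguous; whether PyBabe accepts both at once is not shown in the examples here.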

Example #2

File: tests.py Project: nizox/PyBabe

 def test_twitter(self):
     a = Babe().pull_twitter()
     a = a.filterColumns(keep_fields=[
         "author_name", "author_id", "author_screen_name", "created_at",
         "hashtags", "text", "in_reply_to_status_id_str"
     ])
     a = a.typedetect()
     buf = StringIO()
     a.push(stream=buf, format='csv')

Example #3

 def test_twitter(self):
     a = Babe().pull_twitter()
     a = a.filterColumns(keep_fields=[
         "author_name", "author_id", "author_screen_name", "created_at",
         "hashtags", "text", "in_reply_to_status_id_str"
     ])
     a = a.typedetect()
     a.to_string()

Example #4

 def test_filter2(self):
     a = Babe().pull(stream=StringIO('a,b\n1,2\n3,4\n1,4\n'),
                     format="csv").typedetect()
     a = a.filterColumns(remove_fields=['a'])
     self.assertEqual(a.to_string(), "b\n2\n4\n4\n")

Example #5

File: test_transform.py Project: fdouetteau/PyBabe

 def test_filter2(self):
     a = Babe().pull(stream=StringIO('a,b\n1,2\n3,4\n1,4\n'),
                     format="csv").typedetect()
     a = a.filterColumns(remove_fields=['a'])
     self.assertEqual(a.to_string(), "b\n2\n4\n4\n")

Example #6

File: test_kontagent_to_bigquery.py Project: IsCoolEntertainment/PyBabe

    def test_gs_load_from_kontagent(self):
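        # Export one hour of events (start_time to start_time + 1 hour).
        # decode_data, v1 and uid_type_check are defined elsewhere in the
        # original test module and are not shown in this excerpt.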
        bucket = 'bertrandtest'
        game = 'wordox'
        day = '20151021'
        hour = '14'
        table_name = '{}_{}'.format(game, day)
        filename = '{}.csv'.format(table_name + hour)
        result = time.strptime(day + ' ' + hour, '%Y%m%d %H')
        start_time = datetime(result.tm_year,
                              result.tm_mon,
                              result.tm_mday,
                              result.tm_hour)
        end_time = start_time + timedelta(hours=1)

        a = Babe()
        a = a.pull_kontagent(start_time=start_time,
                             sample_mode=False,
                             end_time=end_time,
                             KT_APPID='869fb4a24faa4c61b702ea137cbe16ad',
                             discard_names=["PointSend"])
        a = a.mapTo(decode_data, insert_fields=["decoded_data"])
        a = a.filterColumns(keep_fields=v1)
        a = a.filter(lambda row: uid_type_check(row) is True)
        a.push(filename=filename,
               format='csv',
               delimiter='\t',
               quotechar='|',
               encoding='utf8',
               bucket=bucket,
               protocol='gs')

        a.push_bigquery(filename=filename,
                        bucket=bucket,
                        project_id='bigquery-testing-1098',
                        dataset_id='ladata',
                        table_name=table_name,
                        schema=[
                            {
                                "name": "date",
                                "type": "STRING",
                                "mode": "REQUIRED"
                            },
                            {
                                "name": "hour",
                                "type": "INTEGER",
                                "mode": "REQUIRED"
                            },
                            {
                                "name": "time",
                                "type": "TIMESTAMP",
                                "mode": "REQUIRED"
                            },
                            {
                                "name": "name",
                                "type": "STRING",
                                "mode": "REQUIRED"
                            },
                            {
                                "name": "uid",
                                "type": "INTEGER"
                            },
                            {
                                "name": "st1",
                                "type": "STRING"
                            },
                            {
                                "name": "st2",
                                "type": "STRING"
                            },
                            {
                                "name": "st3",
                                "type": "STRING"
                            },
                            {
                                "name": "channel_type",
                                "type": "STRING"
                            },
                            {
                                "name": "value",
                                "type": "INTEGER"
                            },
                            {
                                "name": "level",
                                "type": "INTEGER"
                            },
                            {
                                "name": "recipients",
                                "type": "STRING"
                            },
                            {
                                "name": "tracking_data",
                                "type": "STRING"
                            },
                            {
                                "name": "data",
                                "type": "STRING"
                            }
                        ],
                        job_id='{}_{}'.format(start_time, end_time),
                        num_retries=5)
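
The schema passed to push_bigquery above repeats the same dictionary shape for every column. One way to keep such a list short is to generate it from (name, type) pairs, as sketched below. bq_field and make_schema are hypothetical helpers, not part of PyBabe; the field names and types are copied from the example.

```python
def bq_field(name, field_type, required=False):
    """Build one schema entry in the dict shape used by the example."""
    field = {"name": name, "type": field_type}
    if required:
        field["mode"] = "REQUIRED"
    return field


# Columns marked REQUIRED in the example, in their original order.
REQUIRED_FIELDS = [
    ("date", "STRING"),
    ("hour", "INTEGER"),
    ("time", "TIMESTAMP"),
    ("name", "STRING"),
]

# Columns with no explicit mode in the example.
OPTIONAL_FIELDS = [
    ("uid", "INTEGER"),
    ("st1", "STRING"),
    ("st2", "STRING"),
    ("st3", "STRING"),
    ("channel_type", "STRING"),
    ("value", "INTEGER"),
    ("level", "INTEGER"),
    ("recipients", "STRING"),
    ("tracking_data", "STRING"),
    ("data", "STRING"),
]


def make_schema():
    """Return the full schema list: required columns first, then optional."""
    schema = [bq_field(n, t, required=True) for n, t in REQUIRED_FIELDS]
    schema += [bq_field(n, t) for n, t in OPTIONAL_FIELDS]
    return schema


schema = make_schema()
```

The resulting list could then be passed as the schema argument in place of the literal list, assuming push_bigquery only needs a plain list of dicts as the example suggests.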

Example #7

File: tests.py Project: nizox/PyBabe

 def test_filter2(self):
     a = Babe().pull(stream=StringIO('a,b\n1,2\n3,4\n1,4\n'),
                     format="csv").typedetect()
     a = a.filterColumns(remove_fields=['a'])
     buf = StringIO()
     a.push(stream=buf, format="csv")
     self.assertEqual(buf.getvalue(), "b\n2\n4\n4\n")