def reducer_init(self):
    """Load the precomputed term->IDF table from S3 into self.word_to_idf.

    Connects with the module-level AWS credentials, lists every part file
    under the fixed S3 prefix, and parses each line as a JSON object of the
    form {"term": ..., "idf": ...}.
    """
    emr = EMRJobRunner(aws_access_key_id=AWS_ACCESS_KEY,
                       aws_secret_access_key=AWS_SECRET_KEY)
    idf_parts = emr.get_s3_keys('s3://6885public/jeffchan/term-idfs/')
    self.word_to_idf = dict()
    for part in idf_parts:
        # BUG FIX: the original bound the downloaded contents to a local
        # named `json`, shadowing the json module and making the
        # json.loads() call below fail with AttributeError. Use a
        # non-conflicting name instead.
        contents = part.get_contents_as_string()
        # Download the whole object first, then iterate line-by-line:
        # raw S3 read chunks are not guaranteed to align with lines.
        for line in StringIO.StringIO(contents):
            pair = json.loads(line)
            self.word_to_idf[pair['term']] = pair['idf']
def reducer_init(self):
    """Populate self.idfs (term -> idf) from the S3 location the user gave.

    Uses explicit AWS credentials when both were supplied as options;
    otherwise lets EMRJobRunner discover credentials on its own.
    """
    self.idfs = {}
    access_key = self.options.aws_access_key_id
    secret_key = self.options.aws_secret_access_key
    if access_key and secret_key:
        emr = EMRJobRunner(aws_access_key_id=access_key,
                           aws_secret_access_key=secret_key)
    else:
        emr = EMRJobRunner()
    # Walk every key under the user-provided bucket/prefix.
    for key in emr.get_s3_keys("s3://" + self.options.idf_loc):
        # Fetch the full object before splitting into lines: streamed S3
        # chunks need not end on line boundaries.
        contents = key.get_contents_as_string()
        for record in StringIO(contents):
            # JSONValueProtocol.read returns (key, value); the value is
            # the decoded JSON object for this line.
            parsed = JSONValueProtocol.read(record)[1]
            self.idfs[parsed['term']] = parsed['idf']
def reducer_init(self):
    """Build the in-memory term -> idf lookup table from S3 part files."""
    self.idfs = {}
    # Forward credentials only when the user supplied both halves;
    # an empty kwargs dict lets EMRJobRunner use its own defaults.
    runner_kwargs = {}
    if self.options.aws_access_key_id and self.options.aws_secret_access_key:
        runner_kwargs = dict(
            aws_access_key_id=self.options.aws_access_key_id,
            aws_secret_access_key=self.options.aws_secret_access_key)
    emr = EMRJobRunner(**runner_kwargs)
    for key in emr.get_s3_keys("s3://" + self.options.idf_loc):
        # Read the entire object up front so iteration splits on real
        # line boundaries, which raw S3 chunks do not guarantee.
        for line in StringIO(key.get_contents_as_string()):
            # Each line decodes to (key, value); keep only the JSON value.
            _, term_idf = JSONValueProtocol.read(line)
            self.idfs[term_idf['term']] = term_idf['idf']