Python Table.to_batches Examples

Programming language: Python

Namespace/package name: pyarrow

Class/type: Table

Method/function: to_batches

Examples at hotexamples.com: 6

Python Table.to_batches - 6 examples found. These are the top-rated real-world Python examples of pyarrow.Table.to_batches, extracted from open source projects.

Example #1

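    from typing import List, Optional

    import pyarrow as pa

    # Method excerpt: the enclosing writer class (not shown) provides
    # pa_writer, writer_batch_size, _num_bytes and _num_examples.
    def write_table(self,
                    pa_table: pa.Table,
                    writer_batch_size: Optional[int] = None):
        """Write a Table to file.

        Args:
            pa_table: the Table to write.
            writer_batch_size: maximum number of rows per record batch;
                defaults to self.writer_batch_size.
        """
        if writer_batch_size is None:
            writer_batch_size = self.writer_batch_size
        if self.pa_writer is None:
            self._build_writer(inferred_schema=pa_table.schema)
        # Split the table into record batches of at most writer_batch_size rows.
        batches: List[pa.RecordBatch] = pa_table.to_batches(
            max_chunksize=writer_batch_size)
        self._num_bytes += sum(batch.nbytes for batch in batches)
        self._num_examples += pa_table.num_rows
        for batch in batches:
            self.pa_writer.write_batch(batch)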

Example #2

File: arrow_writer.py Project: aiinnova/huggingface_datasets

    from typing import List, Optional

    import pyarrow as pa

    def write_table(self, pa_table: pa.Table, writer_batch_size: Optional[int] = None):
        """Write a Table to file.

        Args:
            pa_table: the Table to write.
            writer_batch_size: maximum number of rows per record batch;
                defaults to self.writer_batch_size.
        """
        if writer_batch_size is None:
            writer_batch_size = self.writer_batch_size
        if self.pa_writer is None:
            self._build_writer(inferred_schema=pa_table.schema)
        # Reorder the arrays if necessary and cast to self._schema.
        # We can't simply use .cast here because we may need to change the order of the columns.
        pa_table = pa.Table.from_arrays([pa_table[name] for name in self._schema.names], schema=self._schema)
        batches: List[pa.RecordBatch] = pa_table.to_batches(max_chunksize=writer_batch_size)
        self._num_bytes += sum(batch.nbytes for batch in batches)
        self._num_examples += pa_table.num_rows
        for batch in batches:
            self.pa_writer.write_batch(batch)

Example #3

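    from typing import List, Optional

    import pyarrow as pa

    def _TableToRecordBatch(
            self,
            table: pa.Table,
            batch_size: Optional[int] = None) -> List[pa.RecordBatch]:
        # batch_size caps the rows per batch; None keeps the table's chunk layout.
        return table.to_batches(max_chunksize=batch_size)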

Example #4

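    import numpy as np
    import pyarrow as pa

    def __init__(self, table: pa.Table):
        self._schema = table.schema
        self._batches = table.to_batches()
        # Cumulative row offsets: batch i starts at row self._offsets[i].
        self._offsets = np.cumsum([0] + [len(b) for b in self._batches])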

Example #5

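    import numpy as np
    import pyarrow as pa

    def __init__(self, table: pa.Table):
        self._schema = table.schema
        self._batches = table.to_batches()
        # Cumulative row offsets: batch i starts at row self._offsets[i].
        # An explicit int64 dtype keeps the offsets type the same on every platform.
        self._offsets: np.ndarray = np.cumsum([0] +
                                              [len(b) for b in self._batches],
                                              dtype=np.int64)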

Example #6

    import pyarrow as pa

    def write_table(self, table: pa.Table):
        # Write the table one record batch at a time.
        for batch in table.to_batches():
            self.write(batch)