Python PandasDatasource.get_generator Examples

Programming language: Python

Namespace/package name: great_expectations.datasource

Class/type: PandasDatasource

Method/function: get_generator

Examples at hotexamples.com: 3

Python PandasDatasource.get_generator - 3 examples found. These are the top-rated real-world Python examples of great_expectations.datasource.PandasDatasource.get_generator, extracted from open source projects.

Frequently used methods

PandasDatasource(12)

get_batch(6)

build_configuration(3)

get_available_data_asset_names(3)

get_data_asset(3)

get_generator(3)

build_batch_kwargs(2)

_infer_default_options(1)

get_batch_kwargs_generator(1)

process_batch_parameters(1)

Example #1

File: test_pandas_datasource.py Project: nchrist2/great_expectations

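import os

from great_expectations.dataset import PandasDataset
from great_expectations.datasource import PandasDatasource
# PathBatchKwargs, BatchId, and BatchFingerprint are also imported at the top of
# the original test module; their import paths are not shown in this excerpt.


# test_folder_connection_path is a pytest fixture defined elsewhere in the test suite.
def test_standalone_pandas_datasource(test_folder_connection_path):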
    datasource = PandasDatasource('PandasCSV',
                                  base_directory=test_folder_connection_path)

    assert datasource.get_available_data_asset_names() == {"default": ["test"]}
    manual_batch_kwargs = PathBatchKwargs(
        path=os.path.join(str(test_folder_connection_path), "test.csv"))

    # Get the default (subdir_path) generator
    generator = datasource.get_generator()
    auto_batch_kwargs = generator.yield_batch_kwargs("test")

    assert manual_batch_kwargs["path"] == auto_batch_kwargs["path"]

    # Include some extra kwargs...
    # Note that we are using get_data_asset NOT get_batch here, since we are standalone (no batch concept)
    dataset = datasource.get_data_asset("test",
                                        generator_name="default",
                                        batch_kwargs=auto_batch_kwargs,
                                        sep=",",
                                        header=0,
                                        index_col=0)
    assert isinstance(dataset, PandasDataset)
    assert (dataset["col_1"] == [1, 2, 3, 4, 5]).all()

    # A datasource should always return an object with a typed batch_id
    assert isinstance(dataset.batch_kwargs, PathBatchKwargs)
    assert isinstance(dataset.batch_id, BatchId)
    assert isinstance(dataset.batch_fingerprint, BatchFingerprint)
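
The comparison between the manually built and the generated batch kwargs in the test above can be illustrated without great_expectations installed. The following is a minimal sketch, not the library's implementation: `SubdirPathGenerator` and `demo` are hypothetical names, and the sketch assumes each data asset is a single CSV file named after the asset, which may not match every release of the real generator.

```python
import os
import tempfile


class SubdirPathGenerator:
    """Hypothetical stand-in: maps an asset name to a CSV path under a base directory."""

    def __init__(self, base_directory):
        self.base_directory = str(base_directory)

    def get_available_data_asset_names(self):
        # Assumption: every *.csv file in the base directory is one data asset.
        return sorted(
            os.path.splitext(name)[0]
            for name in os.listdir(self.base_directory)
            if name.endswith(".csv")
        )

    def yield_batch_kwargs(self, asset_name):
        # Return plain-dict batch kwargs pointing at the asset's file.
        path = os.path.join(self.base_directory, asset_name + ".csv")
        if not os.path.isfile(path):
            raise KeyError("unknown data asset: " + asset_name)
        return {"path": path}


def demo():
    """Build a temporary folder with test.csv and compare manual and generated paths."""
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, "test.csv"), "w") as handle:
            handle.write("col_1\n1\n2\n")
        generator = SubdirPathGenerator(folder)
        manual = {"path": os.path.join(folder, "test.csv")}
        auto = generator.yield_batch_kwargs("test")
        names = generator.get_available_data_asset_names()
        return names, manual["path"] == auto["path"]
```

Calling `demo()` returns the discovered asset names together with a flag saying whether the manual and generated paths agree, which mirrors the path assertion in the test.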

Example #2

File: test_datasources.py Project: scarrucciu/great_expectations

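import os

from great_expectations.dataset import PandasDataset
from great_expectations.datasource import PandasDatasource


# test_folder_connection_path is a pytest fixture defined elsewhere in the test suite.
def test_standalone_pandas_datasource(test_folder_connection_path):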
    datasource = PandasDatasource('PandasCSV', base_directory=test_folder_connection_path)

    assert datasource.get_available_data_asset_names() == {"default": {"test"}}
    manual_batch_kwargs = datasource.build_batch_kwargs(os.path.join(str(test_folder_connection_path), "test.csv"))

    # Get the default (subdir_path) generator
    generator = datasource.get_generator()
    auto_batch_kwargs = generator.yield_batch_kwargs("test")

    assert manual_batch_kwargs["path"] == auto_batch_kwargs["path"]

    # Include some extra kwargs...
    dataset = datasource.get_batch("test", batch_kwargs=auto_batch_kwargs, sep=",", header=0, index_col=0)
    assert isinstance(dataset, PandasDataset)
    assert (dataset["col_1"] == [1, 2, 3, 4, 5]).all()
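
All of these tests depend on a `test_folder_connection_path` fixture that the excerpts do not show. The sketch below is a guess at what such a folder might contain, inferred only from the assertions (an asset named `test`, an index column, and `col_1` equal to 1 through 5). `make_test_folder` and `read_col_1` are hypothetical helpers, and the real fixture may write additional columns.

```python
import csv
import os
import tempfile


def make_test_folder(root):
    """Write test.csv with an index column and col_1 = 1..5; return the folder path."""
    path = os.path.join(root, "test.csv")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Blank first header cell: the tests read the file with index_col=0.
        writer.writerow(["", "col_1"])
        for index, value in enumerate([1, 2, 3, 4, 5]):
            writer.writerow([index, value])
    return root


def read_col_1(folder):
    """Read col_1 back as integers, skipping the header row."""
    with open(os.path.join(folder, "test.csv"), newline="") as handle:
        rows = list(csv.reader(handle))
    return [int(row[1]) for row in rows[1:]]
```

In a pytest suite, a function like `make_test_folder` would typically be wrapped in a fixture that receives a temporary directory and returns its path.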

Example #3

File: test_pandas_datasource.py Project: rlshuhart/great_expectations

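import os

from great_expectations.datasource import PandasDatasource
# PathBatchKwargs, Batch, and BatchMarkers are also imported at the top of
# the original test module; their import paths are not shown in this excerpt.


# test_folder_connection_path is a pytest fixture defined elsewhere in the test suite.
def test_standalone_pandas_datasource(test_folder_connection_path):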
    datasource = PandasDatasource('PandasCSV',
                                  generators={
                                      "subdir_reader": {
                                          "class_name":
                                          "SubdirReaderBatchKwargsGenerator",
                                          "base_directory":
                                          test_folder_connection_path
                                      }
                                  })

    assert datasource.get_available_data_asset_names() == {
        'subdir_reader': {
            'names': [('test', 'file')],
            'is_complete_list': True
        }
    }
    manual_batch_kwargs = PathBatchKwargs(
        path=os.path.join(str(test_folder_connection_path), "test.csv"))

    generator = datasource.get_generator("subdir_reader")
    auto_batch_kwargs = generator.yield_batch_kwargs("test")

    assert manual_batch_kwargs["path"] == auto_batch_kwargs["path"]

    # Include some extra kwargs...
    auto_batch_kwargs.update(
        {"reader_options": {
            'sep': ",",
            'header': 0,
            'index_col': 0
        }})
    batch = datasource.get_batch(batch_kwargs=auto_batch_kwargs)
    assert isinstance(batch, Batch)
    dataset = batch.data
    assert (dataset["col_1"] == [1, 2, 3, 4, 5]).all()
    assert len(dataset) == 5

    # A datasource should always return an object with a typed batch_id
    assert isinstance(batch.batch_kwargs, PathBatchKwargs)
    assert isinstance(batch.batch_markers, BatchMarkers)
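
In this last example the reader settings travel inside the batch kwargs under a `reader_options` key instead of being passed as keyword arguments. Calling `update` with a new `reader_options` dictionary replaces any options already stored under that key. The sketch below shows one way to merge instead, using plain dictionaries; `with_reader_options` is a hypothetical helper, not part of great_expectations, and the path and `encoding` entry are made-up sample values.

```python
import copy


def with_reader_options(batch_kwargs, **options):
    """Return a copy of batch_kwargs with options merged into 'reader_options'."""
    merged = copy.deepcopy(dict(batch_kwargs))
    # Keep existing reader options and add or override the given ones.
    merged.setdefault("reader_options", {}).update(options)
    return merged


# Hypothetical batch kwargs, standing in for what a generator might yield.
original = {"path": "data/test.csv", "reader_options": {"encoding": "utf-8"}}
updated = with_reader_options(original, sep=",", header=0, index_col=0)
```

Because the helper works on a deep copy, `original` is left untouched, so the same generated batch kwargs can be reused with different reader settings.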