Python LuxDataFrame.cardinality Examples

Programming language: Python

Namespace/package name: lux.core.frame

Class/type: LuxDataFrame

Method/function: cardinality

Examples on hotexamples.com: 2

Python LuxDataFrame.cardinality - 2 examples found. These are the top-rated real-world Python examples of lux.core.frame.LuxDataFrame.cardinality extracted from open source projects.

Example #1

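    def compute_stats(self, ldf: LuxDataFrame):
        # Precompute per-column statistics.
        # Excerpt: assumes module-level `import pandas as pd` and
        # `from lux.core.frame import LuxDataFrame`.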
        ldf.unique_values = {}
        ldf._min_max = {}
        ldf.cardinality = {}

        for attribute in ldf.columns:

            if (isinstance(attribute, pd._libs.tslibs.timestamps.Timestamp)):
                # If timestamp, make the dictionary keys the _repr_ (e.g., TimeStamp('2020-04-05 00.000')--> '2020-04-05')
                attribute_repr = str(attribute._date_repr)
            else:
                attribute_repr = attribute
            if ldf.dtypes[attribute] != "float64":
                ldf.unique_values[attribute_repr] = list(
                    ldf[attribute].unique())
                # Look up by attribute_repr, the key written on the line above
                ldf.cardinality[attribute_repr] = len(
                    ldf.unique_values[attribute_repr])
            else:
                # Sentinel cardinality for float64 attributes, whose unique
                # values are not collected
                ldf.cardinality[attribute_repr] = 999
            if ldf.dtypes[attribute] == "float64" or ldf.dtypes[
                    attribute] == "int64":
                ldf._min_max[attribute_repr] = (ldf[attribute].min(),
                                                ldf[attribute].max())
        if (ldf.index.dtype != 'int64'):
            index_column_name = ldf.index.name
            ldf.unique_values[index_column_name] = list(ldf.index)
            ldf.cardinality[index_column_name] = len(ldf.index)
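
In the excerpt above, `cardinality` is a dict attribute mapping each column to its number of distinct values. A minimal sketch of the same bookkeeping on a plain pandas DataFrame follows. The function name `column_stats` and the sample data are hypothetical, and the code returns ordinary dicts rather than setting attributes on a LuxDataFrame, so it illustrates the idea and is not Lux's own implementation.

```python
import pandas as pd


def column_stats(df: pd.DataFrame):
    """Hypothetical helper: return (unique_values, cardinality, min_max) dicts."""
    unique_values, cardinality, min_max = {}, {}, {}
    for column in df.columns:
        # Distinct values of the column, in order of first appearance
        unique_values[column] = list(df[column].unique())
        # Cardinality is the number of distinct values
        cardinality[column] = len(unique_values[column])
        # Min/max are recorded only for integer and float columns
        dtype = df[column].dtype
        if pd.api.types.is_float_dtype(dtype) or pd.api.types.is_integer_dtype(dtype):
            min_max[column] = (df[column].min(), df[column].max())
    return unique_values, cardinality, min_max


# Made-up sample data for illustration
df = pd.DataFrame({
    "origin": ["USA", "Japan", "USA"],
    "cylinders": [4, 6, 4],
    "mpg": [18.0, 24.5, 30.1],
})
unique_values, cardinality, min_max = column_stats(df)
print(cardinality)  # {'origin': 2, 'cylinders': 2, 'mpg': 3}
```

Unlike Example #1, this sketch counts distinct values for float columns too instead of storing the 999 sentinel.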

Example #2

File: PandasExecutor.py Project: whmz/lux

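    def compute_stats(self, ldf: LuxDataFrame):
        # Precompute per-column statistics.
        # Excerpt: assumes module-level `import pandas as pd` and
        # `from lux.core.frame import LuxDataFrame`.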
        ldf.unique_values = {}
        ldf._min_max = {}
        ldf.cardinality = {}

        for attribute in ldf.columns:

            if isinstance(attribute, pd._libs.tslibs.timestamps.Timestamp):
                # If timestamp, make the dictionary keys the _repr_ (e.g., TimeStamp('2020-04-05 00.000')--> '2020-04-05')
                attribute_repr = str(attribute._date_repr)
            else:
                attribute_repr = attribute

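            # Select the column by its original label; attribute_repr is only
            # the dictionary key and differs from the label for Timestamps
            ldf.unique_values[attribute_repr] = list(
                ldf[attribute].unique())
            ldf.cardinality[attribute_repr] = len(
                ldf.unique_values[attribute_repr])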

            if pd.api.types.is_float_dtype(
                    ldf.dtypes[attribute]) or pd.api.types.is_integer_dtype(
                        ldf.dtypes[attribute]):
                ldf._min_max[attribute_repr] = (
                    ldf[attribute].min(),
                    ldf[attribute].max(),
                )

        if not pd.api.types.is_integer_dtype(ldf.index):
            index_column_name = ldf.index.name
            ldf.unique_values[index_column_name] = list(ldf.index)
            ldf.cardinality[index_column_name] = len(ldf.index)
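
Two details of the excerpt above are easy to miss: Timestamp column labels are stored under a date string key, and a non-integer index gets its own cardinality entry under the index name. The sketch below shows both on a plain pandas DataFrame with made-up data. It uses the public `Timestamp.strftime` in place of the private `_date_repr` attribute, and `Series.nunique(dropna=False)` in place of `len(list(... .unique()))`. Both substitutions are assumptions made for illustration and are not taken from the Lux source.

```python
import pandas as pd

# Made-up frame: Timestamp column labels and a named, non-integer index
df = pd.DataFrame(
    {
        pd.Timestamp("2020-04-05"): [1, 2, 2],
        pd.Timestamp("2020-04-06"): [3, 3, 3],
    },
    index=pd.Index(["a", "b", "c"], name="country"),
)

cardinality = {}
for label in df.columns:
    # Timestamp labels become 'YYYY-MM-DD' string keys; other labels are kept
    if isinstance(label, pd.Timestamp):
        key = label.strftime("%Y-%m-%d")
    else:
        key = label
    # Select the column by its original label, store under the string key
    cardinality[key] = df[label].nunique(dropna=False)

# A non-integer index is treated like one more attribute
if not pd.api.types.is_integer_dtype(df.index):
    cardinality[df.index.name] = len(df.index)

print(cardinality)  # {'2020-04-05': 2, '2020-04-06': 1, 'country': 3}
```

As in the excerpt, the index entry uses `len(df.index)`, so it equals the number of distinct index values only when the index has no duplicates.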