Python _estimate_segments Examples

Programming language: Python

Namespace/package name: whylogs.features.autosegmentation

Method/function: _estimate_segments

Examples at hotexamples.com: 3

Python _estimate_segments - 3 examples found. These are the top-rated real-world Python examples of whylogs.features.autosegmentation._estimate_segments, extracted from open source projects.

Example #1

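import pandas as pd
from whylogs.features.autosegmentation import _estimate_segments


def test_estimate_segments():
    df = pd.DataFrame({"target": ["hat", "jug", "hat"], "confidence": [1.2, 3.4, 4.5], "sentiment": ["happy", "sad", "sad"]})

    # With a limit of 4 segments, only "target" is selected.
    res = _estimate_segments(df, target_field="confidence", max_segments=4)
    assert res == ["target"]

    # Lowering the limit to 3 gives the same result.
    res = _estimate_segments(df, target_field="confidence", max_segments=3)
    assert res == ["target"]

    # With a limit of 1, no segmentation feature is returned.
    res = _estimate_segments(df, target_field="confidence", max_segments=1)
    assert res == []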

Example #2

File: session.py Project: valer-whylabs/whylogs

    def estimate_segments(
        self,
        df: pd.DataFrame,
        name: str,
        target_field: Optional[str] = None,
        max_segments: int = 30,
        dry_run: bool = False,
    ) -> Optional[Union[List[Dict], List[str]]]:
        """
        Estimates the most important features and values on which to segment
        data profiling using entropy-based methods.

        :param df: the dataframe of data to profile
        :param name: name for discovery in the logger, automatically applied
        to loggers with same dataset_name
        :param target_field: target field (optional)
        :param max_segments: upper threshold for total combinations of segments,
        default 30
        :param dry_run: run calculation but do not write results to metadata
        :return: a list of segmentation feature names
        """
        segments = _estimate_segments(df=df, target_field=target_field, max_segments=max_segments)

        if not dry_run:
            self.metadata_writer.autosegmentation_write(name, segments)

        return segments
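
The docstring above mentions "entropy-based methods" and describes max_segments as an upper threshold on the total combinations of segments. The listing does not show how whylogs implements this, so the following is only a minimal sketch of one plausible approach: rank columns by information gain against the target, then greedily keep columns while the number of distinct value combinations stays within the limit. The names entropy, information_gain and pick_segments, and the sample rows, are hypothetical and not part of whylogs; plain dicts are used instead of a DataFrame to keep the sketch dependency-free. It is not expected to reproduce the outputs of _estimate_segments.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of values."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows: List[Dict], feature: str, target: str) -> float:
    """Reduction in target entropy after splitting rows by feature."""
    base = entropy([r[target] for r in rows])
    groups: Dict[Hashable, list] = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def pick_segments(rows: List[Dict], target: str, max_segments: int) -> List[str]:
    """Hypothetical greedy selection: highest-gain columns first, kept only
    while the distinct value combinations do not exceed max_segments."""
    if not rows:
        return []
    gains = {c: information_gain(rows, c, target) for c in rows[0] if c != target}
    # Columns that carry no information about the target are ignored.
    ranked = sorted((c for c, g in gains.items() if g > 1e-12),
                    key=lambda c: gains[c], reverse=True)
    chosen: List[str] = []
    for col in ranked:
        trial = chosen + [col]
        combos = {tuple(r[c] for c in trial) for r in rows}
        if len(combos) <= max_segments:
            chosen = trial
    return chosen


# Made-up sample data: "color" fully determines "label", "size" does not.
rows = [
    {"color": "red", "size": "S", "label": "yes"},
    {"color": "red", "size": "L", "label": "yes"},
    {"color": "blue", "size": "S", "label": "no"},
    {"color": "blue", "size": "L", "label": "no"},
]
print(pick_segments(rows, target="label", max_segments=4))
print(pick_segments(rows, target="label", max_segments=1))
```

In this sketch a limit smaller than the number of distinct values of the best column leaves nothing to select, and an empty input returns an empty list, which mirrors the shape of the behavior exercised by the tests on this page without claiming to match the library's internals.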

Example #3

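import pandas as pd
from whylogs.features.autosegmentation import _estimate_segments


def test_estimate_segments_empty():
    # An empty dataframe yields no segmentation features.
    df = pd.DataFrame({"target": [], "confidence": [], "sentiment": []})
    res = _estimate_segments(df, target_field="confidence", max_segments=4)
    assert res == []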