Python BaggingClassifier.fit_sync examples

Programming language: Python

Namespace/package name: sklearn.ensemble

Class/type: BaggingClassifier

Method/function: fit_sync

Examples on hotexamples.com: 1

Python BaggingClassifier.fit_sync - 1 example found. This is a top-rated real-world Python example of sklearn.ensemble.BaggingClassifier.fit_sync, extracted from an open source project.


Example #1


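# Imports inferred from the names used below. The modeldb module paths are
# an assumption (ModelDB scikit-learn client); adjust them to your install.
# DATA_PATH is not defined in this excerpt and must be set by the caller.
import pandas as pd
from sklearn import cross_validation
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.preprocessing import LabelEncoder
from modeldb.sklearn_native.ModelDbSyncer import *
from modeldb.sklearn_native import SyncableMetrics


def run_otto_workflow():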
    name = "test1"
    author = "author"
    description = "kaggle-otto-script"
    # Creating a new project
    syncer_obj = Syncer(NewOrExistingProject(name, author, description),
                        NewOrExistingExperiment("expName", "expDesc"),
                        NewExperimentRun("otto test"))

    # Import Data
    # Note: This dataset is not included in the repo because of Kaggle
    # restrictions.
    # It can be downloaded from
    # https://www.kaggle.com/c/otto-group-product-classification-challenge/data
    X = pd.read_csv_sync(DATA_PATH + 'otto-train.csv')
    syncer_obj.add_tag(X, "original otto csv data")
    X = X.drop_sync('id', axis=1)

    syncer_obj.add_tag(X, "dropped id column")
    # Extract the target column and encode its class labels as integers
    # so that the classifiers can handle them.
    y = X.target.values

    y = LabelEncoder().fit_transform_sync(y)

    # Remove target from train, else it's too easy ...
    X = X.drop_sync('target', axis=1)

    syncer_obj.add_tag(X, "data with dropped id and target columns")

    # Split Train / Test
    x_train, x_test, y_train, y_test = cross_validation.train_test_split_sync(
        X, y, test_size=0.20, random_state=36)

    syncer_obj.add_tag(x_test, "testing data")
    syncer_obj.add_tag(x_train, "training data")
    # First, train and apply a Random Forest WITHOUT calibration.
    # A BaggingClassifier averages the predictions of 5 models, because
    # that is what CalibratedClassifierCV does behind the scenes. This keeps
    # the comparison fair: averaging several models cannot be what explains
    # a performance difference between no calibration and calibration.

    clf = RandomForestClassifier(n_estimators=50, n_jobs=-1)

    clfbag = BaggingClassifier(clf, n_estimators=5)
    clfbag.fit_sync(x_train, y_train)

    y_preds = clfbag.predict_proba_sync(x_test)

    SyncableMetrics.compute_metrics(clfbag,
                                    log_loss,
                                    y_test,
                                    y_preds,
                                    x_test,
                                    "",
                                    "",
                                    eps=1e-15,
                                    normalize=True)
    # print("loss WITHOUT calibration : ", log_loss(
    #     y_test, y_preds, eps=1e-15, normalize=True))

    # Now, train and apply a Random Forest WITH calibration.
    # In our case, 'isotonic' worked better than the default 'sigmoid'.
    # This is not always so; depending on the data, you have to test both
    # options.

    clf = RandomForestClassifier(n_estimators=50, n_jobs=-1)
    calibrated_clf = CalibratedClassifierCV(clf, method='isotonic', cv=5)
    calibrated_clf.fit_sync(x_train, y_train)
    y_preds = calibrated_clf.predict_proba_sync(x_test)
    SyncableMetrics.compute_metrics(calibrated_clf,
                                    log_loss,
                                    y_test,
                                    y_preds,
                                    x_test,
                                    "",
                                    "",
                                    eps=1e-15,
                                    normalize=True)

    # print("loss WITH calibration : ", log_loss(
    #     y_test, y_preds, eps=1e-15, normalize=True))

    print(" ")
    print("Conclusion : in our case, calibration improved "
          "performance a lot ! (reduced loss)")
    syncer_obj.sync()
    return syncer_obj, x_train, x_test
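
Both models in the example above are compared on log loss, called with eps=1e-15 and normalize=True. The following is a minimal pure-Python sketch of what that metric measures, assuming integer class labels and one row of class probabilities per sample. The function name multiclass_log_loss and the toy data are hypothetical, and this is an illustration of the idea rather than scikit-learn's own implementation.

```python
import math


def multiclass_log_loss(y_true, y_prob, eps=1e-15, normalize=True):
    """Illustrative log loss for integer labels and per-class probability rows."""
    total = 0.0
    for label, row in zip(y_true, y_prob):
        # Clip each probability into [eps, 1 - eps] so log(0) never occurs,
        # then renormalize the row so it sums to 1 again.
        clipped = [min(max(p, eps), 1.0 - eps) for p in row]
        row_sum = sum(clipped)
        # Only the probability assigned to the true class contributes.
        total -= math.log(clipped[label] / row_sum)
    # normalize=True returns the mean per-sample loss, otherwise the sum.
    return total / len(y_true) if normalize else total


# Hypothetical toy data: three samples, three classes.
labels = [0, 1, 2]
confident = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
hedged = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]]

loss_confident = multiclass_log_loss(labels, confident)
loss_hedged = multiclass_log_loss(labels, hedged)
print("confident:", loss_confident)
print("hedged:", loss_hedged)
```

A lower value means the true class received more probability mass, which is why a better calibrated model can score a lower loss even when its predicted classes are unchanged.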