import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from IPython.display import display

from aequitas.group import Group
from aequitas.bias import Bias
from aequitas.fairness import Fairness
from aequitas.plotting import Plot


def run_aequitas(predictions_data_path):
    '''
    Check the false negative rate, i.e. the chance of certain groups missing
    out on assistance, using the aequitas toolkit.
    The function transforms the data to make it aequitas compliant and checks
    a series of bias and fairness metrics.

    Input: path to the model predictions for the selected model
           (unzip the selected file to run)
    Output: plots saved in the charts folder
    '''
    best_model_pred = pd.read_csv(predictions_data_path)

    # Transform data for aequitas module compliance
    aqc = ['Other', 'White', 'African American', 'Asian', 'Hispanic',
           'American Indian']
    aqcol = ['White alone_scale', 'Black/AfAmer alone_scale',
             'AmInd/Alaskn alone_scale', 'Asian alone_scale', 'HI alone_scale',
             'Some other race alone_scale', 'Hispanic or Latino_scale']
    display(aqcol)
    aqcol_label = ['no_renew_nextpd', 'pred_class_10%',
                   'Median household income (1999 dollars)_scale'] + aqcol
    aqus = best_model_pred[aqcol_label].copy()
    print('Creating classes for racial and income distribution', '\n')

    # Convert label and score to binary 0/1
    bin_var = ['no_renew_nextpd', 'pred_class_10%']
    for var in bin_var:
        aqus[var] = np.where(aqus[var] == True, 1, 0)

    # Rename to the column names aequitas expects
    aqus.rename(columns={'no_renew_nextpd': 'label_value',
                         'pred_class_10%': 'score'}, inplace=True)

    print('Define majority rule based on the relative proportion of each class', '\n')
    aqus['race'] = aqus[aqcol].idxmax(axis=1)

    # Use quantile income distribution
    # (note: pd.qcut assigns labels from the lowest to the highest bin,
    #  so 'rich' here labels the lowest tercile of the scaled income values)
    aqus['income'] = pd.qcut(
        aqus['Median household income (1999 dollars)_scale'], 3,
        labels=["rich", "median", "poor"])

    # Final form
    aqus.drop(aqcol, axis=1, inplace=True)
    aqus.drop(['Median household income (1999 dollars)_scale'], axis=1, inplace=True)
    aq = aqus.reset_index()
    aq.rename(columns={'index': 'entity_id'}, inplace=True)
    aq['race'] = aq['race'].replace({
        'Some other race alone_scale': 'Other',
        'White alone_scale': 'White',
        'Black/AfAmer alone_scale': 'African American',
        'Asian alone_scale': 'Asian',
        'HI alone_scale': 'Hispanic',
        'AmInd/Alaskn alone_scale': 'American Indian'
    })

    # Consolidate types
    aq['income'] = aq['income'].astype(object)
    aq['entity_id'] = aq['entity_id'].astype(object)
    aq['score'] = aq['score'].astype(object)
    aq['label_value'] = aq['label_value'].astype(object)

    # Distribution of categories
    plt.figure()
    aq_palette = sns.diverging_palette(225, 35, n=2)
    by_race = sns.countplot(x="race", data=aq[aq.race.isin(aqc)])
    by_race.set_xticklabels(by_race.get_xticklabels(), rotation=40, ha="right")
    plt.savefig('charts/Racial distribution in data.png')

    # Primary distribution against score: race
    plt.figure()
    aq_palette = sns.diverging_palette(225, 35, n=2)
    by_race = sns.countplot(x="race", hue="score",
                            data=aq[aq.race.isin(aqc)], palette=aq_palette)
    by_race.set_xticklabels(by_race.get_xticklabels(), rotation=40, ha="right")
    plt.savefig('charts/race_score.png')

    # Primary distribution against score: income
    plt.figure()
    by_inc = sns.countplot(x="income", hue="score", data=aq, palette=aq_palette)
    plt.savefig('charts/income_score.png')

    # Group crosstabs
    g = Group()
    xtab, _ = g.get_crosstabs(aq)

    # False negative rates
    aqp = Plot()
    fnr = aqp.plot_group_metric(xtab, 'fnr', min_group_size=0.05)
    p = aqp.plot_group_metric_all(xtab, metrics=['ppr', 'pprev', 'fnr', 'fpr'],
                                  ncols=4)
    p.savefig('charts/eth_metrics.png')

    # Bias with respect to the white, rich reference category
    b = Bias()
    bdf = b.get_disparity_predefined_groups(
        xtab, original_df=aq,
        ref_groups_dict={'race': 'White', 'income': 'rich'},
        alpha=0.05, mask_significance=True)
    bdf.style
    calculated_disparities = b.list_disparities(bdf)
    disparity_significance = b.list_significance(bdf)
    aqp.plot_disparity(bdf, group_metric='fpr_disparity', attribute_name='race',
                       significance_alpha=0.05)
    plt.savefig('charts/disparity.png')

    # Fairness
    hbdf = b.get_disparity_predefined_groups(
        xtab, original_df=aq,
        ref_groups_dict={'race': 'African American', 'income': 'poor'},
        alpha=0.05, mask_significance=False)
    majority_bdf = b.get_disparity_major_group(xtab, original_df=aq,
                                               mask_significance=True)
    min_metric_bdf = b.get_disparity_min_metric(df=xtab, original_df=aq)
    f = Fairness()
    fdf = f.get_group_value_fairness(bdf)
    parity_determinations = f.list_parities(fdf)
    gaf = f.get_group_attribute_fairness(fdf)
    gof = f.get_overall_fairness(fdf)
    z = aqp.plot_fairness_group(fdf, group_metric='ppr')
    plt.savefig('charts/fairness_overall.png')

    # Check the false omission rate and false negative rate across groups
    fg = aqp.plot_fairness_group_all(fdf, metrics=['for', 'fnr'], ncols=2)
    fg.savefig('charts/fairness_metrics.png')
    return None
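
# A minimal, self-contained sketch (hypothetical values, not project data) of the "majority
# rule" used in run_aequitas: each row's race class is the scaled race-share column with the
# largest value (idxmax), which is then mapped to a readable label. Column names mirror the
# ones used above; the helper below is illustrative only and not part of the pipeline.
def _race_majority_rule_demo():
    demo = pd.DataFrame({
        'White alone_scale':        [0.7, 0.2, 0.1],
        'Black/AfAmer alone_scale': [0.1, 0.6, 0.2],
        'Asian alone_scale':        [0.2, 0.2, 0.7],
    })
    # Pick the dominant race-share column per row, then map it to a readable label
    demo['race'] = demo[['White alone_scale', 'Black/AfAmer alone_scale',
                         'Asian alone_scale']].idxmax(axis=1).replace({
        'White alone_scale': 'White',
        'Black/AfAmer alone_scale': 'African American',
        'Asian alone_scale': 'Asian',
    })
    return demo['race'].tolist()  # ['White', 'African American', 'Asian']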
def audit(df, configs, model_id=1, preprocessed=False):
    """
    Run a full aequitas audit: group metrics, bias disparities and fairness determinations.

    :param df: input dataframe with scores, labels and attribute columns
    :param configs: configuration object with the audit settings (attribute columns,
        score thresholds, reference-group method, fairness measures, plot lists, report flag)
    :param model_id: identifier of the model being audited
    :param preprocessed: if False, the dataframe is first run through preprocess_input_df
    :return: tuple (group_value_df, report)
    """
    if not preprocessed:
        df, attr_cols_input = preprocess_input_df(df)
        if not configs.attr_cols:
            configs.attr_cols = attr_cols_input
    g = Group()
    print('Welcome to Aequitas-Audit')
    print('Fairness measures requested:', ','.join(configs.fair_measures_requested))
    groups_model, attr_cols = g.get_crosstabs(df,
                                              score_thresholds=configs.score_thresholds,
                                              model_id=model_id,
                                              attr_cols=configs.attr_cols)
    print('audit: df shape from the crosstabs:', groups_model.shape)
    b = Bias()
    # todo move this to the new configs object / the attr_cols now are passed through the configs object...
    ref_groups_method = configs.ref_groups_method
    if ref_groups_method == 'predefined' and configs.ref_groups:
        bias_df = b.get_disparity_predefined_groups(groups_model, df, configs.ref_groups)
    elif ref_groups_method == 'majority':
        bias_df = b.get_disparity_major_group(groups_model, df)
    else:
        bias_df = b.get_disparity_min_metric(groups_model, df)
    print('Any NaN?: ', bias_df.isnull().values.any())
    print('bias_df shape:', bias_df.shape)

    aqp = Plot()
    if len(configs.plot_bias_metrics) == 1:
        fig1 = aqp.plot_disparity(bias_df, metrics=configs.plot_bias_metrics)
    elif len(configs.plot_bias_metrics) > 1:
        fig1 = aqp.plot_disparity_all(bias_df, metrics=configs.plot_bias_metrics)
    if len(configs.plot_bias_disparities) == 1:
        fig2 = aqp.plot_group_metric(bias_df, metrics=configs.plot_bias_disparities)
    elif len(configs.plot_bias_disparities) > 1:
        fig2 = aqp.plot_group_metric_all(bias_df, metrics=configs.plot_bias_disparities)

    f = Fairness(tau=configs.fairness_threshold)
    print('Fairness Threshold:', configs.fairness_threshold)
    print('Fairness Measures:', configs.fair_measures_requested)
    group_value_df = f.get_group_value_fairness(
        bias_df, fair_measures_requested=configs.fair_measures_requested)
    group_attribute_df = f.get_group_attribute_fairness(
        group_value_df, fair_measures_requested=configs.fair_measures_requested)
    fair_results = f.get_overall_fairness(group_attribute_df)

    if len(configs.plot_bias_metrics) == 1:
        fig3 = aqp.plot_fairness_group(group_value_df, metrics=configs.plot_bias_metrics)
    elif len(configs.plot_bias_metrics) > 1:
        fig3 = aqp.plot_fairness_group_all(group_value_df, metrics=configs.plot_bias_metrics)
    if len(configs.plot_bias_disparities) == 1:
        fig4 = aqp.plot_fairness_disparity(group_value_df,
                                           metrics=configs.plot_bias_disparities)
    elif len(configs.plot_bias_disparities) > 1:
        fig4 = aqp.plot_fairness_disparity_all(group_value_df,
                                               metrics=configs.plot_bias_disparities)
    print(fair_results)

    report = None
    if configs.report is True:
        report = audit_report_markdown(configs, group_value_df,
                                       f.fair_measures_depend, fair_results)
    return group_value_df, report
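
# A minimal sketch of how audit() might be called. The configs object here is a stand-in
# built with SimpleNamespace; the attribute names are taken from the accesses inside audit()
# above, but the actual project presumably supplies its own configuration class, and every
# value below (reference groups, measures, thresholds, plot lists) is an illustrative
# assumption only.
from types import SimpleNamespace

example_configs = SimpleNamespace(
    attr_cols=None,                   # None lets preprocess_input_df infer attribute columns
    score_thresholds=None,            # use the score column as-is
    ref_groups_method='predefined',   # 'predefined', 'majority', or fall back to min-metric
    ref_groups={'race': 'White'},     # reference group per attribute (illustrative)
    fair_measures_requested=['Statistical Parity', 'FNR Parity'],
    fairness_threshold=0.8,           # tau passed to Fairness()
    plot_bias_metrics=['fnr'],
    plot_bias_disparities=['fnr_disparity'],
    report=False,                     # skip the markdown report
)
# group_value_df, report = audit(predictions_df, example_configs)  # predictions_df: your scored data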
# **Question 12**: For the gender and race fields, please plot two metrics that are important for patient selection below and state whether there is a significant bias in your model across any of the groups, along with justification for your statement.

# **Answer:** Two key metrics for patient selection are a) the fraction of unsuitable patients that would have been included in the study (1 - precision) and b) the fraction of suitable patients that would be excluded (the false negative rate). With the current model there appears to be a bias in precision for the Asian group, which shows a much lower precision than the other groups (note that this may partly be caused by the small sample size). For b) there are slight differences among the race and gender groups, but according to Aequitas they do not amount to a significant bias.

# In[168]:

# How many patients would be falsely added to the experiment => equally low for all groups, does not seem to show bias
aqp.plot_group_metric(bdf, 'precision', min_group_size=0.01)

# In[169]:

aqp.plot_fairness_group(fdf, group_metric='precision', title=True)

# In[170]:

aqp.plot_group_metric(bdf, 'fnr', min_group_size=0.01)

# In[171]:

aqp.plot_fairness_group(fdf, group_metric='fnr', title=True)

# ## Fairness Analysis Example - Relative to a Reference Group