Example #1
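    # Snippet from a test suite; assumes the surrounding module provides
    # `import numpy as np` and `from snorkel.labeling import MajorityLabelVoter`.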
    def test_majority_label_vote(self):
        L = np.array([[0, 1, 0], [0, 1, 0], [1, 0, 0], [-1, -1, 1]])
        ml_voter = MajorityLabelVoter()
        Y_p = ml_voter.predict_proba(L)

        Y_p_true = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
        np.testing.assert_array_almost_equal(Y_p, Y_p_true)
Example #2
def get_majority_vote_label(train_df, lfs, labels):
    # Wrap the plain functions as Snorkel labeling functions and apply them.
    applier = PandasLFApplier(
        [labeling_function(name=lf.__name__)(lf) for lf in lfs])
    L_train = applier.apply(df=train_df)

    # Take a per-example majority vote over the labeling function outputs.
    majority_model = MajorityLabelVoter(cardinality=len(labels))
    preds_train = majority_model.predict(L=L_train)

    # Keep only the examples where the majority vote did not abstain (-1).
    non_abstain_idxs = np.argwhere(preds_train >= 0).flatten()
    df_filtered = train_df.iloc[non_abstain_idxs]
    preds_filtered = preds_train[non_abstain_idxs]
    return df_filtered, preds_filtered
Example #3
def apply_lf_on_data(df_train, df_dev, sentences_number):
    """
    Applies the labeling functions (from labeled_function.py) to the given train data frame.
    Other parameters: df_dev (for further developing the LFs) and sentences_number for inner use.
    Returns the train df with the tagging.
    """
    print("")
    print("Labeling Functions:")

    # Y_dev = df_dev.tag.values
    lfs = [labeled_function.masechet_then_parans, labeled_function.perek_then_parans,
           labeled_function.daf_in_parntes, labeled_function.no_double_parans,
           labeled_function.no_mishna]
    applier = PandasLFApplier(lfs=lfs)

    print("-Applying the labeling functions...")
    l_train = applier.apply(df=df_train)
    # l_dev = applier.apply(df=df_dev)

    print_analysis(l_train,lfs)

    print("-Applying the MajorityLabelVoter...")
    majority_model = MajorityLabelVoter()
    preds_train = majority_model.predict(L=l_train)

    # Put the predicted labels in the train df.
    print("-Removing unnecessary n-grams...")
    df_train['tag'] = preds_train
    for i in range(sentences_number):
        df_filter_by_sentences = df_train.loc[df_train['sentence_index'] == i]
        df_filter = df_filter_by_sentences.loc[df_filter_by_sentences['tag'] == 1]
        # this section handles cases of positively tagged ngram within a bigger positively tagged ngram, and removes it.
        for row_checked in df_filter.index:
            for row_other in df_filter.index:
                if df_filter['n_gram_id'][row_checked] != df_filter['n_gram_id'][row_other] and \
                        df_filter['text'][row_checked] in df_filter['text'][row_other]:
                    df_train = df_train[df_train.n_gram_id != df_filter['n_gram_id'][row_checked]]
                    break

    print("-Dropping the abstained and extra columns...")
    df_train = df_train.drop(["sentence_index","n_gram_id"],axis=1)
    df_train = df_train[df_train['tag'] != ABSTAIN]
    print("DONE")
    return df_train
Example #4
def compute_accuracy(support, recall):
	return np.sum(support * recall) / np.sum(support)


path_dir = sys.argv[1]  # path where data pickles are stored
num_classes = int(sys.argv[2]) # number of classes (depends on the dataset)
default_class = sys.argv[3]  # default class (can be provided as None if no default class exists)
							 # usually the most frequent class in a dataset with high imbalance

if default_class=="None":
	default_class=None
else:
	default_class=int(default_class) 

# snorkel's majority voting model
majority_model = MajorityLabelVoter(cardinality=num_classes)

# load unlabeled data
U_file = open(os.path.join(path_dir,"U_processed.p"),"rb")
U_x = pickle.load(U_file)
U_l = pickle.load(U_file)
U_m = pickle.load(U_file)
U_L = pickle.load(U_file)
U_d = pickle.load(U_file)
U_lsnork = conv_l_to_lsnork(U_l,U_m)
# indices of instances where at least one rule fired
U_fired_idx = [i for i, item in enumerate(U_m) if sum(item) > 0]

# load test data
test_file = open(os.path.join(path_dir,"test_processed.p"),"rb")
test_x = pickle.load(test_file)
Example #5
plot_label_frequency(L_train)

# %% [markdown] {"tags": ["md-exclude"]}
# We see that over half of our `train` dataset data points have 2 or fewer labels from LFs.
# Fortunately, the signal we do have can be used to train a classifier over the comment text directly, allowing it to generalize beyond what we've specified via our LFs.

# %% [markdown]
# Our goal is now to convert the labels from our LFs into a single _noise-aware_ probabilistic (or confidence-weighted) label per data point.
# A simple baseline for doing this is to take the majority vote on a per-data point basis: if more LFs voted SPAM than HAM, label it SPAM (and vice versa).
# We can test this with the
# [`MajorityLabelVoter` baseline model](https://snorkel.readthedocs.io/en/master/packages/_autosummary/labeling/snorkel.labeling.MajorityLabelVoter.html#snorkel.labeling.MajorityLabelVoter).

# %% {"tags": ["md-exclude-output"]}
from snorkel.labeling import MajorityLabelVoter

majority_model = MajorityLabelVoter()
preds_train = majority_model.predict(L=L_train)

# %%
preds_train

# %% [markdown]
# However, as we can clearly see by looking at the summary statistics of our LFs in the previous section, they are not all equally accurate and should not be treated identically. In addition to having varied accuracies and coverages, LFs may be correlated, resulting in certain signals being overrepresented in a majority-vote-based model. To handle these issues appropriately, we will instead use a more sophisticated Snorkel `LabelModel` to combine the outputs of the LFs.
#
# This model will ultimately produce a single set of noise-aware training labels, which are probabilistic or confidence-weighted labels. We will then use these labels to train a classifier for our task. For more technical details of this overall approach, see our [NeurIPS 2016](https://arxiv.org/abs/1605.07723) and [AAAI 2019](https://arxiv.org/abs/1810.02840) papers. For more info on the API, see the [`LabelModel` documentation](https://snorkel.readthedocs.io/en/master/packages/_autosummary/labeling/snorkel.labeling.LabelModel.html#snorkel.labeling.LabelModel).
#
# Note that no gold labels are used during the training process.
# The only information we need is the label matrix, which contains the output of the LFs on our training set.
# The `LabelModel` is able to learn weights for the labeling functions using only the label matrix as input.
# We also specify the `cardinality`, or number of classes.
# The `LabelModel` trains much more quickly than typical discriminative models since we only need the label matrix as input.
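
# %% [markdown]
# As a minimal sketch of what that looks like (assuming a binary task, so `cardinality=2`, and the `L_train` label matrix from above; the hyperparameters below are only illustrative):

# %% {"tags": ["md-exclude-output"]}
from snorkel.labeling.model import LabelModel  # in some Snorkel versions: from snorkel.labeling import LabelModel

# Fit the LabelModel on the label matrix alone -- no gold labels are involved.
label_model = LabelModel(cardinality=2, verbose=True)
label_model.fit(L_train=L_train, n_epochs=500, log_freq=100, seed=123)

# One probabilistic (noise-aware) label distribution per data point.
probs_train = label_model.predict_proba(L=L_train)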
Example #6
    def train_f_on_d_U(self, datafeeder, num_epochs, loss_type):
        sess = self.hls.sess

        total_batch = datafeeder.get_batches_per_epoch(f_d_U)
        batch_size = datafeeder.get_batch_size(f_d_U)

        if loss_type == 'pure-likelihood':
            train_op = self.hls.f_d_U_pure_likelihood_op
            loss_op = self.hls.f_d_U_pure_likelihood_loss
        elif loss_type == 'implication':
            train_op = self.hls.f_d_U_implication_op
            loss_op = self.hls.f_d_U_implication_loss
        elif loss_type == 'pr_loss':
            train_op = self.hls.pr_train_op
            loss_op = self.hls.pr_loss
        elif loss_type == 'gcross':
            train_op = self.hls.gcross_train_op
            loss_op = self.hls.gcross_loss
        elif loss_type == 'gcross_snorkel':
            train_op = self.hls.snork_gcross_train_op
            loss_op = self.hls.snork_gcross_loss
        elif loss_type == 'learn2reweight':
            train_op = self.hls.l2r_train_op
            loss_op = self.hls.l2r_loss
        elif loss_type == 'label_snorkel':
            train_op = self.hls.label_snorkel_train_op
            loss_op = self.hls.label_snorkel_loss
        elif loss_type == 'pure_snorkel':
            train_op = self.hls.pure_snorkel_train_op
            loss_op = self.hls.pure_snorkel_loss
        else:
            raise ValueError('Invalid loss type %s' % loss_type)

        best_saver_f_d_U = self.hls.best_savers.get_best_saver(f_d_U)
        metrics_dict = {}  #{'config': self.config}

        if 'label_snorkel' == self.config.mode or 'pure_snorkel' == self.config.mode or 'gcross_snorkel' == self.config.mode:
            label_model = LabelModel(cardinality=self.hls.num_classes,
                                     verbose=True)
            if os.path.isfile(
                    os.path.join(self.config.data_dir, "saved_label_model")):
                label_model = label_model.load(
                    os.path.join(self.config.data_dir, "saved_label_model"))
            else:
                print("LABEL MODEL NOT SAVED")
                exit()
        if 'gcross' in self.config.mode or 'learn2reweight' in self.config.mode:
            majority_model = MajorityLabelVoter(
                cardinality=self.hls.num_classes)

        with sess.as_default():
            print("Optimization started for f_d_U with %s loss!" % loss_type)
            print("Batch size: %d!" % batch_size)
            print("Batches per epoch : %d!" % total_batch)
            print("Number of epochs: %d!" % num_epochs)
            # Training cycle
            iteration = 0
            global_step = 0
            patience = 0
            for epoch in range(num_epochs):
                avg_epoch_cost = 0.

                for i in range(total_batch):
                    batch_x, batch_l, batch_m, batch_L, batch_d, batch_r =\
                            datafeeder.get_f_d_U_next_batch()

                    feed_dict = {
                        self.hls.f_d_U_adam_lr: self.config.f_d_U_adam_lr,
                        self.hls.f_d_U_x: batch_x,
                        self.hls.f_d_U_l: batch_l,
                        self.hls.f_d_U_m: batch_m,
                        self.hls.f_d_U_L: batch_L,
                        self.hls.f_d_U_d: batch_d,
                        self.hls.f_d_U_r: batch_r
                    }

                    batch_lsnork = conv_l_to_lsnork(batch_l, batch_m)

                    if 'label_snorkel' == self.config.mode or 'pure_snorkel' == self.config.mode or 'gcross_snorkel' == self.config.mode:
                        batch_snork_L = label_model.predict_proba(
                            L=batch_lsnork)  #snorkel_probs
                        feed_dict[self.hls.f_d_U_snork_L] = batch_snork_L

                    if 'gcross' == self.config.mode or 'learn2reweight' == self.config.mode:
                        batch_snork_L = majority_model.predict(
                            L=batch_lsnork)  #majority votes
                        batch_snork_L = np.eye(
                            self.hls.num_classes)[batch_snork_L]  #one hot rep
                        feed_dict[self.hls.f_d_U_snork_L] = batch_snork_L

                    merge_dict_a_into_b(self.hls.dropout_train_dict, feed_dict)
                    # Run optimization op (backprop) and cost op (to get loss value)
                    _, cost, num_d, f_d_U_global_step = sess.run(
                        [
                            train_op, loss_op, self.hls.f_d_U_num_d,
                            self.hls.f_d_U_global_step
                        ],
                        feed_dict=feed_dict)

                    global_epoch = f_d_U_global_step / total_batch
                    # This assertion is valid only if true U labels are available but not being used such as for
                    # synthetic data.
                    assert np.all(batch_L <= self.hls.num_classes)

                    avg_epoch_cost += cost / total_batch
                    cost1 = (avg_epoch_cost * total_batch) / (i + 1)
                    global_step += 1

                # Compute and report metrics, update checkpoints after each epoch
                print("\n========== epoch : {} ============\n".format(epoch))
                print("cost: {}\n".format(cost1))
                print("patience: {}\n".format(patience))
                precision, recall, f1_score, support = self.hls.test.test_f(
                    datafeeder)
                self.compute_f_d_metrics(metrics_dict, precision, recall,
                                         f1_score, support, global_epoch,
                                         f_d_U_global_step)
                print("\nmetrics_dict: ", metrics_dict)
                print()
                self.report_f_d_perfs_to_tensorboard(cost1, metrics_dict,
                                                     global_step)
                did_improve = self.maybe_save_metrics_dict(f_d_U, metrics_dict)
                if did_improve:
                    patience = 0  # reset patience if the primary metric improved
                else:
                    patience += 1
                    if patience > self.config.early_stopping_p:
                        print("bye! stopping early!......")
                        break
                # Save checkpoint
                print()
                self.hls.mru_saver.save(global_step)
                print()
                best_saver_f_d_U.save_if_best(
                    metrics_dict[self.config.f_d_primary_metric])
                print()
                global_step += 1
            print("Optimization Finished for f_d_U!")
Example #7
def get_role_probs(lf_train: pd.DataFrame,
                   filter_abstains: bool = False,
                   lfs: Optional[List[labeling_function]] = None,
                   lf_dev: pd.DataFrame = None,
                   seed: Optional[int] = None,
                   tmp_path: Union[str, Path] = None,
                   use_majority_label_voter=False) -> pd.DataFrame:
    """
    Takes "raw" data frame, builds argument role examples, (trains LabelModel), calculates event_argument_probs
    and returns merged argument role examples with event_argument_probs.
    :param use_majority_label_voter: Whether to use a majority label voter instead of the snorkel label model
    :param seed: Seed for use in label model (mu initialization)
    :param filter_abstains: Filters rows where all labeling functions abstained
    :param lf_train: Training dataset which will be labeled using Snorkel
    :param lfs: List of labeling functions
    :param lf_dev: Optional development dataset that can be used to set a prior for the class balance
    :param tmp_path: Path to temporarily store variables that are shared during random repeats
    :return: Merged event argument role examples with event_argument_probs
    """
    df_train, L_train = None, None
    df_dev, Y_dev, L_dev = None, None, None
    tmp_train_path, tmp_dev_path = None, None

    # For random repeats try to load pickled variables from first run as they are shared
    if tmp_path:
        tmp_train_path = Path(tmp_path).joinpath("role_train.pkl")
        os.makedirs(os.path.dirname(tmp_train_path), exist_ok=True)
        if tmp_train_path.exists():
            with open(tmp_train_path, 'rb') as pickled_train:
                df_train, L_train = pickle.load(pickled_train)
        if lf_dev is not None:
            tmp_dev_path = Path(tmp_path).joinpath("role_dev.pkl")
            os.makedirs(os.path.dirname(tmp_dev_path), exist_ok=True)
            if tmp_dev_path.exists():
                with open(tmp_dev_path, 'rb') as pickled_dev:
                    df_dev, Y_dev, L_dev = pickle.load(pickled_dev)

    if lfs is None:
        lfs = get_role_list_lfs()
    applier = PandasLFApplier(lfs)

    if L_train is None or df_train is None:
        df_train, _ = build_event_role_examples(lf_train)
        logger.info("Running Event Role Labeling Function Applier")
        L_train = applier.apply(df_train)
        if tmp_path:
            with open(tmp_train_path, 'wb') as pickled_train:
                pickle.dump((df_train, L_train), pickled_train)
    if lf_dev is not None and any(element is None
                                  for element in [df_dev, Y_dev, L_dev]):
        df_dev, Y_dev = build_event_role_examples(lf_dev)
        logger.info("Running Event Role Labeling Function Applier on dev set")
        L_dev = applier.apply(df_dev)
        if tmp_path:
            with open(tmp_dev_path, 'wb') as pickled_dev:
                pickle.dump((df_dev, Y_dev, L_dev), pickled_dev)

    if use_majority_label_voter:
        logger.info(
            "Using MajorityLabelVoter to calculate role class probabilities")
        label_model = MajorityLabelVoter(cardinality=11)
    else:
        label_model = LabelModel(cardinality=11, verbose=True)
        logger.info(
            "Fitting LabelModel on the data and predicting role class probabilities"
        )
        if seed:
            label_model.fit(L_train=L_train,
                            n_epochs=5000,
                            log_freq=500,
                            seed=seed,
                            Y_dev=Y_dev)
        else:
            label_model.fit(L_train=L_train,
                            n_epochs=5000,
                            log_freq=500,
                            Y_dev=Y_dev)

    # Evaluate label model on development data
    if df_dev is not None and Y_dev is not None:
        metrics = ["accuracy", "f1_micro", "f1_macro"]
        logger.info("Evaluate on the dev set")
        label_model_metrics = label_model.score(L=L_dev,
                                                Y=Y_dev,
                                                tie_break_policy="random",
                                                metrics=metrics)
        if use_majority_label_voter:
            logger.info('Role Majority Label Voter Metrics')
        else:
            logger.info('Role Label Model Metrics')
        logger.info(
            f"{'Accuracy:':<25} {label_model_metrics['accuracy'] * 100:.1f}%")
        logger.info(
            f"{'F1 (micro averaged):':<25} {label_model_metrics['f1_micro'] * 100:.1f}%"
        )
        logger.info(
            f"{'F1 (macro averaged):':<25} {label_model_metrics['f1_macro'] * 100:.1f}%"
        )

    event_role_probs = label_model.predict_proba(L_train)

    if filter_abstains:
        df_train_filtered, probs_train_filtered = filter_unlabeled_dataframe(
            X=df_train, y=event_role_probs, L=L_train)

        merged_event_role_examples = merge_event_role_examples(
            df_train_filtered, probs_train_filtered)
    else:
        # Multiplies probabilities of abstains with zero so that the example is treated as padding in the end model
        merged_event_role_examples = merge_event_role_examples(
            df_train, utils.zero_out_abstains(event_role_probs, L_train))
    return merged_event_role_examples