Python Annotations.compute_ambiguity Examples

Programming language: Python

Namespace / package name: medacy.data.annotations

Class / type: Annotations

Method / function: compute_ambiguity

Examples on hotexamples.com: 3

Python Annotations.compute_ambiguity: 3 examples found. These are the top-rated Python examples of medacy.data.annotations.Annotations.compute_ambiguity, extracted from open source projects.

Frequently used methods

Annotations(30)

compute_ambiguity(3)

add_entity(2)

compute_confusion_matrix(2)

difference(2)

get_entity_annotations(2)

compute_counts(1)

get_labels(1)

intersection(1)

to_ann(1)

Example #1

File: test_annotation.py Project: zazabar/medaCy

    def test_compute_ambiguity(self):
        ann_1 = Annotations(self.ann_path_1)
        ann_1_copy = Annotations(self.ann_path_1)
        ambiguity = ann_1.compute_ambiguity(ann_1_copy)
        # The number of overlapping spans for the selected ann file is known to be 25
        self.assertEqual(25, len(ambiguity))
        # Manually introduce ambiguity by changing the name of an entity in the copy
        first_tuple = ann_1_copy.annotations[0]
        ann_1_copy.annotations[0] = ('different_name', first_tuple[1], first_tuple[2], first_tuple[3])
        ambiguity = ann_1.compute_ambiguity(ann_1_copy)
        # Check that this increased the number of ambiguous entries by one
        self.assertEqual(26, len(ambiguity))
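
The test above rests on the idea that two spans are ambiguous when their character ranges intersect but their labels differ. The following is a minimal standalone sketch of that idea, not medaCy's implementation. The tuple layout `(label, start, end, text)` and the helper names `spans_overlap` and `find_ambiguous_pairs` are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed layout for illustration only: (label, start, end, text)
Entity = Tuple[str, int, int, str]


def spans_overlap(a: Entity, b: Entity) -> bool:
    """Return True if the half-open character ranges [start, end) intersect."""
    return a[1] < b[2] and b[1] < a[2]


def find_ambiguous_pairs(gold: List[Entity], pred: List[Entity]) -> List[Tuple[Entity, Entity]]:
    """Collect (gold, predicted) pairs that overlap but disagree on the label."""
    return [
        (g, p)
        for g in gold
        for p in pred
        if spans_overlap(g, p) and g[0] != p[0]
    ]


# Hypothetical sample data; the second prediction disagrees on the label.
gold = [("Drug", 0, 7, "aspirin"), ("Dose", 8, 13, "81 mg")]
pred = [("Drug", 0, 7, "aspirin"), ("Strength", 8, 13, "81 mg")]
pairs = find_ambiguous_pairs(gold, pred)
```

In this sketch, comparing `gold` with itself yields no pairs, and relabeling a single entity in the copy, as the test does, adds exactly one pair.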

Example #2

File: dataset.py Project: zazabar/medaCy

    def compute_ambiguity(self, dataset):
        """
        Finds occurrences of spans from 'dataset' that intersect with a span from this annotation but do not have that span's label.
        If 'dataset' comprises a model's predictions, this method provides a strong indicator
        of the model's inability to disambiguate between entities. For a full analysis, compute a confusion matrix.

        :param dataset: a Dataset object containing a predicted version of this dataset.
        :return: a dictionary containing the ambiguity computations for each (gold, predicted) file pair
        """
        if not isinstance(dataset, Dataset):
            raise ValueError("dataset must be instance of Dataset")

        # verify files are consistent
        diff = set(file.ann_path.split(os.sep)[-1] for file in self) - set(file.ann_path.split(os.sep)[-1] for file in dataset)
        if diff:
            raise ValueError("Dataset of predictions is missing the files: " + str(list(diff)))

        # Dictionary storing ambiguity over dataset
        ambiguity_dict = {}

        for gold_data_file in self:
            prediction_iter = iter(dataset)
            prediction_data_file = next(prediction_iter)
            while str(gold_data_file) != str(prediction_data_file):
                prediction_data_file = next(prediction_iter)

            gold_annotation = Annotations(gold_data_file.ann_path)
            pred_annotation = Annotations(prediction_data_file.ann_path)

            # compute ambiguity on the Annotation file level
            ambiguity_dict[str(gold_data_file)] = gold_annotation.compute_ambiguity(pred_annotation)

        return ambiguity_dict
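
The method first verifies that every gold file has a counterpart among the predictions and then pairs the files up. A minimal sketch of that pairing step on plain path strings is shown below; it uses a dictionary lookup rather than rescanning the predictions for each gold file. The function name `pair_by_file_name` and the example paths are hypothetical, and this is not medaCy code.

```python
import os
from typing import Dict, Iterable, List, Tuple


def pair_by_file_name(gold_paths: Iterable[str], pred_paths: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair each gold path with the prediction path that shares its base name."""
    # Index predictions by base name so each lookup is a single dict access.
    pred_by_name: Dict[str, str] = {os.path.basename(p): p for p in pred_paths}
    gold_list = list(gold_paths)

    # Fail early if any gold file has no predicted counterpart.
    missing = sorted(
        os.path.basename(g) for g in gold_list
        if os.path.basename(g) not in pred_by_name
    )
    if missing:
        raise ValueError(f"Dataset of predictions is missing the files: {missing}")

    return [(g, pred_by_name[os.path.basename(g)]) for g in gold_list]


# Hypothetical paths; the prediction order deliberately differs from the gold order.
pairs = pair_by_file_name(
    ["gold/a.ann", "gold/b.ann"],
    ["pred/b.ann", "pred/a.ann"],
)
```

Each resulting pair could then be handed to the per-file comparison, in the same way the loop above builds two `Annotations` objects per file.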

Example #3

File: dataset.py Project: veeravalliss/medaCy

    def compute_ambiguity(self, dataset):
        """
        Finds occurrences of spans from 'dataset' that intersect with a span from this annotation but do not have that span's label.
        If 'dataset' comprises a model's predictions, this method provides a strong indicator
        of the model's inability to disambiguate between entities. For a full analysis, compute a confusion matrix.

        :param dataset: a Dataset object containing a predicted version of this dataset.
        :return: a dictionary containing the ambiguity computations on each gold, predicted file pair
        """
        if not isinstance(dataset, Dataset):
            raise ValueError("dataset must be instance of Dataset")

        # verify files are consistent
        diff = {d.file_name for d in self} - {d.file_name for d in dataset}
        if diff:
            raise ValueError(
                f"Dataset of predictions is missing the files: {repr(diff)}")

        # Dictionary storing ambiguity over dataset
        ambiguity_dict = {}

        for gold_data_file in self:
            prediction_iter = iter(dataset)
            prediction_data_file = next(prediction_iter)
            while str(gold_data_file) != str(prediction_data_file):
                prediction_data_file = next(prediction_iter)

            gold_annotation = Annotations(gold_data_file.ann_path)
            pred_annotation = Annotations(prediction_data_file.ann_path)

            # compute ambiguity on the Annotation file level
            ambiguity_dict[str(gold_data_file)] = (
                gold_annotation.compute_ambiguity(pred_annotation))

        return ambiguity_dict
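
Both versions return a dictionary keyed by the string form of each gold file. One way to get a quick overview is to count the entries per file, which relies only on `len()`, as the unit test in the first example does. The helper `summarize_ambiguity` and the sample data are hypothetical, and the shape of each per-file value is an assumption rather than something the excerpts confirm.

```python
from typing import Dict, List, Sized, Tuple


def summarize_ambiguity(ambiguity_dict: Dict[str, Sized]) -> List[Tuple[str, int]]:
    """Return (file, count) pairs sorted from most to least ambiguous."""
    counts = [(name, len(value)) for name, value in ambiguity_dict.items()]
    # Sort by descending count, breaking ties alphabetically by file name.
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# Stand-in data; real values would come from a Dataset-level compute_ambiguity call.
example = {"a.ann": {"x": 1, "y": 2}, "b.ann": {}, "c.ann": {"z": 3}}
summary = summarize_ambiguity(example)
```

Files at the top of the summary are the ones where predicted spans most often overlap a gold span with a different label.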