def get_target_sentiment_distribution(dataset: TargetTextCollection) -> Dict[str, float]:
    '''
    :param dataset: Collection whose targets' sentiment labels are counted.
    :returns: A dictionary mapping each sentiment label to the percentage
              (0-100, rounded to 2 decimal places) of targets in the
              collection that carry that label.
    '''
    target_sentiment_distribution = Counter()
    for target_text in dataset.values():
        # Skip samples with no sentiment labels (Counter.update(None) raises).
        if target_text['target_sentiments'] is not None:
            target_sentiment_distribution.update(target_text['target_sentiments'])
    number_targets = dataset.number_targets()
    for key, value in target_sentiment_distribution.items():
        # Round AFTER scaling to a percentage. Rounding the raw fraction to
        # 2 d.p. first (the previous behaviour) truncated e.g. 1/3 to
        # 33.0 instead of 33.33.
        target_sentiment_distribution[key] = round((value / number_targets) * 100, 2)
    return dict(target_sentiment_distribution)
def average_target_per_sentences(collection: TargetTextCollection,
                                 sentence_must_contain_targets: bool) -> float:
    '''
    Compute the Average Targets per Sentence (ATS) for a collection.

    :param collection: Collection to calculate the ATS on.
    :param sentence_must_contain_targets: When True, only sentences that
        contain at least one target count towards the denominator;
        otherwise every sentence in the collection is counted.
    :returns: Number of targets / number of sentences.
    '''
    if sentence_must_contain_targets:
        sentences = collection.samples_with_targets()
    else:
        sentences = collection
    return float(collection.number_targets()) / float(len(sentences))
def get_sentiment_counts(collection: TargetTextCollection, sentiment_key: str,
                         normalised: bool = True) -> Dict[str, float]:
    '''
    :param collection: The collection containing the sentiment data
    :param sentiment_key: The key in each TargetText within the collection
                          that contains the True sentiment value.
    :param normalised: Whether to normalise the values in the dictionary
                       by the number of targets in the collection.
    :returns: A dictionary where keys are sentiment values and the values
              are the number of times they occur in the collection
              (fractions of the total when `normalised` is True).
    :raises ValueError: If the number of sentiment labels found does not
                        match `collection.number_targets()`.
    '''
    sentiment_count = defaultdict(int)
    for target_text in collection.values():
        sentiments = target_text[sentiment_key]
        if sentiments is None:
            continue
        for sentiment_value in sentiments:
            sentiment_count[sentiment_value] += 1
    number_targets = collection.number_targets()
    # The previous version used a bare `assert`, which is silently stripped
    # when Python runs with -O; raise explicitly so the sanity check always
    # fires.
    total_counted = sum(sentiment_count.values())
    if number_targets != total_counted:
        raise ValueError(f'Number of targets ({number_targets}) does not '
                         f'match the number of sentiment labels counted '
                         f'({total_counted})')
    if normalised:
        for sentiment, count in sentiment_count.items():
            sentiment_count[sentiment] = float(count) / float(number_targets)
    return dict(sentiment_count)
def dataset_length(task: str, dataset: TargetTextCollection) -> int:
    '''
    Return the size of `dataset` relevant to the given task.

    :param task: Either 'extraction' (sentence-level) or 'sentiment'
                 (target-level).
    :param dataset: The collection to measure.
    :returns: The number of sentences for 'extraction', the number of
              targets for 'sentiment', and 0 for any other task name.
    '''
    length_getters = {
        'extraction': lambda: len(dataset),
        'sentiment': dataset.number_targets,
    }
    getter = length_getters.get(task)
    return getter() if getter is not None else 0
            # Augmentation succeeded: keep the offset-adjusted target object.
            all_targets.append(aug_target_object)
        except OverLappingTargetsError:
            # This needs to be skipped as when targets overlap it is very
            # difficult to easily calculate all possible span offsets
            # for all other targets. Furthermore there are only 3
            # occasions this happens so it is a very rare occurrence.
            continue
    return all_targets


if __name__ == '__main__':
    # CLI entry point: loads an augmented dataset, expands each sample's
    # augmented targets into stand-alone samples via add_augmented_targets,
    # and saves the re-formatted collection as JSON.
    parser = argparse.ArgumentParser()
    parser.add_argument("augmented_dataset", type=parse_path,
                        help='File path the augmented dataset')
    parser.add_argument("save_fp", type=parse_path,
                        help='File path to save the new re-formated augmented dataset')
    args = parser.parse_args()

    augmented_data_fp = args.augmented_dataset
    save_fp = args.save_fp

    augmented_dataset = TargetTextCollection.load_json(augmented_data_fp)
    new_dataset = []
    for target_object in augmented_dataset.values():
        # NOTE(review): remove_repeats=True presumably drops duplicate
        # augmented targets -- confirm against add_augmented_targets.
        augmented_targets = add_augmented_targets(target_object,
                                                  remove_repeats=True)
        new_dataset.extend(augmented_targets)
    new_dataset = TargetTextCollection(new_dataset)
    # Sample count here is the number of TARGETS, not sentences.
    number_samples = new_dataset.number_targets()
    print(f'The number of samples in the dataset {number_samples}')
    new_dataset.to_json_file(save_fp)