Python make_corpus Examples

Programming language: Python

Namespace/Package: corpkit

Method/Function: make_corpus

Examples at hotexamples.com: 2

Python make_corpus - 2 examples found. These are the top-rated real-world examples of corpkit.make_corpus extracted from open source projects.

Example #1

File: corpus.py Project: kareem180/corpkit

```python
def parse(self, corenlppath=False, operations=False, copula_head=True,
          speaker_segmentation=False, memory_mb=False, *args, **kwargs):
    """
    Parse an unparsed corpus, saving to disk

    >>> parsed = corpus.parse(speaker_segmentation = True)

    :param corenlppath: folder containing corenlp jar files
    :type corenlppath: str

    :param operations: which kinds of annotations to do
    :type operations: str

    :param speaker_segmentation: add speaker name to parser output if your corpus is script-like
    :type speaker_segmentation: bool

    :param memory_mb: Amount of memory in MB for parser
    :type memory_mb: int

    :param copula_head: Make copula head in dependency parse
    :type copula_head: bool

    :returns: The newly created :class:`corpkit.corpus.Corpus`
    """
    from corpkit import make_corpus
    from corpkit.corpus import Corpus
    #from corpkit.process import determine_datatype
    #dtype, singlefile = determine_datatype(self.path)
    if self.datatype != 'plaintext':
        raise ValueError('parse method can only be used on plaintext corpora.')
    # Drop caller-supplied flags so they cannot clash with the explicit ones below
    kwargs.pop('parse', None)
    kwargs.pop('tokenise', None)
    return Corpus(make_corpus(self.path,
                              parse=True,
                              tokenise=False,
                              corenlppath=corenlppath,
                              operations=operations,
                              copula_head=copula_head,
                              speaker_segmentation=speaker_segmentation,
                              memory_mb=memory_mb,
                              *args,
                              **kwargs))
```

Example #2

File: corpus.py Project: kareem180/corpkit

```python
def tokenise(self, *args, **kwargs):
    """
    Tokenise a plaintext corpus, saving to disk

    >>> tok = corpus.tokenise()

    :param nltk_data_path: path to tokeniser if not found automatically
    :type nltk_data_path: str

    :returns: The newly created :class:`corpkit.corpus.Corpus`
    """
    from corpkit import make_corpus
    from corpkit.corpus import Corpus
    #from corpkit.process import determine_datatype
    #dtype, singlefile = determine_datatype(self.path)
    if self.datatype != 'plaintext':
        raise ValueError('tokenise method can only be used on plaintext corpora.')
    # Drop caller-supplied flags so they cannot clash with the explicit ones below
    kwargs.pop('parse', None)
    kwargs.pop('tokenise', None)
    return Corpus(make_corpus(self.path,
                              parse=False,
                              tokenise=True,
                              *args,
                              **kwargs))
```
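Both methods call `kwargs.pop('parse', None)` and `kwargs.pop('tokenise', None)` before forwarding `**kwargs`. The sketch below illustrates one likely reason: in Python, passing the same keyword both explicitly and through `**kwargs` raises a `TypeError`. It does not use corpkit at all; `make_corpus_stub`, `tokenise_without_pop`, `tokenise_with_pop`, and `collides` are hypothetical names invented for this illustration, and the stub only records the arguments it receives.

```python
def make_corpus_stub(path, parse=False, tokenise=False, **kwargs):
    """Hypothetical stand-in for corpkit.make_corpus; it only records its arguments."""
    return {"path": path, "parse": parse, "tokenise": tokenise, "extra": kwargs}


def tokenise_without_pop(path, **kwargs):
    # kwargs is forwarded untouched, so a caller-supplied 'parse' or
    # 'tokenise' collides with the explicit keywords below.
    return make_corpus_stub(path, parse=False, tokenise=True, **kwargs)


def tokenise_with_pop(path, **kwargs):
    # Same pattern as the examples above: discard the two flags first,
    # so the explicit values always win and no collision can occur.
    kwargs.pop("parse", None)
    kwargs.pop("tokenise", None)
    return make_corpus_stub(path, parse=False, tokenise=True, **kwargs)


def collides(path, **kwargs):
    """Return True if the unguarded wrapper raises TypeError for these kwargs."""
    try:
        tokenise_without_pop(path, **kwargs)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    # 'data/speeches' and '/tmp/nltk' are placeholder paths for the demo.
    print(collides("data/speeches", parse=True))
    print(tokenise_with_pop("data/speeches", parse=True, nltk_data_path="/tmp/nltk"))
```

With the guard in place, any other keyword (here `nltk_data_path`) still passes through unchanged, while a stray `parse` or `tokenise` from the caller is silently ignored instead of raising.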