Example #1
 # Assumes: from unicodedata import category as unicat
 def clean_tagged_text(self, tagged_text):
     """
     Remove punctuation tokens from tagged text, keeping commas.
     """
     # A token counts as punctuation when every character is in a Unicode
     # "P" (punctuation) category; the comma is explicitly exempted.
     punct_tagged = lambda word: all(
         unicat(char).startswith("P") and char != "," for char in word)
     cleaned = filter(lambda t: not punct_tagged(t[0]), tagged_text)
     return list(cleaned)
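A minimal usage sketch of this method's core logic, assuming unicat is unicodedata.category (the sample tagged tokens are made up):

from unicodedata import category as unicat

tagged = [("Hello", "UH"), (",", ","), ("world", "NN"), ("!", ".")]
punct_tagged = lambda word: all(
    unicat(char).startswith("P") and char != "," for char in word)
print([t for t in tagged if not punct_tagged(t[0])])
# -> [('Hello', 'UH'), (',', ','), ('world', 'NN')]  (the comma survives, '!' is dropped)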
Example #2
def normalize(sent):
    """
    Removes punctuation from a tokenized/tagged sentence and
    lowercases words.
    """
    # Assumes: from unicodedata import category as unicat
    is_punct = lambda word: all(unicat(c).startswith('P') for c in word)
    sent = filter(lambda t: not is_punct(t[0]), sent)
    sent = map(lambda t: (t[0].lower(), t[1]), sent)
    return list(sent)
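A quick check of the function above on a hypothetical POS-tagged sentence, again assuming unicat is unicodedata.category:

from unicodedata import category as unicat

# With normalize() from Example #2 in scope:
print(normalize([("The", "DT"), ("cat", "NN"), (".", ".")]))
# -> [('the', 'DT'), ('cat', 'NN')]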
Example #3
 def normalize(self, sent):
     """
     Removes punctuation from a tokenized/tagged sentence and
     lowercases words.
     """
     # Assumes: from unicodedata import category as unicat
     is_punct = lambda word: all(unicat(char).startswith('P') for char in word)
     sent = filter(lambda t: not is_punct(t[0]), sent)
     sent = map(lambda t: (t[0].lower(), t[1]), sent)
     return list(sent)
Example #4
def normalize(sent):
    """
    Tokenizes a raw tweet, removes URLs and punctuation tokens,
    and lowercases words.
    """
    # Assumes: from unicodedata import category as unicat
    # Assumes: tweet_tokenizer = nltk.tokenize.TweetTokenizer()
    sent = tweet_tokenizer.tokenize(sent)
    sent = [x for x in sent if 'http' not in x]
    is_punct = lambda word: all(unicat(char).startswith('P') for char in word)
    # Tokens here are plain strings, so test the whole token rather than
    # only its first character.
    sent = filter(lambda t: not is_punct(t), sent)
    sent = map(lambda t: t.lower(), sent)
    return list(sent)
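A usage sketch for the tweet variant; TweetTokenizer is NLTK's tweet-aware tokenizer, and the sample tweet is invented:

from unicodedata import category as unicat
from nltk.tokenize import TweetTokenizer

tweet_tokenizer = TweetTokenizer()

# With normalize() from Example #4 in scope:
print(normalize("Nice day! See https://example.com :)"))
# -> ['nice', 'day', 'see']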
Example #5
 def normalize(self, sent):
     """
     Removes punctuation from a tokenized/tagged sentence and
     lowercases words; returns an empty list unless exactly two
     tokens remain after filtering.
     """
     # Assumes: from unicodedata import category as unicat
     is_punct = lambda word: all(unicat(char).startswith('P') for char in word)
     sent = filter(lambda t: not is_punct(t[0]), sent)
     sent = list(sent)
     if len(sent) == 2:
         # Lowercase the word in each remaining (word, tag) pair.
         sent = list(map(lambda t: (t[0].lower(), t[1]), sent))
     else:
         sent = list()
     return sent
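A standalone sketch of the length gate above (the tagged inputs are hypothetical): only sentences with exactly two non-punctuation tokens come back lowercased; everything else collapses to an empty list.

from unicodedata import category as unicat

tagged = [("Hi", "UH"), ("there", "RB"), ("!", ".")]
is_punct = lambda word: all(unicat(char).startswith('P') for char in word)
kept = [t for t in tagged if not is_punct(t[0])]
# Lowercase only when exactly two tokens remain, as in the method above.
result = [(w.lower(), tag) for w, tag in kept] if len(kept) == 2 else []
print(result)  # -> [('hi', 'UH'), ('there', 'RB')]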