def load_articles(limit):
    """Yield up to *limit* prepared article records from the `articles` store.

    Records whose URL points at Spiegel International are skipped. Each
    yielded dict carries the raw URL and body text plus precomputed token
    and bigram lists (via the module-level `tokenize` / `make_bigrams`).
    """
    for record in articles.find(_limit=limit):
        url = record['article_url']
        # Skip the English-language Spiegel International section.
        if 'spiegel.de/international' in url:
            continue
        body = record['body_text']
        yield {
            'url': url,
            'text': body,
            'bigrams': list(make_bigrams(body)),
            'tokens': list(tokenize(body)),
        }
def article_terms(model, article):
    """Score every token of *article* by augmented TF-IDF against *model*.

    Term frequency uses the double-normalized ("augmented") form
    ``0.5 + 0.5 * count / max_count``, which damps the bias toward long
    documents. Each TF is multiplied by the term's IDF weight looked up in
    ``model['terms']`` (0 for unseen terms).

    :param model: dict with a ``'terms'`` mapping of term -> IDF weight.
    :param article: dict with a ``'body_text'`` string to tokenize.
    :returns: list of ``(term, score)`` pairs sorted by score, descending;
        empty list if the article body yields no tokens.
    """
    terms = defaultdict(int)
    for token in tokenize(article['body_text']):
        terms[token] += 1
    total = float(sum(terms.values()))
    if total == 0:
        # No tokens at all -- nothing to score.
        return []
    max_f = max(terms.values()) / total
    tf_idfs = {}
    for term, count in terms.items():
        # (count/total)/max_f reduces to count/max_count: augmented TF.
        tf = 0.5 + ((0.5 * (count / total)) / max_f)
        # Unknown terms get IDF 0 and therefore score 0.
        tf_idfs[term] = tf * model['terms'].get(term, 0)
    # Python-3 compatible sort key (the original used Python-2-only
    # tuple-parameter lambda syntax, a SyntaxError on Python 3).
    return sorted(tf_idfs.items(), key=lambda kv: kv[1], reverse=True)
# NOTE(review): this redefines article_terms and shadows the earlier,
# whitespace-identical definition above -- one of the two copies should
# almost certainly be deleted.
def article_terms(model, article):
    """Score every token of *article* by augmented TF-IDF against *model*.

    Term frequency uses the double-normalized ("augmented") form
    ``0.5 + 0.5 * count / max_count``. Each TF is multiplied by the term's
    IDF weight from ``model['terms']`` (0 for unseen terms).

    :param model: dict with a ``'terms'`` mapping of term -> IDF weight.
    :param article: dict with a ``'body_text'`` string to tokenize.
    :returns: list of ``(term, score)`` pairs sorted by score, descending;
        empty list if the article body yields no tokens.
    """
    counts = defaultdict(int)
    for token in tokenize(article['body_text']):
        counts[token] += 1
    total = float(sum(counts.values()))
    if total == 0:
        # Empty body: nothing to score.
        return []
    max_f = max(counts.values()) / total
    scores = {}
    for term, count in counts.items():
        # (count/total)/max_f == count/max_count: augmented TF.
        tf = 0.5 + ((0.5 * (count / total)) / max_f)
        scores[term] = tf * model['terms'].get(term, 0)
    # Fixed: the original `key=lambda (a, b): b` is Python-2-only syntax
    # (tuple parameter unpacking) and fails to parse on Python 3.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)