from datastorage import Stock  # interacts with MongoDB

db = Stock()
site = db.url()
while site:
    # print the current record's URL before fetching the next one,
    # so the loop ends cleanly once db.url() returns nothing
    print(site['url'])
    # db.update(site)
    site = db.url()
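Neither snippet shows datastorage.Stock itself. As a point of reference, here is a minimal sketch of what it could look like, assuming a pymongo backend; the method names (count, save_data, url, visit, update) are inferred from the calls in these scripts, and the database/collection names are made up.

from pymongo import MongoClient


class Stock(object):
    """Minimal sketch of the storage wrapper; interface inferred from usage."""

    def __init__(self, host='localhost', port=27017):
        # 'crawler' and 'pages' are hypothetical database/collection names
        self.col = MongoClient(host, port)['crawler']['pages']

    def count(self):
        # number of stored records
        return self.col.count_documents({})

    def save_data(self, doc):
        # insert a new record, e.g. {'visit': False, 'url': ''}
        self.col.insert_one(doc)

    def url(self):
        # one record that has not been visited yet, or None
        return self.col.find_one({'visit': False})

    def visit(self):
        # cursor over every visited record
        return self.col.find({'visit': True})

    def update(self, site):
        # mark a record as visited
        self.col.update_one({'_id': site['_id']}, {'$set': {'visit': True}})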
import re

from datastorage import Stock

db = Stock()
for page in db.visit():
    try:
        # collapse runs of whitespace and non-breaking spaces into single spaces
        page['text'] = u" ".join(page['text'].replace(u"\xa0", u" ").strip().split())
        # keep only letters, spaces and dashes, then turn separators into spaces
        print(str(page['_id']) + " " +
              re.sub(r'[-_/]', ' ',
                     re.sub(r'[^a-zA-Z\- ]', '', page['text'].lower())))
    except Exception:
        continue
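To see what this normalization produces, a small self-contained example (the input string is invented):

import re

sample = u"Caf\xe9\xa0 menu - 2024"  # made-up input with an accent and a non-breaking space
text = u" ".join(sample.replace(u"\xa0", u" ").strip().split())
print(re.sub(r'[-_/]', ' ', re.sub(r'[^a-zA-Z\- ]', '', text.lower())))
# prints "caf menu" plus leftover spaces: accented letters and digits are dropped outright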
import re
from unicodedata import normalize

from datastorage import Stock


def clean_html(data):  # helper name is assumed; only its tail survived
    # remove the css styles (re.DOTALL so multi-line style blocks match)
    p = re.compile(r'<style[^<>]*?>.*?</style>', re.DOTALL)
    data = p.sub('', data)
    # remove html comments (pattern reconstructed; the snippet's was empty)
    p = re.compile(r'<!--.*?-->', re.DOTALL)
    data = p.sub('', data)
    # remove all the remaining tags
    p = re.compile(r'<[^<]*?>')
    data = p.sub('', data)
    return data


db = Stock()
pages = db.visit()
for page in pages:
    try:
        # prefer the stored HTML when it looks substantial, else fall back to text
        if len(page['html']) > 100:
            html = page['html']
        else:
            html = page['text']
        clear_html = re.sub('<[^<]+?>', '', html)
        # strip accents, force ASCII and lowercase everything
        normalizado = normalize('NFKD', clear_html.decode('utf-8')).encode('ASCII', 'ignore').lower()
        text = re.sub(r'[^a-zA-Z\- ]', '', normalizado)
        # separators, leftover punctuation and very long (13+ letter) tokens become spaces
        text = re.sub(r'[-_/]|[a-z]{13,}|\W+|[ \t]+', ' ', text)
        token = text.split()
    except Exception:
        continue
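A quick sanity check of the reconstructed clean_html helper on a made-up fragment:

raw = "<style type='text/css'>p{color:red}</style><!-- nav --><p>Hello <b>world</b></p>"
print(clean_html(raw))
# prints "Hello world"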
import time
import hashlib
# import nltk  # NLP

from spider import Spider      # class that visits the web sites
from datastorage import Stock  # interacts with MongoDB
from unidecode import unidecode

stop = True
db = Stock()  # storage instance

# seed the collection so db.url() has something to return on the first run
if not db.count():
    db.save_data({'visit': False, 'url': ''})

while stop:
    site = db.url()  # get an unvisited URL
    if not site:
        break
    url = site['url']
    m = hashlib.sha1()  # sha1 digest for the fetched page
    date = time.strftime("%Y-%m-%d %H:%M")  # %M = minutes
    print("[ Visit ] " + url)
    response = Spider.get_source(url)  # fetch the HTML for the URL
    if not response:  # no response: mark the URL as visited and move on
        db.update(site)
        continue
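spider.Spider is not shown either. A minimal sketch of a get_source that would satisfy the call above, assuming it simply fetches the URL and returns the HTML body, or None on failure (the requests dependency and the timeout value are assumptions):

import requests


class Spider(object):

    @staticmethod
    def get_source(url, timeout=10):
        # fetch the page; return its body, or None on any failure
        try:
            r = requests.get(url, timeout=timeout)
            if r.status_code == 200:
                return r.text
        except requests.RequestException:
            pass
        return None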