print t, c, mu, sigma, (mu/sigma) train() collocation('trendy', min_count=1) #---------Use NLTK collocations-------------- import nltk import re import HTMLParser import datastores.datastore as d df = d.solr_data_frame('Beauty_Crawl_RSS_Feeds') h = HTMLParser.HTMLParser() for i,d in enumerate(documents): documents[i] = h.unescape(documents[i]).encode('utf-8') def tokenize(documents): for i,doc in enumerate(documents): if i % 100 == 0: print '%d of %d' % (i, len(documents)) for sent in nltk.sent_tokenize(doc.lower()): for word in nltk.word_tokenize(sent): yield word
__author__ = 'sriWork'

import datastores.datastore as ds

COLLECTION = 'Health_Crawl_RSS_Feeds'
# BUG FIX: the original list contained `'content,' 'pubDate_dt'` -- implicit
# string-literal concatenation fused the two into one bogus field name
# 'content,pubDate_dt', so neither `content` nor `pubDate_dt` was actually
# requested from solr. Split into two proper elements.
FIELDS = ['id', 'title', 'content', 'pubDate_dt', 'tags_s', 'lang', 'author']
QUERY = None
CACHE = False

# ---- Read solr data into a 'dataframe' ----
dataframe = ds.solr_data_frame(COLLECTION, FIELDS, QUERY, CACHE)
length_dataframe = len(dataframe)

# Count the English and Spanish documents.
# NOTE(review): the `lang` field is compared against a one-element unicode
# list (e.g. [u'en']) -- presumably that is the exact shape solr returns;
# verify against the datastore layer.
cnt_eng = 0
cnt_es = 0
for i in range(length_dataframe):
    if dataframe['lang'][i] == [u'en']:
        cnt_eng += 1
    elif dataframe['lang'][i] == [u'es']:
        cnt_es += 1