Example #1
File: lda.py  Project: jrwalk/empath
	def streamer():
		"""Yield one tokenized, stemmed comment at a time for the selected drug."""
		for text in texts(drug=drug):
			text = tokenize(text, drug=drug, pos_filter=False)  # list of tokens
			for i, word in enumerate(text):  # remap brand drug names
				remap = _drug_dict.get(word.upper(), None)
				if remap is not None:
					text[i] = remap.lower()
			text = [stemmer.stem(word) for word in text]  # stem each token
			yield text
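
The generator above is defined inside an enclosing function in lda.py, so drug, texts, tokenize, _drug_dict, and stemmer all come from the surrounding scope. Below is a minimal sketch of how such a streamer might be consumed to train a topic model with gensim; the use of gensim and the topic count are assumptions, since this snippet does not show how streamer is actually used.

# Sketch only: assumes gensim is installed and that streamer() (or an
# equivalent generator) is in scope; num_topics=10 is an illustrative value.
from gensim import corpora, models

docs = list(streamer())                             # materialize once; two passes needed
dictionary = corpora.Dictionary(docs)               # token -> integer id mapping
corpus = [dictionary.doc2bow(doc) for doc in docs]  # bag-of-words vector per comment
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=10)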
Example #2
File: word_count.py  Project: jrwalk/empath
# Imports needed by this snippet (inferred from the docstring and call sites);
# tokenize() is a project-local helper whose import is not shown here.
from nltk.probability import FreqDist

import drug_mentions as dm


def word_count(drug=None, limit=None, pos_filter=False, lemma=True):
	"""Scans comment texts (from drug_mentions.texts) for the selected drug
	and calculates the most common words.

	KWARGS:
		drug: string or None.
			Drug selector.  Allows three cases:
			* None: scrape all comments in the database, regardless of drug.
			* 'antidepressant': select comments speaking generically about
				the drug class, not referencing a specific drug.
			* [drug name]: comments referencing a specific drug.
			Default None.  Passed to drug_mentions.texts.
		limit: int or None.
			Optional limit on SQL queries retrieved by drug_mentions.texts.
			Defaults to None (returns all hits).
		pos_filter: boolean.
			Passed to tokenize(); set True to use part-of-speech filtering.
		lemma: boolean.
			Passed to tokenize(); set True to use lemmatization.

	RETURNS:
		freq: nltk.probability.FreqDist object.
			Frequency distribution of words from the comments.

	RAISES:
		ValueError:
			For an invalid drug name.
	"""
	try:
		texts = dm.texts(drug=drug, limit=limit)
	except ValueError:
		raise ValueError('Invalid drug name.')

	# Accumulate token counts across all retrieved comment texts.
	freq = FreqDist()
	for text in texts:
		freq.update(tokenize(text, drug, pos_filter=pos_filter, lemma=lemma))

	return freq
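
A short usage sketch of word_count; the drug name, limit, and top-20 cutoff below are illustrative values, not project defaults.

# Illustrative call: 'prozac' and limit=500 are example arguments only.
freq = word_count(drug='prozac', limit=500, pos_filter=True, lemma=True)
for word, count in freq.most_common(20):  # 20 most frequent tokens
	print("{}: {}".format(word, count))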