Python Stopwords.Stopwords Examples

Programming language: Python

Namespace/package name: stopwords

Class/type: Stopwords

Method/function: Stopwords

Examples at hotexamples.com: 2

Python Stopwords.Stopwords - 2 examples found. These are the top-rated real-world Python examples of stopwords.Stopwords.Stopwords, extracted from open source projects.

Frequently used methods

Stopwords(2)

removeStopWords(1)

stopwords(1)

Example #1

File: indexer.py Project: alexscott64/ics_search_engine

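import re

from bs4 import BeautifulSoup

# Assumption: Stopwords is the project's own class in its stopwords module.
from stopwords import Stopwords

# Match absolute http(s) URLs on the ics.uci.edu domain (dots escaped).
re_matcher = re.compile(r"^https?://.*ics\.uci\.edu")


def get_links(html):
    # Collect the href of every anchor whose URL matches re_matcher.
    links = []
    soup = BeautifulSoup(html, "html.parser")
    for link in soup.find_all('a', attrs={'href': re_matcher}):
        links.append(link.get('href'))
    return links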


def hasdigit(token):
    return any(c.isdigit() for c in token)


stopw = Stopwords()


def check_token(token):
    # Keep tokens that are not stopwords, contain no digits,
    # and are between 2 and 19 characters long.
    return (not stopw.is_stop(token) and not hasdigit(token)
            and 1 < len(token) < 20)


def add_token(token):
    pass


nonalphanum = re.compile("[^0-9a-z']")


def tokenize_text(intext):
    ...  # body truncated in the source listing
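
Example #1 only shows two things about Stopwords: it is constructed with no arguments, and it exposes is_stop(token). The class itself is not part of the listing, so the sketch below uses a hypothetical stand-in (the word list, the optional words parameter, and filter_tokens are assumptions made for illustration) to show how the token filter might behave end to end.

```python
class Stopwords:
    """Hypothetical stand-in; the real class in the project may differ."""

    # Assumed sample list; the real project presumably loads a fuller one.
    DEFAULT_WORDS = ("a", "an", "and", "the", "of", "to", "in", "is")

    def __init__(self, words=None):
        source = self.DEFAULT_WORDS if words is None else words
        self._words = frozenset(w.lower() for w in source)

    def is_stop(self, token):
        # Case-insensitive membership test.
        return token.lower() in self._words


def has_digit(token):
    return any(c.isdigit() for c in token)


def check_token(token, stopwords):
    # Same rule as in the listing: not a stopword, no digits, length 2..19.
    return (not stopwords.is_stop(token)
            and not has_digit(token)
            and 1 < len(token) < 20)


def filter_tokens(tokens, stopwords):
    # Hypothetical helper: keep only the tokens that pass check_token.
    return [t for t in tokens if check_token(t, stopwords)]


stopw = Stopwords()
kept = filter_tokens(
    ["the", "search", "engine", "a", "x", "web2", "index"], stopw)
print(kept)  # ['search', 'engine', 'index']
```

Passing the Stopwords instance as an argument, rather than reading a module-level global as the listing does, is a choice made here only to keep the sketch easy to test.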

Example #2

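import atexit
import os

from zope.component import provideUtility
from zope.component.interfaces import IFactory
from zope.component.testing import setUp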

from index import Index
from parsers.english import EnglishParser
from splitter import SplitterFactory
from stopwords import Stopwords
from zopyx.txng3.core.interfaces import IParser, IStopwords, IThesaurus
from zopyx.txng3.core.lexicon import LexiconFactory
from zopyx.txng3.core.storage import StorageWithTermFrequencyFactory
from zopyx.txng3.core.thesaurus import GermanThesaurus

# Setup environment
setUp()
provideUtility(SplitterFactory, IFactory, 'txng.splitters.default')
provideUtility(EnglishParser(), IParser, 'txng.parsers.en')
provideUtility(Stopwords(), IStopwords, 'txng.stopwords')
provideUtility(LexiconFactory, IFactory, 'txng.lexicons.default')
provideUtility(StorageWithTermFrequencyFactory, IFactory,
               'txng.storages.default')
provideUtility(GermanThesaurus, IThesaurus, 'txng.thesaurus.de')

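# Optional: keep interactive history across sessions when readline is available.
try:
    import readline
    histfile = os.path.expanduser('~/.pyhist')
    readline.read_history_file(histfile)
    atexit.register(readline.write_history_file, histfile)
except (ImportError, IOError):
    # readline is unavailable, or the history file does not exist yet.
    pass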


class Text:
    ...  # body truncated in the source listing
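
The provideUtility calls in Example #2 register each component under an interface and a name, so that other code can later look it up by that pair. The sketch below is not zope.component; it is a toy pure-Python registry (UtilityRegistry, provide_utility, get_utility, and the placeholder IStopwords and Stopwords classes are all hypothetical names) meant only to illustrate the registration and lookup idea.

```python
class UtilityRegistry:
    """Toy registry keyed by (interface, name); not zope.component itself."""

    def __init__(self):
        self._utilities = {}

    def provide_utility(self, component, provided, name=""):
        # A later registration under the same key replaces the earlier one.
        self._utilities[(provided, name)] = component

    def get_utility(self, provided, name=""):
        try:
            return self._utilities[(provided, name)]
        except KeyError:
            raise LookupError(
                "no utility for %r named %r" % (provided, name))


class IStopwords:
    """Hypothetical marker class standing in for a real interface."""


class Stopwords:
    """Hypothetical placeholder component."""


registry = UtilityRegistry()
stopwords = Stopwords()
registry.provide_utility(stopwords, IStopwords, "txng.stopwords")

# Lookup by the same (interface, name) pair returns the registered object.
found = registry.get_utility(IStopwords, "txng.stopwords")
print(found is stopwords)  # True
```

Keying on both the interface and the name is what lets the listing register several factories under IFactory without them colliding.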