Python Indexer.build_index Examples

Programming Language: Python

Namespace/Package Name: indexer

Class/Type: Indexer

Method/Function: build_index

Examples at hotexamples.com: 2

Python Indexer.build_index - 2 examples found. These are the top rated real world Python examples of indexer.Indexer.build_index extracted from open source projects. You can rate examples to help us improve the quality of examples.

Frequently Used Methods

Show Hide

add_new_doc(30)

Indexer(30)

create_index(6)

create_unigram_index(3)

calculate_idf(3)

LoadIndexes(3)

close(3)

dump(3)

coords_to_indices(2)

indices_to_coords(2)

calculationSummerize(2)

add_idf_to_dictionary(2)

add_document(2)

LoadDict(2)

fix_inverted_index(2)

finish(2)

evaluate_input(1)

execute(1)

create_save_indexer_with_relevant_docs(1)

entities_and_small_big(1)

directory(1)

delete_dict_after_saving(1)

create_indexer(1)

create_dirs(1)

create_bulk_index_string(1)

finish_index(1)

CreatInvertedIndex(1)

finish_indexing(1)

get_num_spatial_nodes(1)

tokenize(1)

set_idx_fields(1)

process(1)

keys(1)

isStopword(1)

ignore_extensions(1)

get__lda__(1)

fit(1)

getStemmed(1)

getOr(1)

getAnd(1)

get(1)

generate_local_index(1)

create_block(1)

generate_global_index(1)

compute_tf(1)

createIndex(1)

add_square_Wij(1)

bp_index(1)

batch_get_feat_stacked(1)

after_indexing(1)

Example #1

Show file

File: index.py Project: XJDKC/NUS-CS3245

def build_index(in_dir, out_dict, out_postings):
    """
    build index from documents stored in the input directory,
    then output the dictionary file and postings file
    """

    # initialize the class
    indexer = Indexer(out_dict, out_postings)
    indexer.build_index(in_dir)

    # save to file
    indexer.SavetoFile()

Example #2

Show file

File: searchengine_main.py Project: behrtam/simple_searchengine

SEED_URL = 'http://mysql12.f4.htw-berlin.de/crawl/'
SEED_PAGES = ('d01.html', 'd06.html', 'd08.html')

STOP_WORDS = ['d01', 'd02', 'd03', 'd04', 'd05', 'd06', 'd07', 'd08',  
'a', 'also', 'an', 'and', 'are', 'as', 'at', 'be', 'by', 'do',
'for', 'have', 'is', 'in', 'it', 'of', 'or', 'see', 'so',
'that', 'the', 'this', 'to', 'we']


crawler = Crawler([urljoin(SEED_URL, page) for page in SEED_PAGES])

page_rank = PageRank(crawler.webgraph_in, crawler.webgraph_out)
page_rank.build_graph()

index = Indexer(crawler.contents, STOP_WORDS)
index.build_index()

scorer = Scorer(index)

print("> SIMPLE SEARCH ENGINE (by Tammo, Tim & Flo)")

while True:
    scores = scorer.calculate_scores(input("\n> query: "))

    if not scores:
        print("your search term does not occur on any page")
        continue

    ranked_scores = [(url, score, page_rank.get_rank(url), score * page_rank.get_rank(url)) for url, score in scores.items()]
    
    print("\n               url | score  | rank   | rank * score\n" + "-" * 54)