Example #1
import sys

if __name__ == '__main__':
    # Corpus construction. NOTE: the original example begins mid-loop;
    # this loop head is reconstructed from the input-document code
    # below. `files` (the list of corpus file paths), `paragraph_tokenizer`,
    # `preprocessor`, `Paragraph`, `Document`, and `Indexer` are defined
    # elsewhere in the source project.
    documents = []
    vocabulary = set()
    for i, filename in enumerate(files):
        with open(filename, encoding="utf8", errors="ignore") as corpus_file:
            raw = corpus_file.read()
            paras = paragraph_tokenizer(raw)
            paragraphs = []
            for j, para in enumerate(paras):
                tokens = preprocessor(para)
                paragraphs.append(Paragraph((i, j), tokens))
                # Collect every distinct term into the vocabulary
                for term in tokens:
                    vocabulary.add(term)
            document = Document(i, paragraphs)
            documents.append(document)

    # Number of distinct terms in the corpus vocabulary
    vocabularyLength = len(vocabulary)
    # print(vocabularyLength)
    
    # Build the inverted index over the corpus documents
    indexer = Indexer(documents)

    # The document to check is passed as the first command-line argument
    inputDocument = sys.argv[1]
    # inputDocument = 'test.txt'
    with open(inputDocument, encoding="utf8", errors="ignore") as input_file:
        raw = input_file.read()
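    # Tokenize the input exactly as the corpus documents were tokenized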
    paras = paragraph_tokenizer(raw)
    paragraphs = []
    for j, para in enumerate(paras):
        tokens = preprocessor(para)
        _id = (-1, j)  # doc id -1 keeps the input distinct from every corpus document
        paragraph = Paragraph(_id, tokens)
        paragraphs.append(paragraph)
    input_doc = Document(-1, paragraphs)
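    # Rank the corpus against the input: top_k holds up to 10
    # (score, filename) pairs, uniqueness an overall percentage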
    top_k, uniqueness = indexer.evaluate_input(input_doc, files, 10)
    for score, filename in top_k:
        print(f'Document: {filename} Score: {score}')
    print(f'Uniqueness: {uniqueness} %')
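
The helpers used above (`paragraph_tokenizer`, `preprocessor`, `Paragraph`, `Document`, `Indexer`) are defined elsewhere in the source project. Below is a minimal sketch of plausible stand-ins, purely so the example can be run end to end: the names match the example, but the bodies are assumptions, not the project's actual implementation. In particular, the overlap scoring inside `evaluate_input` is a stand-in for whatever similarity measure the real `Indexer` uses.

# --- Hypothetical stand-ins (assumptions, not the project's real code) ---
import re
from collections import defaultdict

def paragraph_tokenizer(text):
    # Assumption: paragraphs are separated by blank lines.
    return [p for p in re.split(r'\n\s*\n', text) if p.strip()]

def preprocessor(paragraph):
    # Assumption: lowercase alphanumeric word tokens.
    return re.findall(r'[a-z0-9]+', paragraph.lower())

class Paragraph:
    def __init__(self, _id, tokens):
        self.id = _id
        self.tokens = tokens

class Document:
    def __init__(self, _id, paragraphs):
        self.id = _id
        self.paragraphs = paragraphs

class Indexer:
    # Toy inverted index: term -> set of corpus document ids.
    def __init__(self, documents):
        self.index = defaultdict(set)
        for doc in documents:
            for para in doc.paragraphs:
                for term in para.tokens:
                    self.index[term].add(doc.id)

    def evaluate_input(self, input_doc, files, k):
        # Crude overlap score: the fraction of the input's distinct
        # terms that each corpus document shares with it.
        input_terms = {t for p in input_doc.paragraphs for t in p.tokens}
        if not input_terms:
            return [], 100.0
        counts = defaultdict(int)
        for term in input_terms:
            for doc_id in self.index[term]:
                counts[doc_id] += 1
        ranked = sorted(((n / len(input_terms), files[doc_id])
                         for doc_id, n in counts.items()), reverse=True)
        top_k = ranked[:k]
        # Uniqueness: how much of the input is NOT covered by its best match.
        uniqueness = round((1 - top_k[0][0]) * 100, 2) if top_k else 100.0
        return top_k, uniqueness

With stand-ins like these in place, the example runs as a plain script, e.g. `python example.py test.txt` (test.txt being the sample name left commented out above), printing the ten closest corpus documents and a uniqueness percentage. A `files` list of corpus paths still has to be populated before the main block, as the example assumes.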