def test_faiss_index_save_and_load(tmp_path):
    """Round-trip a FAISS index through save()/load() and check nothing is lost."""
    # NOTE(review): four slashes plus an absolute tmp_path yields five slashes in
    # the URL; SQLAlchemy appears to tolerate this — confirm before changing.
    db_url = f"sqlite:////{tmp_path/'haystack_test.db'}"
    store = FAISSDocumentStore(sql_url=db_url, index="haystack_test")
    store.write_documents(DOCUMENTS)

    # Persist the vector index to disk, then wipe the in-memory copy so the
    # reload below is the only possible source of the vectors.
    index_path = tmp_path / "haystack_test_faiss"
    store.save(index_path)
    store.faiss_indexes[store.index].reset()
    assert store.faiss_indexes[store.index].ntotal == 0

    # Reload and verify both the FAISS vectors and the SQL-backed documents.
    restored = FAISSDocumentStore.load(
        sql_url=db_url,
        faiss_file_path=index_path,
        index=store.index,
    )
    assert restored.faiss_indexes[store.index].ntotal == len(DOCUMENTS)
    assert len(restored.get_all_documents()) == len(DOCUMENTS)
# Retriever options kept from the tutorial for reference:
#   max_seq_len_passage=256, batch_size=16, use_gpu=True,
#   embed_title=True, use_fast_tokenizers=True

# Load the FAQ sheet: columns "question", "answer", plus custom metadata.
df = pd.read_csv("faq.csv")

# Minimal cleaning — blank out NaNs and trim whitespace around the questions.
df.fillna(value="", inplace=True)
df["question"] = df["question"].apply(lambda q: q.strip())
print(df.head())

# Optional pre-computation of question embeddings (disabled):
#   questions = list(df["question"].values)
#   df["question_emb"] = retriever2.embed_queries(texts=questions)

# The document store embeds whatever sits in the "text" field, so rename.
df = df.rename(columns={"question": "text"})

# Re-index: drop any existing documents, then write the fresh FAQ rows.
docs_to_index = df.to_dict(orient="records")
document_store.delete_all_documents()
document_store.write_documents(docs_to_index)

# Build the retriever from the locally saved model and compute embeddings
# for every stored document, then persist the FAISS index to disk.
retriever2 = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence_bert-saved",
    use_gpu=False,
)
document_store.update_embeddings(retriever2)
document_store.save('faiss2')