LasseKohlmeyer/ma-doc-embeddings

lib2vec - A Multi-Faceted Document Embedding Approach

This repository contains code to compute the multi-faceted document embeddings of lib2vec (often called book2vec within this code base). It also contains a range of experiments, scripts for generating tables and plots, and approaches that were not pursued further.

All configuration should be registered in config.json. The three most important methods are:

  • EvaluationUtils.build_corpora(...): creates Python representations of the given corpora and applies filters. For self-defined corpora, appropriate parsing methods must be written first.

  • EvaluationUtils.train_vecs(...): creates vector representations, for example lib2vec embeddings, for the given corpus names.

  • EvaluationUtils.run_evaluation(...): runs the evaluation on the similarity tasks.
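
The three steps above follow a common build, train, evaluate pattern. The sketch below illustrates that flow with toy stand-ins only. The functions toy_build_corpora, toy_train_vecs and toy_run_evaluation, their parameters, and the sample documents are all hypothetical and are not the repository's API. The real EvaluationUtils signatures are elided in this README and should be looked up in the code. Bag-of-words counts stand in for learned embeddings, and a cosine ranking stands in for the similarity tasks.

```python
import math
from collections import Counter

# Toy stand-ins for the build -> train -> evaluate flow.
# None of these functions belong to the repository; names and
# parameters are assumptions made for illustration.


def toy_build_corpora(raw_docs, min_tokens=3):
    """Tokenize documents and filter out those shorter than min_tokens."""
    corpus = {}
    for doc_id, text in raw_docs.items():
        tokens = text.lower().split()
        if len(tokens) >= min_tokens:
            corpus[doc_id] = tokens
    return corpus


def toy_train_vecs(corpus):
    """Use bag-of-words counts as a placeholder for learned embeddings."""
    return {doc_id: Counter(tokens) for doc_id, tokens in corpus.items()}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_run_evaluation(vectors, query_id):
    """Rank all other documents by similarity to the query document."""
    query = vectors[query_id]
    scores = [
        (other_id, cosine(query, vec))
        for other_id, vec in vectors.items()
        if other_id != query_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data.
raw_docs = {
    "book_a": "the whale hunts the sea",
    "book_b": "the sea and the whale",
    "book_c": "a garden in spring blooms",
    "too_short": "tiny text",
}

corpus = toy_build_corpora(raw_docs)      # step 1: parse and filter
vectors = toy_train_vecs(corpus)          # step 2: vectorize
ranking = toy_run_evaluation(vectors, "book_a")  # step 3: evaluate
print(ranking)
```

In this toy run the filter drops the two-token document, and the document sharing the most vocabulary with the query is ranked first. The repository's methods operate on configured corpus names and trained embedding models rather than in-memory dictionaries.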

The d3 directory contains an exploratory visualization of various facets of lib2vec. The necessary file is generated by experiments/embedding_porjection.py.

experiments/book_comparison.py contains experiments for the book comparison task.

experiments/predicting_high_rated_books contains the evaluation for the scenario of predicting highly rated books.

The Book Comparison Survey is analyzed and converted to a dataset by boco_survey/survey_analyses.py.

Abstract Image of Multiple Facets
