This repository has been archived by the owner on Mar 17, 2023. It is now read-only.

javiergarea/inquiry

Inquiry is a search engine for electronic preprints.

For now, you can access papers in the fields of mathematics, physics, computer science and statistics. These papers are retrieved from the arXiv repository.

Prerequisites

In order to use Inquiry, we assume you have met the following requirements:

  • python>=3.x
  • elasticsearch==7.4.0
  • gcc
  • Poppler cpp lib
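
A quick way to confirm most of these requirements before installing is sketched below. It is not part of Inquiry: the function name check_prerequisites is hypothetical, and the check for Poppler assumes that pkg-config is installed and that your Poppler package registers itself under the module name poppler-cpp. Elasticsearch is not covered here.

```python
import shutil
import subprocess
import sys
from typing import Dict, Optional


def check_prerequisites() -> Dict[str, Optional[bool]]:
    """Map each prerequisite to True (found), False (missing) or None (unknown)."""
    results: Dict[str, Optional[bool]] = {
        # The README only asks for Python 3.x, so the major version is enough.
        "python3": sys.version_info >= (3,),
        # gcc must be on PATH to build the native dependencies.
        "gcc": shutil.which("gcc") is not None,
    }

    pkg_config = shutil.which("pkg-config")
    if pkg_config is None:
        # Assumption: without pkg-config we cannot tell whether Poppler is there.
        results["poppler-cpp"] = None
    else:
        # "pkg-config --exists NAME" exits with 0 when the module is installed.
        found = subprocess.run([pkg_config, "--exists", "poppler-cpp"], check=False)
        results["poppler-cpp"] = found.returncode == 0
    return results
```

Printing the returned dictionary shows at a glance which requirement still needs attention; a None value means the check could not be performed, not that the library is missing.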

Installing Inquiry

To install Inquiry, follow these steps:

  • Clone this repository:

    $ git clone https://github.com/javiergarea/inquiry.git
    
  • Run the following command to install the project dependencies:

    $ pip3 install -r requirements.txt
    

    If something goes wrong during this step, make sure that pip, gcc and the Poppler C++ library are installed.

Running Inquiry

  1. Run the arXiv spider in order to crawl the documents:

    $ scrapy crawl arxiv
    

    This should generate an items.jsonl file in the root directory.

  2. Start the Elasticsearch service:

    $ elasticsearch
    

    Check that it is running properly with the command curl localhost:9200.

  3. Index the crawled data in Elasticsearch:

    $ python3 elastic_manage.py -i items.jsonl
    
  4. Run the Inquiry service:

    $ python3 manage.py runserver
    
  5. Open localhost:8000 in your browser and perform your queries.
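
Before indexing (step 3), it can help to verify the results of steps 1 and 2 from Python. The sketch below is only an illustration and is not taken from the Inquiry code base: iter_items and elasticsearch_is_up are hypothetical helper names, and the only assumptions are that items.jsonl holds one JSON object per line and that Elasticsearch answers on its root endpoint with a JSON object containing a version field, as curl localhost:9200 shows.

```python
import http.client
import json
import urllib.request
from typing import Iterable, Iterator


def iter_items(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines source."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number} is not valid JSON: {exc}") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {number} is not a JSON object")
        yield record


def elasticsearch_is_up(url: str = "http://localhost:9200", timeout: float = 3.0) -> bool:
    """Return True if the root endpoint answers with JSON that has a version field."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            info = json.load(response)
    except (OSError, ValueError, http.client.HTTPException):
        # Refused connections and timeouts are OSError subclasses;
        # a body that is not JSON raises ValueError.
        return False
    return isinstance(info, dict) and "version" in info
```

For example, opening items.jsonl with open("items.jsonl", encoding="utf-8") and evaluating sum(1 for _ in iter_items(handle)) gives the number of crawled documents, and elasticsearch_is_up() should return True once the service from step 2 is running.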

Documentation

Inquiry is an Information Retrieval project, developed as part of the MSc in Computer Science at Universidade da Coruña. The software is accompanied by a technical document that details its development. This document is also available in a web version.

Authors

Javier Garea - javier.garea@udc.es

Martín Sande - martin.sande@udc.es

License

This project uses the following license: MIT.
