
jirian/text_summarizer_czech

 
 


text_summarizer_czech

This project implements three article summarizers for the Czech language:

  1. Feature-based summarizer with RBM

    • file: summarizer_features_rbm.py
    • based on this article: https://arxiv.org/pdf/1708.04439.pdf
    • combines several features extracted from the text and uses a restricted Boltzmann machine (RBM) to enrich them
  2. Matrix-based TextRank summarizer

  3. Graph-based TextRank summarizer
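To illustrate the general idea behind a graph-based TextRank summarizer, here is a minimal sketch in pure Python. It is not the code from this repository: the function names (`split_sentences`, `similarity`, `textrank`, `summarize`), the naive sentence splitter, and the word-overlap similarity are all assumptions chosen to keep the example short. Sentences are graph nodes, edge weights are word-overlap similarities, and scores come from a PageRank-style iteration.

```python
import math
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    # Word overlap normalized by the log of the vocabulary sizes,
    # in the style of the similarity used by TextRank.
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    n = len(sentences)
    # Weighted adjacency matrix of the sentence graph (no self-loops).
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # PageRank-style update: each sentence collects score from its neighbours.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(text, count=2):
    sentences = split_sentences(text)
    scores = textrank(sentences)
    # Pick the highest-scoring sentences, then restore document order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:count]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(text, count=2)` returns the two highest-ranked sentences in their original order. A real Czech summarizer would likely add stop-word removal and lemmatization before computing similarities; those steps are left out here.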

Usage

Python 3.6 or later is required to run the scripts.

Install requirements:

pip install -r requirements.txt

Run any of the three summarizers without arguments to see how it performs on a test set of several articles.

Run any of the three summarizers with one argument, the path to a file, to print a summary of the text in that file. The file must contain plain text only (no XML or other markup).

For example:

python summarizer_nlphackers_textrank.py my_article.txt
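The two modes described above (no argument versus one file path) could be handled along the following lines. This is only a sketch of the pattern, not the scripts' actual code: `read_article` is a hypothetical helper, and UTF-8 is assumed as the file encoding.

```python
import sys


def read_article(argv):
    """Return the article text if a path was given, otherwise None."""
    if len(argv) < 2:
        return None  # no argument: the caller falls back to its test set
    # Assumption: the input file is UTF-8 encoded plain text.
    with open(argv[1], encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    text = read_article(sys.argv)
    if text is None:
        print("No file given; a summarizer would run on its test set here.")
    else:
        print("Read %d characters to summarize." % len(text))
```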
