
KonText

Introduction

This program started as a fork of the Bonito 2.68 web interface to the corpus management tool Manatee. It is maintained by the Institute of the Czech National Corpus. The current version contains all the key features of Bonito 2.98.3 (primarily support for parallel corpora).

Features

  • code-level changes
    • rewritten as a WSGI application (Bonito2 is CGI-based)
    • modular code design with dynamically loadable plug-ins that provide custom implementations of individual functions
    • background concordance calculation based on Python's high-level multiprocessing package
    • completely rewritten client-side code (AMD modules, code separated from templates)
    • improved logging, error processing and debugging support
    • improved code documentation
  • new features
    • added support for spoken corpora - defined segments can be played back as audio
    • persistent links for large queries - you can send a link to someone even if the query is megabytes in size
    • access to previous queries
    • interactive subcorpus selection - you can select text types and see how the available values of other attributes change
    • interactive PoS tag tool - for positional PoS tag formats, an interactive tool can be used to write tag queries
    • a concordance can be saved in Excel format (xlsx)
  • enhanced user interface
    • improved user interface and design
    • extended corpora information (size, structures, attributes, citation information)
    • concordance results also include the Average Reduced Frequency
    • a subcorpus can be created by a custom CQL expression
    • on the multilevel frequency distribution page, the starting word can be specified for multi-word KWICs
    • result shuffling can be pre-set
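
The Average Reduced Frequency (ARF) mentioned in the list above may be unfamiliar, so here is a minimal sketch of its commonly published definition: ARF = (1/v) * sum(min(d_i, v)), where v = N/f is the corpus size divided by the word's frequency and d_i are the distances between consecutive occurrences, with the corpus treated as cyclic. The function name `average_reduced_frequency` and its parameters are hypothetical and chosen for illustration only; this is not code from KonText or Manatee, which compute the value internally.

```python
def average_reduced_frequency(positions, corpus_size):
    """Return the ARF of a word, given its 0-based token positions.

    Illustrative sketch only; the name and signature are assumptions.
    """
    freq = len(positions)
    if freq == 0:
        return 0.0
    ordered = sorted(positions)
    # v = N / f: the average gap if occurrences were spread evenly.
    segment = float(corpus_size) / freq
    # First distance wraps around from the last occurrence to the first one.
    distances = [ordered[0] + corpus_size - ordered[-1]]
    distances.extend(b - a for a, b in zip(ordered, ordered[1:]))
    # Each gap contributes at most v, so clustered hits count for less.
    return sum(min(d, segment) for d in distances) / segment


print(average_reduced_frequency([0, 5], 10))  # evenly spread: 2.0
print(average_reduced_frequency([0, 1], 10))  # clustered: 1.2
```

For evenly distributed occurrences the ARF equals the plain frequency; the more the occurrences cluster in one part of the corpus, the closer it drops towards 1.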

Requirements

  • a WSGI-compatible server
  • Python 2.6 or 2.7
    • lxml library
    • werkzeug library (provides WSGI middleware)
    • PyICU library (optional but preferred)
    • markdown library (optional, for formatted corpora references)
    • openpyxl library (optional, for XLSX export)
  • corpus search engine Manatee, version 2.83.x (2.107.1 is not supported yet)
  • a key-value store
    • any custom implementation (Redis and SQLite backends are available by default)
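
To give a rough idea of what a custom key-value backend involves, the following sketch stores JSON-serialized values in SQLite using only the Python standard library. The class name `SQLiteKeyValueStore`, its `get`/`set` methods and the table layout are assumptions made for illustration; the actual plug-in interface expected by KonText is not described in this README and may differ.

```python
import json
import sqlite3


class SQLiteKeyValueStore(object):
    """Hypothetical minimal key-value store backed by SQLite.

    This is a sketch, not KonText's real plug-in API.
    """

    def __init__(self, path=':memory:'):
        # ':memory:' keeps the data in RAM; pass a file path to persist it.
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            'CREATE TABLE IF NOT EXISTS data (key TEXT PRIMARY KEY, value TEXT)')

    def set(self, key, value):
        # Values are serialized as JSON so that lists and dicts can be stored.
        self._conn.execute(
            'INSERT OR REPLACE INTO data (key, value) VALUES (?, ?)',
            (key, json.dumps(value)))
        self._conn.commit()

    def get(self, key, default=None):
        # Return the deserialized value, or `default` if the key is absent.
        row = self._conn.execute(
            'SELECT value FROM data WHERE key = ?', (key,)).fetchone()
        return json.loads(row[0]) if row else default


store = SQLiteKeyValueStore()
store.set('example:key', {'corpus': 'demo', 'size': 3})
print(store.get('example:key'))
```

A Redis-based variant would expose the same two operations and only swap the storage calls, which is the usual reason for hiding the backend behind a small interface like this.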

Installation

Please refer to the INSTALL.md file.

Customization and contribution

Please refer to the DEVELOPMENT.md file.
