Software for preprocessing textual data in multiple languages for textual analysis.

ChristopherLucas/txtorg

txtorg

txtorg is a Python-based utility that uses Apache Lucene to preprocess and manage text. It outputs processed text in a variety of formats for use in a wide range of analytical software, including (but not limited to) the structural topic model. It scales to large corpora and provides a graphical user interface. Because it builds on Lucene, txtorg can support a wide range of languages.

For more information, including installation instructions, see http://txtorg.org/.
