teco

The tool that finds N most frequently occurring terms in all files containing text.

A term is a sequence of words (made of alphanumeric characters, as defined by the isalpha() function) separated by spaces.

A text is a sequence of words separated by non-alphanumeric characters.

A term occurs in a text if all of its words are found in the text in the same sequence separated by any number of space-like characters (as defined by the isspace() function), but not paragraph breaks (a newline followed by another newline, "\n\n"). The comparison of words is case-insensitive.

If several different terms overlap in a text, an occurrence is counted for all of the overlapping terms. However, a term that overlaps with itself should not be double-counted.

For example, the term "Computer science" occurs in a text "computer science", "computer science" and "computer\nscience" but neither in "computer, science" nor in "computer\n\nscience".

Installation

$ pip install git+https://github.com/AlexLisovoy/teco.git

Usage

$ teco path_to_terms.txx folder_with_text

Run tests

Use:

$ make test

to run tests. The command at first will run the flake8 tool.

On flake8 success the tests will be run.

Tests coverage

Use:

$ make cov

to run tests and collect coverage information. Once the command finishes it will output link to index page.

License

teco is offered under the MIT license.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
teco		teco
.coveragerc		.coveragerc
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
requirements-dev.txt		requirements-dev.txt
setup.py		setup.py
tests.py		tests.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

teco

teco

.coveragerc

.coveragerc

.gitignore

.gitignore

LICENSE

LICENSE

Makefile

Makefile

README.md

README.md

requirements-dev.txt

requirements-dev.txt

setup.py

setup.py

tests.py

tests.py

Repository files navigation

teco

Installation

Usage

Run tests

Tests coverage

License

About

Releases

Packages

Languages

License

AlexLisovoy/teco

Folders and files

Latest commit

History

Repository files navigation

teco

Installation

Usage

Run tests

Tests coverage

License

About

Resources

License

Stars

Watchers

Forks

Languages