Text Analyzer

Extracts data from ElasticSearch.
The pipeline runs multiple highly configurable steps to analyze the data.
Post-processing applies the analysis to ElasticSearch documents and updates ElasticSearch accordingly.

Launches with TextAnalyzerLaunch.py
Reads process instructions from resources/mermtools.ini
Extracts data from ElasticSearch (ES) and converts it to a Pandas DataFrame.
The DataFrame enters the processing pipeline.
The pipeline prepares data for analysis (e.g., tokenization and lemmatization). Each function in the pipeline is performed in its own class. Classes with related functions may be in the same script.
The pipeline performs analyses (e.g., TF-IDF, LDA, k-means).
The post-pipeline stage processes the analysis results and reorganizes ElasticSearch data according to the results.
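
The flow above can be illustrated with a minimal sketch. It is not the project's actual code: the `[pipeline]` section, the `steps` key, the class names `Tokenize` and `CountTokens`, the `perform` method, and the helper functions are all hypothetical, chosen only to show how an .ini file might select one-class-per-step operations on a DataFrame. The hits are hand-written dictionaries in the shape of ElasticSearch search results, so no ES connection is needed, and the token count is a simple stand-in for a real analysis such as TF-IDF.

```python
import configparser

import pandas as pd

# Hypothetical configuration; the real keys in the project's .ini file
# are not shown in this README.
EXAMPLE_INI = """
[pipeline]
steps = Tokenize, CountTokens
"""


class Tokenize:
    """Preparation step (assumed name): lowercase the text and split on whitespace."""

    def perform(self, df):
        df = df.copy()
        df["tokens"] = df["text"].str.lower().str.split()
        return df


class CountTokens:
    """Stand-in for an analysis step (assumed name): count tokens per document."""

    def perform(self, df):
        df = df.copy()
        df["token_count"] = df["tokens"].str.len()
        return df


# Maps step names used in the config to the classes that implement them.
STEP_REGISTRY = {"Tokenize": Tokenize, "CountTokens": CountTokens}


def hits_to_dataframe(hits):
    """Flatten ES-style hits (dicts with _id and _source) into a DataFrame."""
    rows = [{"id": hit["_id"], **hit["_source"]} for hit in hits]
    return pd.DataFrame(rows)


def run_pipeline(df, ini_text):
    """Run the steps named in the config, in order, each on the previous result."""
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    names = [name.strip() for name in config["pipeline"]["steps"].split(",")]
    for name in names:
        df = STEP_REGISTRY[name]().perform(df)
    return df


# Hand-written sample data in the shape of ES search hits.
hits = [
    {"_id": "1", "_source": {"text": "Pipelines process text"}},
    {"_id": "2", "_source": {"text": "Each step is its own class"}},
]
result = run_pipeline(hits_to_dataframe(hits), EXAMPLE_INI)
print(result[["id", "token_count"]])
```

Keeping each step behind the same `perform(df)` interface means the order and selection of steps can be changed in the configuration without touching the step classes themselves.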

This project is dockerized.


amerywu/textanalyzer1