To get the matrix of frequencies of every linguistic parameter for every text in the corpus (`corpus_3.txt`), run `md_analyser.py`. The program uses a copy of the corpus that has already been morphologically annotated with RFTagger (`processed_corpus_3.xml`). For further experiments, you can also use the table with the corpus annotated in the system of the FTDs (`annotation_corpus3.csv`).
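To illustrate what a matrix of feature frequencies per text looks like, here is a minimal sketch. It is not the code in `md_analyser.py`: the function name `frequency_matrix`, the toy tag lists, and the feature names are all hypothetical. It also assumes each feature is a single tag that can be counted directly, which may not hold for the real parameters.

```python
from collections import Counter


def frequency_matrix(tagged_texts, features):
    """Return one row per text holding the relative frequency of each feature.

    tagged_texts: list of texts, each a list of tag strings (hypothetical format).
    features: ordered list of tags that define the matrix columns.
    """
    matrix = []
    for tags in tagged_texts:
        counts = Counter(tags)
        total = len(tags) or 1  # avoid division by zero for an empty text
        matrix.append([counts[feature] / total for feature in features])
    return matrix


# Toy data standing in for morphologically annotated texts (not real corpus data).
texts = [
    ["N", "V", "N", "ADJ"],
    ["V", "V", "PRO", "N"],
]
features = ["N", "V", "ADJ", "PRO"]

matrix = frequency_matrix(texts, features)
for row in matrix:
    print(row)
```

Each row corresponds to one text and each column to one feature. Dividing by text length keeps texts of different sizes comparable, which is a design choice of this sketch rather than something the source specifies.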
Linguistic feature extractor for Russian text annotated with RFTagger. Extracted features were used for genre classification.
Askinkaty/MDRus_analyser