ifrost/droopy

Description

Droopy is a small library for text analysis. Analysis is done through text processors, which come in two kinds:

  • attributes (parameterless; calculated once, then the value is cached)
  • operators (calculated on every call; can take parameters)

For example, a text could have a processor named "nof_words", an attribute that returns the number of words in the text. A text could also have a processor named "syllables_in_word(word)", an operator for which the user must provide the word whose syllables are counted.
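The difference between the two kinds of processors can be illustrated without Droopy at all. The following is a minimal sketch, not the library's implementation: ToyText, count_char and attr_calls are hypothetical names invented for this illustration, and only nof_words is borrowed from the description above.

```python
class ToyText(object):
    """Hypothetical stand-in for a processed text; not part of Droopy."""

    def __init__(self, text):
        self.text = text
        self._cache = {}      # attribute name -> cached value
        self.attr_calls = 0   # counts real computations, for illustration only

    @property
    def nof_words(self):
        # Attribute: parameterless, computed on first access, then cached.
        if "nof_words" not in self._cache:
            self.attr_calls += 1
            self._cache["nof_words"] = len(self.text.split())
        return self._cache["nof_words"]

    def count_char(self, char):
        # Operator: takes a parameter and is recomputed on every call.
        return self.text.count(char)


toy = ToyText("Just a simple test")
first = toy.nof_words           # computed here
second = toy.nof_words          # served from the cache
t_count = toy.count_char("t")   # computed on each call
```

Reading nof_words twice triggers only one computation, while count_char does its work every time it is called, which is why only operators can sensibly accept arguments.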

Basic usage

Droopy comes with some processors included. Processors are grouped into bundles. A bundle is just a set of processors (attributes and operators). Bundles and the text to be processed are linked together in a Droopy object.

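from droopy import Droopy

# BundleOne, BundleTwo and some_value are placeholders, not real names
text = Droopy("Just a simple test")
text.add_bundles(BundleOne(), BundleTwo())
print(text.bundle_one_attribute)             # attribute: read without arguments
print(text.bundle_two_operator(some_value))  # operator: called with arguments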

Droopy Library provides some built-in bundles that you can use.

Built-in bundles

  • Basic Bundle (basic.py)
  • Static Analysis Bundle (static.py)
  • Readability Bundle (readability.py)
  • Filters Bundle (filters.py)
  • Language Bundles Pack (lang module)
    • Polish Language Bundle (lang/polish.py)
    • English Language Bundle (lang/english.py)

The Basic Bundle provides common processors such as alphanumeric characters and digits (a Language Bundle is required).

Language bundles provide basic information about vowels, consonants, syllable counters, etc.

The Static Analysis Bundle provides processors for retrieving information about characters, words and sentences in the processed text.

The Filters Bundle is responsible for some cleaning, e.g. replacing emoticons in text.

The Readability Bundle contains processors implementing some readability algorithms (read more on Wikipedia).

Each processor can use any other processor from the same or another bundle in its calculations. A processor is calculated only if it is used.
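One way such on-demand, cross-bundle evaluation could work is sketched below. This is a toy model under stated assumptions, not Droopy's actual code: ToyContainer, WordsBundle, AverageBundle and the log list are all hypothetical names, and the lookup through __getattr__ is just one possible technique.

```python
class WordsBundle(object):
    """Hypothetical bundle; each processor receives the container as `toy`."""

    def nof_words(self, toy):
        toy.log.append("nof_words")
        return len(toy.text.split())


class AverageBundle(object):
    """Hypothetical bundle that relies on a processor from WordsBundle."""

    def avg_word_length(self, toy):
        toy.log.append("avg_word_length")
        letters = sum(len(word) for word in toy.text.split())
        return letters / float(toy.nof_words)  # resolved from the other bundle

    def never_used(self, toy):
        toy.log.append("never_used")
        return None


class ToyContainer(object):
    """Toy stand-in for a Droopy-like object; not the library's code."""

    def __init__(self, text, *bundles):
        self.text = text
        self.log = []         # records which processors actually ran
        self._bundles = bundles
        self._cache = {}

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for processor names.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._cache:
            for bundle in self._bundles:
                processor = getattr(bundle, name, None)
                if callable(processor):
                    self._cache[name] = processor(self)
                    break
            else:
                raise AttributeError(name)
        return self._cache[name]


toy = ToyContainer("Just a simple test", WordsBundle(), AverageBundle())
average = toy.avg_word_length
```

After this access, toy.log holds only the two processors that were needed, in the order they started running; never_used was not evaluated because nothing asked for it.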

Example usage of built-in bundles:

from droopy import Droopy
from droopy.static import Static
from droopy.lang.english import English

text = Droopy("Just a simple test")
text.add_bundles(Static(), English())

print(text.nof_words)      # print number of words
print(text.nof_syllables)  # print number of syllables

If you wish to use all available bundles, you can use DroopyFactory (factory.py). All you have to do is provide a language bundle (or write a language detector bundle):

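from droopy.factory import DroopyFactory
from droopy.lang.english import English

# the language bundle is passed in; the factory adds the remaining bundles
text = DroopyFactory.create_full_droopy("Just a simple test.", English())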

See also: examples\basic.py

Custom bundles

You can create custom bundles. To do this, create a class (a plain object subclass is enough) and add processors as methods. Decorate these methods with op (for operators) or attr (for attributes). Every method must accept self and a droopy parameter. Through the droopy parameter you can access any other processor.

from droopy import attr, op

class MyBundle(object):

    @attr
    def my_attribute(self, droopy):
        return droopy.nof_words * 2 # refer to Static Bundle processors

    @op
    def my_operator(self, droopy, some_value):
        return droopy.my_attribute * some_value

After this you can use your bundle:

from droopy import Droopy
from mybundles import MyBundle
from droopy.static import Static
from droopy.lang.english import English

text = Droopy("Just a simple test")
# Static is added because my_attribute reads nof_words from the Static Bundle
text.add_bundles(MyBundle(), Static(), English())

print(text.my_attribute)    # custom attribute
print(text.my_operator(2))  # custom operator

See also: examples\custom_bundle.py
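To make the role of the decorators more concrete, here is a rough sketch of how decorators of this kind could tag bundle methods so that a container can tell attributes from operators. It is an assumption for illustration only and does not reproduce how Droopy's attr and op are written: toy_attr, toy_op, processor_kind, ToyBundle and list_processors are hypothetical names.

```python
def toy_attr(func):
    # Hypothetical decorator: tags the method as a parameterless attribute.
    func.processor_kind = "attribute"
    return func


def toy_op(func):
    # Hypothetical decorator: tags the method as an operator.
    func.processor_kind = "operator"
    return func


class ToyBundle(object):
    """Hypothetical bundle whose methods take the raw text directly."""

    @toy_attr
    def doubled_words(self, text):
        return len(text.split()) * 2

    @toy_op
    def repeat(self, text, times):
        return text * times

    def helper(self):
        # Undecorated methods are not treated as processors.
        return None


def list_processors(bundle):
    """Return {name: kind} for every tagged method of a bundle."""
    found = {}
    for name in dir(bundle):
        member = getattr(bundle, name)
        kind = getattr(member, "processor_kind", None)
        if kind is not None:
            found[name] = kind
    return found


processors = list_processors(ToyBundle())
```

Here processors maps doubled_words to "attribute" and repeat to "operator", while helper is skipped. A container could use such a mapping to decide which results to cache and which to recompute on each call.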

About

Python text analysis library
