Nifty

Nifty is a library of utility functions and classes that simplify various common tasks in Python programming - a handy add-on to the standard libraries that makes Python even easier to use. In addition, Nifty contains a number of advanced tools for web scraping, data processing and data mining. Brought to you by Marcin Wojnarski (Twitter, LinkedIn). Licensed under the GPL.

is...() dynamic type checking: isstring, isint, isnumber, islist, istuple, isdict, istype, isfunction, isiterable, isgenerator, ...
classes and types inspection: classname, issubclass, baseclasses, subclasses, types
objects, generic types with extended interface: Object, NoneObject, ObjDict
collections: unique, flatten, list2str, obj2dict, dict2obj, subdict, splitkeys, lowerkeys, getattrs, setattrs, copyattrs, setdefaults, Heap
strings and text: merge_spaces, ascii, prefix, indent
JSON encoding & serialization of arbitrary objects: JsonObjEncoder, dumpjson, printjson, JsonDict
numbers: minmax, percent, bound, divup, noise, mnoise, parseint
date & time: Timer, now, nowString, utcnow, timestamp, asdatetime, convertTimezone, secondsBetween (minutes, hours, ...), secondsSince (minutes, hours, ...)
files: fileexists, normpath, filesize, filetime, filectime, filedatetime, readfile, writefile, Tee
file folders: normdir, listdir, listdirs, listfiles, findfiles, ifindfiles
concurrency: Lock, NoneLock
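
To give a feel for the kind of small helpers listed above, here is a standalone sketch of four of them. It does not import Nifty. The function names (`dedupe_keep_order`, `flatten_nested`, `ceil_div`, `clamp`) are hypothetical stand-ins, and the behaviour is only a guess based on the names unique, flatten, divup and bound; the actual signatures and semantics in Nifty may differ, so check the pydocs.

```python
def dedupe_keep_order(items):
    """Remove duplicates from a sequence, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def flatten_nested(nested):
    """Recursively flatten nested lists and tuples into one flat list."""
    flat = []
    for item in nested:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten_nested(item))
        else:
            flat.append(item)
    return flat


def ceil_div(x, y):
    """Integer division rounded up, assuming y > 0."""
    return -(-x // y)


def clamp(value, lower, upper):
    """Restrict value to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


if __name__ == "__main__":
    print(dedupe_keep_order([3, 1, 3, 2, 1]))
    print(flatten_nested([1, [2, (3, 4)], 5]))
    print(ceil_div(7, 2), clamp(15, 0, 10))
```

Each helper is a few lines of plain Python; a library like Nifty is useful because it collects such routines in one tested place.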

Text processing routines in nifty.text:

Levenshtein distance: levenshtein, levendist, levenscore
Bag-of-words model with TF-IDF weights: WordsModel
N-grams: ngrams
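
For readers unfamiliar with the techniques named above, the following standalone sketch shows what Levenshtein distance and character n-grams compute. It is not the code in nifty.text: `edit_distance` and `char_ngrams` are hypothetical names, and Nifty's own levenshtein, levendist, levenscore and ngrams may take different arguments or return different types.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def char_ngrams(text, n):
    """All contiguous substrings of length n, in order of appearance."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(char_ngrams("nifty", 2))
```

The distance routine keeps only two rows of the dynamic-programming table, so memory use is proportional to the length of the second string.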

Web scraping tools in nifty.redex:

Redex patterns - a new language for extracting data from any markup document. Similar in spirit and structure to regular expressions (regex), but better suited to searching in large tagged documents. Bridges the gap between regexes and XPaths as used in web scraping. Combines the consistency and compactness of regexes (a single pattern matches the whole document and extracts multiple variables at once) with the strength and precision of XPaths: a redex pattern is defined in a form much simpler than a regex and can span multiple fragments of the document, providing the precise context in which each fragment is allowed to match.
parsing of basic data types from human-readable formats used in web pages: pdate, pdatetime, pint, pfloat, pdecimal, percent
url absolutization & unquoting: url, url_unquote
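
As an illustration of what URL absolutization and unquoting mean in a scraping context, here is a minimal sketch built only on the standard library's `urllib.parse`. It does not use nifty.redex: the helper name `absolutize` is hypothetical, and Nifty's own url and url_unquote may behave differently.

```python
from urllib.parse import unquote, urljoin


def absolutize(page_url, link):
    """Resolve a possibly relative link against the URL of the page
    it was found on, returning an absolute URL."""
    return urljoin(page_url, link)


if __name__ == "__main__":
    page = "http://example.com/a/b.html"
    # Relative link climbing one directory up from /a/.
    print(absolutize(page, "../c.html"))
    # An already absolute link is returned unchanged.
    print(absolutize(page, "http://other.org/x"))
    # Percent-encoded characters decoded back to readable text.
    print(unquote("caf%C3%A9%20menu"))
```

Links scraped from a page are often relative, so resolving them against the page URL is usually the first step before following or storing them.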

Data Pipes. Architecture for scalable pipeline-oriented processing of Big Data, in nifty.data.pipes.

Data storage and object serialization with a new DAST format, in nifty.data.dast.

For more information, check pydocs and comments in the source code. Other modules to be documented in the near future.

Nifty includes code from Waxeye, a PEG parser generator (MIT license), which is used to generate the parser for Redex.

Use cases

Projects that use Nifty:

Paperity, an aggregator of scholarly literature

