Tipi

Tipi is for typographic replacements in HTML.

Status: ACTIVE

Under active development and maintenance.

Ideas behind this project

  • Input is HTML code, output is the same HTML code with changes in typography (entities, spaces, quotes, etc.).
  • You can't parse HTML with regex.
  • The best existing HTML parser and tokenizer for Python is lxml.
  • There are more languages than English in the world. Each of them has different typographic rules.
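
The first three ideas can be illustrated with a minimal sketch. It is not tipi's implementation: it uses the standard library's `html.parser` as a stand-in for lxml, and the names `TextOnlyReplacer`, `replace_in_text` and `dots_to_ellipsis` are hypothetical. It only shows a replacement applied to text nodes while tags and attributes pass through unchanged (comments and doctypes are simply dropped here).

```python
from html.parser import HTMLParser


class TextOnlyReplacer(HTMLParser):
    """Rewrite text nodes and re-emit tags verbatim (illustrative only)."""

    def __init__(self, replace):
        # convert_charrefs=False reports entities such as &lt; as separate events
        super().__init__(convert_charrefs=False)
        self.replace = replace
        self.out = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as written in the source
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append('</%s>' % tag)

    def handle_data(self, data):
        # Only text between tags is passed to the replacement function
        self.out.append(self.replace(data))

    def handle_entityref(self, name):
        self.out.append('&%s;' % name)

    def handle_charref(self, name):
        self.out.append('&#%s;' % name)


def replace_in_text(html, replace):
    parser = TextOnlyReplacer(replace)
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)


def dots_to_ellipsis(text):
    return text.replace('...', '\u2026')


html = '<p title="wait...">Wait...</p>'
print(replace_in_text(html, dots_to_ellipsis))
```

A plain `html.replace('...', '\u2026')` on the same input would also rewrite the `title` attribute, which is the kind of damage a parser-based approach avoids.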

Installation

Easy:

$ pip install tipi

Quickstart

Usage of tipi is very straightforward:

>>> from tipi import tipi
>>> html = '<p>"Zavolej mi na číslo <strong class="tel">765-876-888</strong>," řekla, a zmizela...</p>'
>>> html = tipi(html, lang='cs')
>>> html
u'<p>\u201eZavolej mi na \u010d\xed\xadslo <strong class="tel">765\u2013876\u2013888</strong>,\u201c \u0159ekla, a\xa0zmizela\u2026</p>'
>>> print html
<p>„Zavolej mi na číslo <strong class="tel">765–876–888</strong>,“ řekla, a zmizela…</p>

Remember that tipi is designed to work with HTML. In case you need to perform replacements on plaintext, escape it first:

>>> from tipi import tipi
>>> tipi('b -> c')  # this works only by coincidence!
u'b → c'
>>> tipi('a <- b -> c')
u'a  c'
>>> import cgi
>>> html = cgi.escape(u'a <- b -> c')
>>> html
u'a &lt;- b -&gt; c'
>>> tipi(html)
u'a ← b → c'
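
The session above uses Python 2 and `cgi.escape`, which was removed in Python 3.8; `html.escape` from the standard library is the usual replacement. A minimal sketch of the escaping step on Python 3 follows. It assumes tipi is installed and keeps the call signature shown above, so the `tipi` call itself is left commented out to keep the snippet runnable without it.

```python
from html import escape

plaintext = 'a <- b -> c'

# quote=False mirrors cgi.escape's default of leaving quote characters untouched
escaped = escape(plaintext, quote=False)
print(escaped)  # a &lt;- b -&gt; c

# Assumption: tipi is importable and accepts the escaped string as in the session above.
# from tipi import tipi
# tipi(escaped)
```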

Features

  • Support for multiple languages.
  • Language-sensitive replacements for single quotes and double quotes.
  • Ellipses, dashes, non-breaking spaces, and more.
  • Arrows (--> turned into →), dimensions (12 × 30).
  • Symbols (trademark, registered, copyright, EUR, ...).
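
To make "language-sensitive" concrete, here is a small sketch of quote replacement driven by a per-language table. The `QUOTES` table and the `curl_double_quotes` helper are hypothetical and are not taken from tipi; the Czech pair matches the quickstart output above, and the sketch assumes quotes are balanced and works on plain text only.

```python
# Hypothetical table of (opening, closing) double quotes; tipi's own rules may differ.
QUOTES = {
    'en': ('\u201c', '\u201d'),  # “ ”
    'cs': ('\u201e', '\u201c'),  # „ “
}


def curl_double_quotes(text, lang='en'):
    """Replace straight double quotes alternately with opening and closing marks."""
    opening, closing = QUOTES[lang]
    parts = text.split('"')
    out = [parts[0]]
    for i, part in enumerate(parts[1:]):
        # Even-numbered quotes open, odd-numbered quotes close
        out.append(opening if i % 2 == 0 else closing)
        out.append(part)
    return ''.join(out)


print(curl_double_quotes('"Ahoj," řekla.', lang='cs'))
print(curl_double_quotes('"Hello," she said.', lang='en'))
```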

Alternatives

Plans

License: MIT

© 2013 Jan Javorek <mail@honzajavorek.cz>

This work is licensed under the MIT license.
