rlugojr/surt

 
 

Sort-friendly URI Reordering Transform (SURT) Python package.

Usage:

>>> from surt import surt
>>> surt("http://archive.org/goo/?a=2&b&a=1")
'org,archive)/goo?a=1&a=2&b'
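To make the shape of that output easier to read, here is a minimal standard-library sketch of the idea. It is not the package's implementation: `simple_surt` is a hypothetical helper written only for illustration, and it assumes the transform can be approximated by reversing the host labels, dropping a trailing slash, and sorting the query arguments. The real package applies further canonicalization that this sketch omits.

```python
from urllib.parse import urlsplit


def simple_surt(url):
    """Simplified SURT-like key (illustrative only; not the surt package's code)."""
    parts = urlsplit(url)
    # Reverse the host labels so the most significant label comes first.
    host = (parts.hostname or "").lower()
    reversed_host = ",".join(reversed(host.split(".")))
    # Drop a trailing slash, but keep a bare root path.
    path = parts.path.rstrip("/") or "/"
    key = reversed_host + ")" + path
    # Sort query arguments so equivalent URLs map to the same key.
    if parts.query:
        key += "?" + "&".join(sorted(parts.query.split("&")))
    return key


# Hypothetical sample URLs, chosen to show how the keys sort.
urls = [
    "http://web.archive.org/save",
    "http://example.com/",
    "http://archive.org/goo/?a=2&b&a=1",
]
keys = sorted(simple_surt(u) for u in urls)
for key in keys:
    print(key)
```

Because the host is written most-significant label first, a plain lexicographic sort places a domain next to its subdomains, which is what makes the form sort-friendly.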

Installation:

pip install surt

Or install the dev version from git:

pip install git+https://github.com/internetarchive/surt.git#egg=surt

More information about SURTs: http://crawler.archive.org/articles/user_manual/glossary.html#surt

This is mostly a Python port of the webarchive-commons org.archive.url package. The original Java version of the org.archive.url package is here: https://github.com/iipc/webarchive-commons/tree/master/src/main/java/org/archive/url

This module depends on the tldextract module to query the Public Suffix List. tldextract can be installed via pip.
