parslepy

parslepy (pronounced "parsley-pie", /ˈpɑːslipaɪ/) is a Python implementation of the Parsley DSL for extracting structured data from web pages, built on top of lxml and cssselect. The DSL was defined by Kyle Maxwell and Andrew Cantino (see Parsley's wiki for more details and the original C implementation).

Kudos to Kyle Maxwell (@fizx) for coming up with this smart and easy syntax for defining extraction rules.

Please note that this Parsley DSL is NOT the same as the Parsley parsing library at https://pypi.python.org/pypi/Parsley

Check out the official docs for more information on how to install and use parslepy. There is also some useful information on the parslepy wiki.

Here is an example of a parselet script that extracts the questions from the first page of Stack Overflow:

{
    "first_page_questions(//div[contains(@class,'question-summary')])": [{
        "title": ".//h3/a",
        "tags": "div.tags",
        "votes": "div.votes div.mini-counts",
        "views": "div.views div.mini-counts",
        "answers": "div.status div.mini-counts"
    }]
}
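As a rough illustration of how such a parselet might be applied from Python, the sketch below loads the rules with the standard json module and hands them to parslepy.Parselet, then parses a file-like object. The names PARSELET_JSON, SAMPLE_HTML, RULES and extract_questions are made up for this example, and the HTML snippet is a hand-written stand-in for a real page, not actual Stack Overflow markup. Refer to the official docs for the authoritative API.

```python
import io
import json

# The parselet script from this README, kept as a JSON string.
PARSELET_JSON = """
{
    "first_page_questions(//div[contains(@class,'question-summary')])": [{
        "title": ".//h3/a",
        "tags": "div.tags",
        "votes": "div.votes div.mini-counts",
        "views": "div.views div.mini-counts",
        "answers": "div.status div.mini-counts"
    }]
}
"""

# Hypothetical stand-in for a downloaded page (assumption: the real markup
# is more complex; this only contains what the selectors above look for).
SAMPLE_HTML = b"""
<html><body>
  <div class="question-summary">
    <div class="votes"><div class="mini-counts">3</div></div>
    <div class="status"><div class="mini-counts">1</div></div>
    <div class="views"><div class="mini-counts">42</div></div>
    <h3><a href="#">How do I parse HTML?</a></h3>
    <div class="tags">python lxml</div>
  </div>
</body></html>
"""

# Plain Python dict with the same structure as the JSON parselet.
RULES = json.loads(PARSELET_JSON)


def extract_questions(html_bytes):
    """Apply the parselet to an HTML document given as bytes."""
    # Imported here so the module can be loaded without parslepy installed.
    import parslepy

    parselet = parslepy.Parselet(RULES)
    # Parselet.parse() takes a file name or file-like object.
    return parselet.parse(io.BytesIO(html_bytes))


if __name__ == "__main__":
    print(extract_questions(SAMPLE_HTML))
```

The key with a parenthesized XPath sets the scope for the nested rules, and wrapping the nested object in a list asks for one result per matching element.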

Install

Install via pip with:

sudo pip install parslepy

Alternatively, you can install from the latest source code:

git clone https://github.com/redapple/parslepy.git
cd parslepy
sudo python setup.py install
