Download a website from just an entry link!

Supports

multi-threaded downloading
files that are already downloaded are skipped when you re-run
URL filters
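
The first two features can be pictured with a small sketch. This is not the project's actual code: `local_path`, `worker`, `run` and the injected `fetch` callable are hypothetical names, and the on-disk layout (host name, then URL path) is an assumption made for illustration.

```python
import os
import queue
import threading
from urllib.parse import urlparse


def local_path(download_path, url):
    """Map a URL to a file under download_path (assumed layout: host/path)."""
    parsed = urlparse(url)
    path = parsed.path
    if path == "" or path.endswith("/"):
        path += "index.html"  # directory-style URLs get a default file name
    return os.path.join(download_path, parsed.netloc, path.lstrip("/"))


def worker(jobs, download_path, fetch, skipped):
    """Take URLs off the queue until a None sentinel arrives."""
    while True:
        url = jobs.get()
        if url is None:
            break
        target = local_path(download_path, url)
        if os.path.exists(target):
            skipped.append(url)  # already on disk from an earlier run
            continue
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as f:
            f.write(fetch(url))


def run(urls, download_path, fetch, num_threads=4):
    """Download urls with num_threads workers; return the URLs that were skipped."""
    jobs = queue.Queue()
    skipped = []
    for url in urls:
        jobs.put(url)
    for _ in range(num_threads):
        jobs.put(None)  # one stop sentinel per worker
    threads = [
        threading.Thread(target=worker, args=(jobs, download_path, fetch, skipped))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return skipped
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library. A second call to `run` with the same arguments writes nothing and returns every URL as skipped.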

Filters

Filters are regex-based. There are two kinds: white filters and black filters. If a white filter matches a URL, the URL is downloaded; if a black filter matches, the URL is skipped. Besides regex filters, three meta-filters are currently implemented: {image}, {javascript} and {css}. For example, {image} stands for image resources, including JPEG, GIF and PNG files.
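
A minimal sketch of how such filtering could work is shown here. It is an illustration, not the project's implementation: the `FilterSet` class, the `allows` method and `add_black_filter` are hypothetical names, the regexes behind {css} and {javascript} are guesses, and letting a black filter win over a white filter is an assumption.

```python
import re

# Assumed expansions for the meta-filters. Only the image formats
# (jpeg, gif, png) come from the README; the other two are guesses.
META_FILTERS = {
    "{image}": r"\.(jpe?g|gif|png)$",
    "{css}": r"\.css$",
    "{javascript}": r"\.js$",
}


class FilterSet:
    """Hypothetical holder for white and black URL filters."""

    def __init__(self):
        self.white = []
        self.black = []

    @staticmethod
    def _compile(pattern):
        # A meta-filter name is swapped for its regex; anything else is
        # treated as a regex already.
        return re.compile(META_FILTERS.get(pattern, pattern), re.IGNORECASE)

    def add_white_filter(self, *patterns):
        self.white.extend(self._compile(p) for p in patterns)

    def add_black_filter(self, *patterns):
        self.black.extend(self._compile(p) for p in patterns)

    def allows(self, url):
        # Assumption: a black match always rejects, even if a white
        # filter also matches.
        if any(p.search(url) for p in self.black):
            return False
        return any(p.search(url) for p in self.white)
```

With a white filter on `r"www\.lua\.org/pil/"` plus "{image}" and "{css}", `allows` accepts pages under /pil/ as well as images and stylesheets hosted elsewhere, and rejects everything else.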

Usage

Change the main() function to suit your own requirements. For example, to download the Lua reference from lua.org, set the correct filters and the whole reference downloads cleanly. See the example below :P

Example

Download the online edition of "Programming in Lua".

Add the following lines to main():

store.add_white_filter(r"www\.lua\.org/pil/", "{image}", "{css}")
store.put(Job("http://www.lua.org/pil/index.html"))

You can also set where the files are stored:

download_path = '/root/lua-book'

We apply these filters for the following reasons:

the documents are all under www.lua.org/pil/
the images that make the pages look right may not be under www.lua.org/pil/, hence {image}
lua.css is not under www.lua.org/pil/ either, but index.html needs it to display correctly, hence {css}

Extending & hack

By default, SaveFileProcesser saves each file to local disk, but you can do other things instead, such as analyzing the HTML content and extracting the information you care about. To do so, inherit from Processer and replace Downloader.processer with your own implementation.
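
As a rough illustration, a processer that records page titles instead of saving files might look like the sketch below. The README does not document the Processer interface, so the `Processer` class here is a stand-in, and the `process(url, content)` method and its arguments are assumptions; check the processer package for the real signature.

```python
from html.parser import HTMLParser


class Processer:
    """Stand-in for the project's base class; the real interface may differ."""

    def process(self, url, content):
        raise NotImplementedError


class _TitleParser(HTMLParser):
    """Collect the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


class TitleProcesser(Processer):
    """Hypothetical processer: remember each page's title, save nothing."""

    def __init__(self):
        self.titles = {}

    def process(self, url, content):
        # content is assumed to be the raw response body as bytes
        parser = _TitleParser()
        parser.feed(content.decode("utf-8", errors="replace"))
        self.titles[url] = parser.title.strip()
```

An instance of such a class would then take the place of the default in Downloader.processer, as described above.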
