fulmar

Fulmar is a distributed crawler system. Because it uses non-blocking network I/O, Fulmar can handle hundreds of open connections at the same time, so you can extract the data you need from websites in a fast, simple way.
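
To illustrate why non-blocking I/O lets a single process keep hundreds of connections open, here is a minimal sketch. It uses the standard library's asyncio only as a stand-in and says nothing about how Fulmar is built internally. The names fake_fetch, crawl_all and run_demo are hypothetical, and the "request" is a simulated delay rather than a real network call.

```python
import asyncio
import time


async def fake_fetch(url, delay=0.1):
    """Stand-in for a network request (hypothetical, no real I/O).

    Awaiting the sleep hands control back to the event loop, so other
    pending "requests" can make progress in the meantime.
    """
    await asyncio.sleep(delay)
    return url, 200


async def crawl_all(urls):
    """Start every request at once and collect results in input order."""
    return await asyncio.gather(*(fake_fetch(u) for u in urls))


def run_demo(count=200):
    """Run `count` simulated requests concurrently and time the batch."""
    urls = ["http://example.com/page/%d" % i for i in range(count)]
    started = time.monotonic()
    results = asyncio.run(crawl_all(urls))
    elapsed = time.monotonic() - started
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo()
    print("fetched %d pages in %.2fs" % (len(results), elapsed))
```

Because the waits overlap instead of running one after another, the whole batch should take roughly as long as one simulated request, not 200 times as long.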

Code example

from fulmar.base_spider import BaseSpider

class Handler(BaseSpider):

    def on_start(self):
        # Schedule the first request; parse_and_save handles the response.
        self.crawl('http://www.baidu.com/', callback=self.parse_and_save)

    def parse_and_save(self, response):
        # Extract the URL and the <title> text from the fetched page.
        return {
            "url": response.url,
            "title": response.page_lxml.xpath('//title/text()')[0],
        }

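Note that indexing the XPath result with [0] raises IndexError when a page has no title element. One way to guard against that is a small helper like the sketch below. first_or_default is a hypothetical name and is not part of Fulmar.

```python
def first_or_default(matches, default=""):
    """Return the first XPath match, stripped of surrounding whitespace.

    Hypothetical helper, not part of Fulmar. `matches` is the list that
    an XPath text query returns; `default` is returned when it is empty.
    """
    for match in matches:
        return match.strip()
    return default
```

Inside parse_and_save, the title line could then read "title": first_or_default(response.page_lxml.xpath('//title/text()')).
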
Save the code above in a new file called baidu_spider.py and run:

fulmar start_project baidu_spider.py

If you have Redis installed, you will see:

Successfully start the project, project name: "baidu_spider".

Finally, start Fulmar:

fulmar all

Installation

Automatic installation:

pip install fulmar

Fulmar is listed on PyPI and can be installed with pip or easy_install.

Fulmar source code is hosted on GitHub.

Documentation

Please visit Fulmar Docs.
