msnoigrs/python-robotstxt

robotstxt

A robots.txt manipulation library for Python.

Features

  • Wildcard matching (* and $)
  • Support for Sitemaps
  • Support for Crawl-delay
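
In robots.txt rules, `*` matches any sequence of characters and `$` anchors a pattern to the end of the URL path. As a rough illustration of those semantics (a minimal sketch, not this library's internal implementation), a rule can be translated into an anchored regular expression:

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path rule with * and $ wildcards
    into an anchored regular expression."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape regex metacharacters, then restore * as "match anything"
    pattern = re.escape(rule).replace(r"\*", ".*")
    return re.compile("^" + pattern + ("$" if anchored else ""))

def rule_matches(rule, path):
    return rule_to_regex(rule).match(path) is not None
```

For example, `rule_matches("/private/*.html$", "/private/page.html")` is true, while `rule_matches("/fish$", "/fishing")` is false because `$` forbids a trailing suffix.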

Usage

from robotstxt import parse, TestAgent

# A robots.txt file, supplied as a list of lines
testdata = ['User-agent: Googlebot',
            'Disallow: /',
            'User-agent: *',
            'Disallow: /',
            'Allow: /allow.html',
            'Sitemap: https://www.example.com/sitemap.xml']
robotstxt = parse(testdata)
agent = TestAgent('https://www.example.com/', robotstxt)
# Check whether the '*' user-agent may fetch /allow.html
result = agent.can_fetch('*', '/allow.html')
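
Under the common robots.txt precedence rules (RFC 9309, as used by Google), the most specific matching rule wins, with `Allow` preferred on ties; that is why `/allow.html` can be fetchable for `*` even though the same group contains `Disallow: /`. A minimal sketch of that longest-match logic, simplified to plain prefix rules (not this library's API):

```python
def is_allowed(path, rules):
    """Decide whether a path may be fetched, given a list of
    (directive, pattern) rules from one user-agent group.
    Longest matching pattern wins; Allow wins length ties;
    no match at all means allowed."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if path.startswith(pattern):
            cand = (len(pattern), directive.lower() == "allow")
            if best is None or cand > best:  # longer wins; Allow breaks ties
                best = cand
    return True if best is None else best[1]

# The '*' group from the example above:
rules = [("Disallow", "/"), ("Allow", "/allow.html")]
```

Here `is_allowed("/allow.html", rules)` is true because the 11-character `Allow` pattern outranks the 1-character `Disallow: /`, while `is_allowed("/other.html", rules)` is false.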
