pranav7/docsearch-scraper

 
 

DocSearch scraper

This repository contains the code for the DocSearch crawler that powers the hosted version of DocSearch.

If you're looking for a way to add DocSearch to your site, the easiest solution is to apply to DocSearch. If you want to run the crawler yourself, you're in the right place.

Installation and Usage

Please check the dedicated documentation to see how you can install and run DocSearch yourself.

This project supports Python 3.7 and later.

Related projects

DocSearch is made up of three repositories:

  • algolia/DocSearch contains the docsearch.js source code and the documentation website.
  • algolia/docsearch-configs contains the JSON config files for every documentation site DocSearch powers.
  • algolia/docsearch-scraper contains the crawler we use to extract data from your documentation. The code is open source and you can run it from a Docker image.
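
To give a rough idea of what one of these JSON config files might look like, here is a minimal sketch in Python that builds a config and serializes it. The field names (`index_name`, `start_urls`, `selectors` with `lvl0` to `lvl2` and `text`) are assumptions based on the commonly documented config layout, and `build_config`, the index name, the URL and the CSS selectors are all hypothetical placeholders. Check the dedicated documentation for the authoritative format.

```python
import json


def build_config(index_name, start_urls):
    """Return a minimal crawler config as a plain dict.

    Hypothetical helper for illustration only; the keys below are
    assumptions about the config layout, not a definitive schema.
    """
    return {
        "index_name": index_name,        # name of the search index to fill
        "start_urls": list(start_urls),  # pages where the crawl begins
        "selectors": {
            # CSS selectors mapping page elements to record levels
            # (placeholder values, adjust to your site's markup)
            "lvl0": "h1",
            "lvl1": "h2",
            "lvl2": "h3",
            "text": "p, li",
        },
    }


# Placeholder values; replace with your own index name and docs URL.
config = build_config("example_docs", ["https://example.com/docs/"])
config_json = json.dumps(config, indent=2)
print(config_json)
```

Writing `config_json` to a file gives you a starting point that you can then adapt to your own site's markup.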
