Web scraper for grabbing the prices of various OpenAccess Scientific Journals

PatrickSpieker/pricesleuth

pricesleuth

pricesleuth is a pure Python library for collecting data about open-access journals, including article processing charge (APC) information. It supports several large publishers as well as some smaller, miscellaneous ones.

Supported Publishers

Currently supported

  • BioMed Central
  • Elsevier
  • Hindawi
  • Springer
  • Wiley

In progress

  • Public Library of Science (PLoS)
  • Sage

Installation

pricesleuth will be available on PyPI. Once it is published, install it with:

pip install pricesleuth

Usage

All of the journal scrapers are located in pricesleuth.scrapers.journalscrapers. Each publisher's scraper class is named in the form [ScraperNameHere]Scraper.

Every scraper object has a get_entries method, which generates one Python tuple for each successfully scraped journal from that publisher. Each tuple has the form:

(publisher_name, journal_name, date_of_scraping, journal_type, ISSN_of_journal, article_publishing_cost)

  • publisher_name:
  • journal_name:
  • date_of_scraping:
  • journal_type:
  • ISSN_of_journal:
  • article_publishing_cost:

(NOTE: not every journal from each publisher can be scraped. Some have web-formatting issues or other problems that prevent efficient scraping.)
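
As a rough illustration of consuming the tuples described above, the sketch below writes every entry from a scraper to CSV. It does not import pricesleuth: `StubScraper`, `Entry`, and `entries_to_csv` are hypothetical names introduced only for this example, the sample rows are made-up placeholders, and the field values are assumed to be strings. In practice, a real scraper object from pricesleuth.scrapers.journalscrapers would presumably take the place of the stub.

```python
import csv
import io
from datetime import date
from typing import Iterator, NamedTuple


class Entry(NamedTuple):
    """Named view over the six-field tuple (hypothetical helper, not part of pricesleuth)."""
    publisher_name: str
    journal_name: str
    date_of_scraping: str
    journal_type: str
    ISSN_of_journal: str
    article_publishing_cost: str


class StubScraper:
    """Stand-in that only mimics the get_entries interface; it scrapes nothing."""

    def get_entries(self) -> Iterator[tuple]:
        today = date.today().isoformat()
        # Made-up placeholder rows, not real journal data.
        yield ("Example Publisher", "Journal A", today, "OA", "0000-0000", "1000")
        yield ("Example Publisher", "Journal B", today, "Hybrid", "0000-0001", "2500")


def entries_to_csv(scraper, out) -> int:
    """Write a header plus one CSV row per entry; return the number of entries written."""
    writer = csv.writer(out)
    writer.writerow(Entry._fields)
    count = 0
    for raw in scraper.get_entries():
        entry = Entry(*raw)  # raises TypeError if the tuple does not have six fields
        writer.writerow(entry)
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    written = entries_to_csv(StubScraper(), buffer)
    print(f"wrote {written} rows")
    print(buffer.getvalue())
```

Wrapping each tuple in a named tuple keeps the field order in one place, so later code can refer to `entry.journal_name` instead of a positional index.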
