TROVE TOOLS

TroveNewspapers package

Description

A series of tools to aid researchers using the National Library of Australia's Trove newspapers database.

Contents:

scrape.py -- scraper client for retrieving and extracting data from the Trove newspapers database
harvest.py -- sets up a bulk download of articles matching a specified search query
harvester.py -- a GUI for setting up and managing harvests
utilities.py -- used to generate lists of available newspaper titles
do_harvest.py -- script for initiating a new harvest
do_totals.py -- script to retrieve total numbers of articles matching a query across time
do_summary.py -- script to retrieve total numbers of articles by state and title
config/harvest.ini -- config file for do_harvest.py
data/titles_by_id.pck
data/titles_by_state.pck
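
Before starting a harvest it can help to check what config/harvest.ini currently contains. The sketch below is not part of the package. It assumes the file uses standard INI syntax and that Python 3 is available; the helper name list_settings is hypothetical, and no section or option names are assumed.

```python
import configparser


def list_settings(path):
    """Return {section: {option: value}} for an INI file.

    Hypothetical helper, not part of TroveNewspapers.
    """
    # interpolation=None keeps literal '%' characters (common in URLs) intact
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        # parser.read() returns the list of files it managed to open
        raise FileNotFoundError(path)
    return {name: dict(parser.items(name)) for name in parser.sections()}


if __name__ == "__main__":
    for section, options in list_settings("config/harvest.ini").items():
        print("[%s]" % section)
        for key, value in options.items():
            print("  %s = %s" % (key, value))
```

Run it from the package directory so that the relative path config/harvest.ini resolves.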

The files in the data directory are probably out of date. If you need them, use utilities.py to generate new ones.
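
To see what an existing data file holds before deciding whether to regenerate it, something like the following may help. It is a sketch only: it assumes the .pck files are Python pickles (the extension suggests this, but the README does not say so) and that you are running Python 3. The helper name describe_pickle is hypothetical.

```python
import pickle


def describe_pickle(path):
    """Return (type name, number of entries or None) for a pickle file.

    Hypothetical helper. Only unpickle files you trust.
    """
    with open(path, "rb") as handle:
        # encoding="latin-1" lets Python 3 read pickles written by Python 2
        data = pickle.load(handle, encoding="latin-1")
    size = len(data) if hasattr(data, "__len__") else None
    return type(data).__name__, size


if __name__ == "__main__":
    for name in ("data/titles_by_id.pck", "data/titles_by_state.pck"):
        print(name, describe_pickle(name))
```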

Dependencies:

BeautifulSoup
wxPython (for the GUI)

The TroveNewspapers package is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

The TroveNewspapers package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with the TroveNewspapers package. If not, see http://www.gnu.org/licenses/.
