Bandersnatch Git Mirror (do not use)
License
ianw/bandersnatch
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This is a PyPI mirror client according to `PEP 381 <http://www.python.org/dev/peps/pep-0381/>`_. .. contents:: Build status ============ bandersnatch .. image:: https://builds.flyingcircus.io/job/bandersnatch/badge/icon :target: https://builds.flyingcircus.io/job/bandersnatch/ Packaging and PIP install .. image:: https://builds.flyingcircus.io/job/bandersnatch-packaging-pip/badge/icon :target: https://builds.flyingcircus.io/job/bandersnatch-packaging-pip/ Installation ============ The following instructions will place the bandersnatch executable in a virtualenv under ``bandersnatch/bin/bandersnatch``. .. note:: bandersnatch requires Python 2.7 or 2.6. pip --- This installs the latest stable, released version. :: $ virtualenv-2.7 bandersnatch $ cd bandersnatch $ bin/pip install -r https://bitbucket.org/pypa/bandersnatch/raw/stable/requirements.txt zc.buildout ----------- This installs the current development version. Use 'hg up <version>' and run buildout again to choose a specific release. :: $ hg clone https://bitbucket.org/pypa/bandersnatch $ cd bandersnatch $ virtualenv-2.7 . $ bin/python bootstrap.py $ bin/buildout Configuration ============= * Run ``bandersnatch mirror`` - it will create an empty configuration file for you in ``/etc/bandersnatch.conf``. * Review ``/etc/bandersnatch.conf`` and adapt to your needs. * Run ``bandersnatch mirror`` again. It will populate your mirror with the current status of all PyPI packages - roughly 120GiB (2014-10-23). Expect this to grow substantially over time. * Run ``bandersnatch mirror`` regularly to update your mirror with any intermediate changes. Webserver --------- Configure your webserver to serve the ``web/`` sub-directory of the mirror. For nginx it should look something like this:: server { listen 127.0.0.1:80; server_name <mymirrorname>; root <path-to-mirror>/web; autoindex on; charset utf-8; } * Note that it is a good idea to have your webserver publish the HTML index files correctly with UTF-8 as the carset. The index pages will work without it but if humans look at the pages the characters will end up looking funny. * Make sure that the webserver uses UTF-8 to look up unicode path names. nginx gets this right by default - not sure about others. Cron jobs --------- You need to set up one cron job to run the mirror itself. If you run a public mirror, then you need a second job that will create access statistics for aggregation on the master PyPI. Here's a sample that you could place in ``/etc/cron.d/bandersnatch``:: LC_ALL=en_US.utf8 */2 * * * * root bandersnatch mirror |& logger -t bandersnatch[mirror] 12 * * * * root bandersnatch update-stats |& logger -t bandersnatch[update-stats] This assumes that you have a ``logger`` utility installed that will convert the output of the commands to syslog entries. Maintenance =========== bandersnatch does not keep much local state in addition to the mirrored data. In general you can just keep rerunning ``bandersnatch mirror`` to make it fix errors. If you delete the state files then the next run will force it to check everything against the master PyPI:: * delete ``./state`` file and ``./todo`` if they exist in your mirror directory * run ``bandersnatch`` mirror to get a full sync Be aware, that full syncs likely take hours depending on PyPIs performance and your network latency and bandwidth. Operational notes ================= Case-sensitive filesystem needed -------------------------------- You need to run bandersnatch on a case-sensitive filesystem. OS X natively does this OK even though the filesystem is not strictly case-sensitive and bandersnatch will work fine when running on OS X. However, tarring a bandersnatch data directory and moving it to, e.g. Linux with a case-sensitive filesystem will lead to inconsistencies. You can fix those by deleting the status files and have bandersnatch run a full check on your data. Many sub-directories needed --------------------------- The PyPI has a quite extensive list of packages that we need to maintain in a flat directory. Filesystems with small limits on the number of sub-directories per directory can run into a problem like this:: 2013-07-09 16:11:33,331 ERROR: Error syncing package: zweb@802449 OSError: [Errno 31] Too many links: '../pypi/web/simple/zweb' Specifically we recommend to avoid using ext3. Ext4 and newer does not have the limitation of 32k sub-directories. Migrating from pep381client =========================== * remove old status files, but keep actual data (everything under ``web/``) * create config file, port command parameters from old cronjobs * update cron jobs Contact ======= If you have questions or comments, please submit a bug report to http://bitbucket.org/pypa/bandersnatch/issues/new. Code of Conduct =============== Everyone interacting in the bandersnatch project's codebases, issue trackers, chat rooms, and mailing lists is expected to follow the `PyPA Code of Conduct`_. .. _PyPA Code of Conduct: https://www.pypa.io/en/latest/code-of-conduct/ Kudos ===== This client is based on the original pep381client by Martin v. Loewis. Richard Jones was very patient answering questions at PyCon 2013 and made the protocol more reliable by implementing some PyPI enhancements.
About
Bandersnatch Git Mirror (do not use)
Resources
License
Stars
Watchers
Forks
Packages 0
No packages published