Common utilities for CSCI E-29 psets
A common practice is to keep work for every problem you solve in its own repo or python package, especially if you plan on reusing it across projects.
However, this can be burdensome if you need a new package for every little thing. Here, we will explore a compromise paradigm that will simultaneously give us isolation and repeatability without creating needless boilerplate.
Table of Contents generated with DocToc
- Preface
- Problems (55 points)
DO NOT CLONE THIS REPO LOCALLY YET. We will manually create a repo and link it. If you have cloned this repo locally, simply delete it (it's fine if it's already forked on github).
We will leverage the templating system CookieCutter to give us a head start on best python practices. Please see the docs for installation.
On Mac with Homebrew:
brew install cookiecutter
Cookiecutter has many template projects for various systems, and they are worth exploring. For now, we'll use cookiecutter-pylibrary.
Cookiecutter uses the templating language Jinja2. When you see something like My name is {{ name }}, it means it will be rendered using the variable name when the project is created.
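To make the substitution idea concrete, here is a minimal sketch in plain Python. It is not Jinja2 and does not use Cookiecutter's API; the render helper and the example context are hypothetical, and only simple {{ name }} placeholders are handled.

```python
import re

# Hypothetical stand-in for Jinja2's variable substitution. It handles only
# plain {{ key }} placeholders, not filters, loops, or conditionals.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, context):
    """Replace each {{ key }} in template with str(context[key])."""
    return _PLACEHOLDER.sub(lambda match: str(context[match.group(1)]), template)


# The context plays the role of the answers you give at the prompts.
greeting = render("My name is {{ name }}", {"name": "Ada"})
```

Real templates are rendered by Jinja2 itself, which supports far more than this sketch.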
Refer to the cookiecutter docs for additional instructions.
For a package/library, it is especially important that your branching workflow reflect the 'production' status of your library.
After your initial tag commit of this repo, you must conform to a formal git workflow. This means:
- Pick Git Flow or the simplified Github Flow or some similar variant.
  - Git Flow has a 'development' branch that is good for quick iterative work, but is slightly more complicated otherwise. Sourcetree has built in tools for using it, including automatic tagging.
  - Git Flow will help automate tagging
  - Github Flow means everything is a branch off master.
- Your master branch should be merge-only. That is, never commit work to it directly, only merge from feature/*, develop, hotfix/*, or similar
- Each new merged commit on master must have a Semantic Versioning release version with an accompanying tag. TL;DR: major.minor.patch
  - Patch is for bugfix
  - Minor is for new features
  - Major is for backwards-incompatible changes
  - Don't worry about high version numbers
  - tags should be of the form v0.1.2
- Your work will be graded on your latest tagged version, which should be the same as your master
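The versioning rules above can be sketched as a small helper. This is an illustration only: the function names are hypothetical, and it assumes tags are exactly of the form vMAJOR.MINOR.PATCH with no pre-release suffix.

```python
import re

# Matches tags of the form v<major>.<minor>.<patch>, e.g. v0.1.2.
# Pre-release and build-metadata suffixes are deliberately not handled.
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def parse_tag(tag):
    """Return (major, minor, patch) as integers, or raise ValueError."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError("not a vMAJOR.MINOR.PATCH tag: {!r}".format(tag))
    return tuple(int(part) for part in match.groups())


def bump(tag, change):
    """Return the next tag for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = parse_tag(tag)
    if change == "major":  # backwards-incompatible change
        return "v{}.0.0".format(major + 1)
    if change == "minor":  # new feature
        return "v{}.{}.0".format(major, minor + 1)
    if change == "patch":  # bugfix
        return "v{}.{}.{}".format(major, minor, patch + 1)
    raise ValueError("unknown change type: {!r}".format(change))
```

For example, bump("v0.1.2", "minor") resets the patch number and yields v0.2.0.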
For an ongoing library, we'd like to use cookiecutter-pylibrary as our templated "Encapsulated Best Practices."
Please read through the two linked posts at the top of that page to better understand why we chose this template.
Try rendering out a library with a few different defaults and see what it generates:
cookiecutter gh:ionelmc/cookiecutter-pylibrary
You may delete them when you're done.
Inspect cookiecutter.json in the pylibrary repo. This contains all the defaults and variables for your project template. Note that the default value from a list is the first element; when we clone and modify cookiecutter templates later, you can reorder as you wish.
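As a rough sketch of the "first element is the default" rule, the snippet below reads a JSON document shaped like a cookiecutter.json. The excerpt is hypothetical: the keys echo settings named in this README, but the option lists are illustrative and not copied from the real template. The helper names are also made up, and Jinja expressions inside real values are not rendered here.

```python
import json

# Hypothetical excerpt in the style of a cookiecutter.json file.
EXAMPLE_JSON = """
{
    "project_name": "CSCI Utils",
    "test_runner": ["pytest", "nose"],
    "command_line_interface": ["plain", "argparse", "click"]
}
"""


def default_values(config):
    """Return the value each variable takes if every prompt is accepted as-is."""
    return {
        key: value[0] if isinstance(value, list) else value
        for key, value in config.items()
    }


def prefer(options, choice):
    """Reorder a list of options so that choice becomes the default."""
    return [choice] + [option for option in options if option != choice]


config = json.loads(EXAMPLE_JSON)
config["command_line_interface"] = prefer(config["command_line_interface"], "argparse")
defaults = default_values(config)
```

After the reorder, argparse is the default for command_line_interface in this example.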
When you're ready to render your utils repo, please use the following settings.
Please put in your name, email, etc. as prompted, and use the table below for the configurations that we should all have in common. For anything not in this table, just leave the defaults as-is. Since these are private repos, we need to turn off all external integrations other than Travis and Code Climate:
Param | Value |
---|---|
name/email etc | yours |
repo_username | csci-e-29 (since the repo lives in the org) |
project_name | CSCI Utils |
repo_name | 2019fa-csci-utils-YOUR_GITHUB_ID |
package_name | (should default to csci_utils ) |
Select license | no |
Select test_runner | pytest |
setup_py_uses_setuptools_scm | yes |
pypi_badge | no |
pypi_disable_upload | yes |
allow_tests_inside_package | yes |
Select command_line_interface | argparse |
travis | yes |
codecov, coveralls, appveyor, requiresio, etc | no |
Select codeclimate | yes |
Cookiecutter prints these instructions for linking to this repo. Don't push yet!
cd 2019fa-csci-utils-<YOUR_GITHUB_ID>
git init
git add --all # though nicer if you do this manually/via SourceTree
git commit -m "Add initial project skeleton."
git remote add origin git@github.com:csci-e-29/2019fa-csci-utils-<YOUR_GITHUB_ID>.git
# Hold on!
# git push -u origin master
To help limit builds, let's cut down the test matrix before you push to github.
- Remove or clear the LICENSE file, if present
- In setup.py:
  - comment out links to ReadTheDocs
  - set python_requires='>=3.6',
- Erase contents of the file tests/test_csci_utils.py (You can either add other tests there, or remove tests from testpaths in setup.cfg)
- Remove python 2 and versions below 3.6 from tox.ini, and setup.py, and .travis.yml. When Travis runs, it should only have 2 build steps for 3.6 and 3.7
- In setup.cfg add --cov=csci_utils and --cov-branch to addopts
- In .travis.yml:
  - Comment out/remove the top level env matrix including TOXENV=docs and TOXENV=check. You can add them back in later, but they are overly sensitive for now.
  - Set the default python to 3.7 by adding python: 3.7 as the second line, below language: python
  - Change pip install tox to pip install tox-travis in the install stanza
Commit changes from above to your local master branch.
Because this repo isn't empty, we need to merge the remote origin locally before pushing. In SourceTree, you'll notice two distinct histories. Merge them manually or:
git fetch
git merge origin/master --allow-unrelated-histories
# Note this will ask you to save a commit message. If you're unfamiliar with
# vim, you may need to type ':wq' or Esc then `:wq`
git push -u origin master
You should now see this README as well as your template in github! You'll need to take these steps for all CookieCutter template renders to sync them up manually with remote pset repos.
Now that you've merged the remote branch, you can add your build badges to this README and correct the build badges in README.rst
(If you touched this README first, you may have had merge conflicts before.)
The default project template relies on travis and tox for repeatable testing of a matrix of libraries, which is nice. For local development though, we'll want to create a library development app.
Don't commit the changes until you successfully run pipenv install -e . below.
- Copy over the Dockerfile, docker-compose.yml, and drun_app files from Pset 1
- In the Dockerfile:
  - Comment out all the lines pertaining to Pipfile etc below WORKDIR ...
  - Add the following below WORKDIR:

        WORKDIR ...
        COPY setup.py .
        COPY src/csci_utils/__init__.py src/csci_utils/__init__.py
        # COPY Pipfile .
        ...

  - Add ARG DEV_CSCI_UTILS just after the first FROM ... line
- In docker-compose.yml, add a build arg, so that it looks like:

      build:
        context: .
        args:
          - DEV_CSCI_UTILS=1

- Modify setup.py
  - Add:

        from ast import literal_eval
        import os

        DOCKER_DEV = literal_eval(os.environ.get("DEV_CSCI_UTILS", "0"))

  - Modify use_scm_version= ... if not DOCKER_DEV else False
  - Modify:

        def read(*names, **kwargs):
            try:
                ...
            except FileNotFoundError:
                if DOCKER_DEV:
                    return ""
                raise
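To see how the DEV_CSCI_UTILS lookup in setup.py behaves, here is a small sketch that can be run outside of setup.py. The docker_dev_enabled helper is hypothetical, and wrapping the result in bool is an addition for illustration; the setup.py line above stores the parsed value directly.

```python
import os
from ast import literal_eval


def docker_dev_enabled(environ=None):
    """Mirror the DOCKER_DEV lookup: unset or "0" is falsy, "1" is truthy."""
    environ = os.environ if environ is None else environ
    # literal_eval parses the string as a Python literal, so "1" becomes 1.
    return bool(literal_eval(environ.get("DEV_CSCI_UTILS", "0")))


unset = docker_dev_enabled({})                      # variable not set
enabled = docker_dev_enabled({"DEV_CSCI_UTILS": "1"})
disabled = docker_dev_enabled({"DEV_CSCI_UTILS": "0"})
```

Note that literal_eval only accepts Python literals, so a value such as yes raises ValueError instead of being treated as true.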
You should now be able to use docker to pipenv install -e . below. Don't forget to docker-compose build after every pipenv install ... command you execute.
The following 'editable' install links to the package in this directory. You should never pipenv install anything else for a library; rather, dependencies should be added into setup.py and then pipenv update or similar.
You may, however, do something like pipenv install --dev pytest since those are not real requirements and pipenv doesn't directly handle tests_require.
# Either direct or via ./drun_app
pipenv install -e .
Tox will help you test that you have your dependencies set up correctly.
Uncomment the lines you commented out before. docker-compose build and drun_app should now work.
You can now commit the docker and pipenv files.
We also need to reconcile the tox matrix with the single job needed for coverage etc reporting in Code Climate. Create two build stages, 'test' and 'tox', and add in the code climate config from Pset 1 to your travis file. Designate the tox builds as the tox stage. It should look something like this:
...
matrix:
  include:
    - env:
        - TOXENV=py36
      python: '3.6'
      stage: tox
    ...
    - stage: test
      python: '3.7'
      install:
        - pipenv install --dev --ignore-pipfile --deploy
      before_script: # code coverage tool
        - curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-latest-linux-amd64 > ./cc-test-reporter
        - chmod +x ./cc-test-reporter
        - ./cc-test-reporter before-build
      script: pytest --cov-report xml --cov-report term
      after_script:
        - ./cc-test-reporter after-build --exit-code $TRAVIS_TEST_RESULT
stages:
  - test
  - tox
...
This should run your Pipenv env and tests first, pushing results to CodeClimate, and only run the tox matrix if that passes.
When Travis runs, you should see something like this:
(this step is new this semester, help us find any bugs!)
Configure your library to use setuptools_scm, following the instructions there, to automatically get your package version from your git repository.
If you see any references to manually coded versions or bumpversion, delete them.
Verify via:
python setup.py --version
python -c "import csci_utils; print(csci_utils.__version__)"
Both commands should print the same thing, which will look something like this: 0.0.1.dev4+gdfedba7.d20190209.
Tag your master v0.1.0 (eg, git tag v0.1.0). Now verify:
python setup.py --version
From now on, all commits on master must have an accompanying semantic version tag.
When you later install this project into a problem set, if installed from a clean repo with a tag version, you'll get a nice version like 0.1.2. If, however, you inspect the __version__ in your package from your git repo, you'll get a nice 'dirty' version number like '0.2.1.dev0+g850a76d.d20180908'. This is useful for debugging, building sphinx docs in dev, etc, and you never have to specify a version except via tagging your commit.
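If you want to tell a clean tagged version from a development one in code, a sketch like the following may help. The pattern is an assumption inferred only from the example strings in this README; it is not the full setuptools_scm or PEP 440 grammar, and both function names are hypothetical.

```python
import re

# Assumed shape, inferred from the examples in this README:
#   <base>[.dev<distance>][+<local>]
VERSION_PATTERN = re.compile(
    r"(?P<base>\d+\.\d+\.\d+)"
    r"(?:\.dev(?P<distance>\d+))?"
    r"(?:\+(?P<local>[0-9a-z.]+))?"
)


def describe_version(version):
    """Split a version string into its base, dev distance, and local part."""
    match = VERSION_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError("unrecognised version: {!r}".format(version))
    distance = match.group("distance")
    return {
        "base": match.group("base"),
        "distance": None if distance is None else int(distance),
        "local": match.group("local"),
    }


def is_release(version):
    """True when the version has no dev or local suffix, e.g. 0.1.2."""
    info = describe_version(version)
    return info["distance"] is None and info["local"] is None
```

With these assumptions, is_release("0.1.2") is true while is_release("0.2.1.dev0+g850a76d.d20180908") is false.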
NB: the cookiecutter template now defaults to using setuptools_scm. We have not sufficiently explored how it is using it. Feel free to improve on the instructions above so long as you get the version both in setup.py and csci_utils.__version__ programmatically.
During a late-night reading session, you notice that someone else has implemented an atomic writer for python! Almost certainly they have done a better job of ensuring it works correctly, and we don't want the responsibility of maintaining this kind of thing!
You should rewrite your atomic_write function in this repository. Create a structure like:
src/
  csci_utils/
    __init__.py
    io/
      __init__.py
      tests.py
Note that the atomicwrites package may not implement every feature you did, such as preserving the extension. You must find a way to ensure they are still implemented! All the tests from your previous problem set should pass.
Here are a few tips you can use:
from contextlib import contextmanager

# You can import and rename things to work with them internally,
# without exposing them publicly or to avoid naming conflicts!
from atomicwrites import atomic_write as _backend_writer, AtomicWriter


# You probably need to inspect and override some internals of the package
class SuffixWriter(AtomicWriter):
    def get_fileobject(self, dir=None, **kwargs):
        # Override functions like this
        ...


@contextmanager
def atomic_write(file, mode='w', as_file=True, new_default='asdf', **kwargs):
    # You can override things just fine...
    with _backend_writer(file, writer_cls=SuffixWriter, **kwargs) as f:
        # Don't forget to handle the as_file logic!
        yield f
Remember: do not pipenv install atomicwrites. This is now a project dependency and must be included in your setup.py.
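For reference, the extension-preserving behaviour mentioned above can be illustrated without atomicwrites at all. The following is a standard-library sketch under stated assumptions, not the solution to this problem: the name atomic_write_sketch is hypothetical, it has no as_file option, and it overwrites an existing target, which may differ from your pset 1 behaviour.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_write_sketch(path, mode="w"):
    """Write to a temp file with the same extension, then move it into place."""
    path = os.fspath(path)
    directory = os.path.dirname(os.path.abspath(path))
    suffix = os.path.splitext(path)[1]  # e.g. ".txt", kept on the temp file
    # Create the temp file next to the target so the final move stays on
    # the same filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=suffix, dir=directory)
    try:
        with os.fdopen(fd, mode) as handle:
            yield handle
        # Only reached if the caller's block finished without an exception.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the partial temp file and leave the target untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Used as with atomic_write_sketch("out.txt") as f: f.write("hello"), the target only appears once the block completes.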
Port over everything else from pset 1 you may want to reuse in the future. Ensure the tests are included inside the csci_utils package, as we will test them in future application environments.
Continue to add functionality to this repo as we go along the course!
If you'd like, go back and update pset 1 to use this repo. We won't come back to pset 1 directly, but this will give you a head start configuring new repos to use your utils library.
All instructions below refer to the pset 1 repo, not csci_utils.
Normally, we can pip/pipenv install straight from github (or any git repo) to any Docker image or Travis build. However, we have a few hoops to jump through since csci_utils is a private repo and we're not managing all of the deploy keys. Inside a company with a private VPN/git server/build system, best practice is just to make everything publicly readable behind the VPN and not deal with deploy authentication.
We have a few choices:
- Create a deploy/user key for SSH. This is preferred, but is a bit trickier to manage, especially on windows.
- Hard code your git username/password into your dockerfile (no, we are not going to do that!)
- Provide an API token through the environment. This will allow us to clone via https without altering our Pipfile. This option should be the easiest for this class.
See Travis docs here. Note: To access personal tokens, on the GitHub Applications page, you need to click Developer Settings, or directly navigate here.
DO NOT SHARE THIS TOKEN WITH ANYONE. It gives access to all your github repos. If it becomes compromised, delete it from github and generate a new one. You will be uploading this token to Travis, but it is private only to you.
For more reference on security, see Travis Best Practices and Removing Sensitive Data.
Add the following lines to the Dockerfile, just below the FROM line:
ARG CI_USER_TOKEN
RUN echo "machine github.com\n login $CI_USER_TOKEN\n" >~/.netrc
And modify the build section in the docker-compose.yml:
build:
  context: .
  args:
    - CI_USER_TOKEN=${CI_USER_TOKEN}
You then need to set CI_USER_TOKEN as an environment variable or in your dotenv file.
If you're not using docker, you can add the .netrc to your host system (mac/linux) or try this solution on Windows with the token coded directly into the file.
If you don't do this step directly, pip install will ask for your github credentials and pipenv will hang indefinitely when you try to install from a private repo.
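If you prefer to create the host .netrc from a script, a minimal sketch might look like this. The helper names are hypothetical; the entry text matches the echo line used in the Dockerfile above, and the permission handling assumes a POSIX system.

```python
import os


def netrc_entry(token, machine="github.com"):
    """Return the .netrc lines used in this README for token-based access."""
    return "machine {}\n login {}\n".format(machine, token)


def write_netrc(path, token):
    """Write the entry to path, replacing any previous contents."""
    # Mode 0o600 (owner read/write only) applies when the file is created;
    # an existing file keeps its current permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(netrc_entry(token))
```

A typical call would be write_netrc(os.path.expanduser("~/.netrc"), os.environ["CI_USER_TOKEN"]). Note that this overwrites any entries already in that file.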
You must then add the variable to the Travis environment as well; you can do that via navigating to the settings, eg https://travis-ci.com/csci-e-29/your_repo/settings, via the Travis CLI, or encrypting into the .travis.yml as instructed on the first Travis link above. The token should NOT be committed to your repo in plain text anywhere.
In .travis.yml, you can also add:
before_install:
  - echo -e "machine github.com\n login $CI_USER_TOKEN" > ~/.netrc
Note that we may switch to docker builds instead of travis builds in the future; if travis is building via the docker env, setting the netrc file in docker is sufficient.
You can now install your csci_utils as below. Note that the #egg part is important; it is not a comment!
pipenv install -e git+https://github.com/csci-e-29/2019fa-csci-utils-GITHUBID#egg=csci_utils
This will include the latest master commit (presumably tagged) and will be automatically updated whenever you run pipenv update. If you want to be more specific about the version, you can use the @v1.2.3 syntax when you install, or add ref='v1.2.3' to the specification in the Pipfile. Leaving this to automatically check out the latest master is easiest and a good reason to have merge-only master releases!
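The shape of the install string, with and without a pinned tag, can be sketched as below. The git_requirement helper is hypothetical and only assembles the text you would pass to pipenv install -e; it does not talk to pip or pipenv.

```python
def git_requirement(repo_url, egg, ref=None):
    """Build a git+https requirement string, optionally pinned to a ref."""
    # The ref (tag, branch, or commit) goes after '@', before the '#egg' part.
    pinned = "{}@{}".format(repo_url, ref) if ref else repo_url
    return "git+{}#egg={}".format(pinned, egg)


REPO = "https://github.com/csci-e-29/2019fa-csci-utils-GITHUBID"
latest = git_requirement(REPO, "csci_utils")
pinned = git_requirement(REPO, "csci_utils", ref="v1.2.3")
```

Here latest reproduces the command argument shown above, and pinned adds @v1.2.3 immediately before #egg=csci_utils.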
In your setup.cfg, ensure the addopts section includes --pyargs. At this point, after building the docker image, pytest csci_utils should run all the tests in your utils package!
You can run them by default if you like, by adding csci_utils to testpaths in the config file.
Otherwise, you should update your .travis.yml to explicitly run them. You can
do so in the same test stage, or you could create a separate test stage just to
test your utils. Normally, the latter is preferred - it gives nice isolation.
However, it will require travis to reinstall the environment, which is
suboptimal.
Every time you push a new master version to github, you may update the installed version in your pset applications via pipenv update.