Skip to content

gerv/slic

Repository files navigation

This is slic, the Speedy LIcense Checker. It scans a codebase and identifies
the license of each file. It can also optionally extract the license text
and a list of copyright holders - this is very useful for meeting the
obligations of licenses which require reproduction of license text. It outputs
data in a JSON structure (by default) which can then be further processed by
other tools.

It runs at about 120 files per second on a single core of a 3GHz Intel I7
(it's CPU-bound, at least on a machine with an SSD). So you can do 100,000
files in less than 15 minutes. Parallel invocation is left as an exercise for
the reader, but could easily be bolted on the side by dividing up the inputs.

You can extend slic to identify new licenses or tweak the detection of existing
ones by adding to a regexp-containing structure in the license_data.py file.

There is a test suite which makes sure it continues to correctly detect the
licenses it already knows about. This uses the "nose" testing framework -
simply run "nosetests" to run the test suite. You can add files to the slic
detection test suite easily using the add_test_case script.

One additional tool is flic, which stands for Format LICenses. This takes
the output of slic, processes it, and passes the result to a Jinja2 template.
Again, this is very useful for generating an "Open Source Licenses" page or
file to meet obligations about reproducing the text of licenses. 

Running slic
-------------

Use slic --help for help.

To try it out on data which ships with slic:

./slic -D --plain test/data/identification

Run slic over a codebase and generate JSON output:

cd /usr/src/mycodebase
/path/to/slic . > occurrences.json

Or, if your needs are more complex:

* Put a list of all the files and paths you want to scan in "slic-paths.txt"

* Create a file in the slic directory called "mycodebase.ini" with any
  codebase-specific config (see b2g/b2g.ini for inspiration)
  
cd /usr/src/mycodebase
/path/to/slic --config=mycodebase.ini < slic-paths.txt > occurrences.json

Running flic
------------

Requires the existence of an output file generated by slic.

/path/to/flic --input occurrences.json --template license.html > out.html

The template needs to be in the $FLIC_DIR/flic_templates directory.



About

Speedy LIcense Checker and associated tools

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published