See also the related blog post for more information!
- dulwich
- swig
- gpgme
- matplotlib
- python-gpg:
  pip3 install --user gpg
To get them quickly and test that everything works:
guix environment -l guix.scm
for i in *.py; do python3 "$i" --test; done
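If Guix is not available, the Python dependencies might also be installable with pip (untested; swig and gpgme would have to come from the system package manager):
pip3 install --user dulwich matplotlib gpg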
Apache License 2.0. See COPYING.
./retrieve_commits_and_issues.py [--with-files] [--output TODO_FILE.todo] [--previous OLD_TODO_FILE.todo] PATH_TO_GIT_REPO ...
Commit-issue pairs already included in the OLD_TODO_FILE are not added to the TODO_FILE again.
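For example, a hypothetical incremental run that only picks up commit-issue pairs not yet in last week's file (file names are placeholders):
./retrieve_commits_and_issues.py --previous last-week.todo --output this-week.todo PATH_TO_GIT_REPO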
./retrieve_repository_info.py [--output INFO_FILE.repoinfo] PATH_TO_GIT_REPO
./link_commits_to_issues.py [--create-the-links] [--jira-api-server URL] [--netrc-gpg-path jira-netrc.gpg | --jira-user USER --jira-password PASSWORD] --repo-info-file FILE.repoinfo FILE.todo
Prepare the encrypted netrc file:
echo machine jira.HOST.TLD login USER password PASSWORD | gpg2 -er MY_EMAIL@HOST.TLD > jira-netrc.gpg
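Putting it together, a complete linking run might look like this (server URL and file names are placeholders; presumably no links are written to Jira unless --create-the-links is passed):
./retrieve_repository_info.py --output project.repoinfo PATH_TO_GIT_REPO
./retrieve_commits_and_issues.py --output project.todo PATH_TO_GIT_REPO
./link_commits_to_issues.py --jira-api-server https://jira.HOST.TLD --netrc-gpg-path jira-netrc.gpg --repo-info-file project.repoinfo project.todo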
Each entry in the TODO_FILE has the format:
<commit> <issue> <isodate> <message with linebreaks replaced by "---">
There can be multiple entries per commit: one per issue referenced.
The entries are ordered by commit time, newest commits first (they are the most important ones to get right).
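A hypothetical entry (all values invented) could look like this:
f00dfacedeadbeef1234567890abcdef12345678 PROJ-1234 2022-03-14T09:26:53+01:00 PROJ-1234: fix crash in importer---add regression test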
./retrieve_commits_and_issues.py --with-files --output issues-and-files.log ./
./correlate_files_per_issue.py issues-and-files.log --count-files-per-issue | sort > files-affected-by-time-with-issue.dat
./plot.py files-affected-by-time-with-issue.dat
# ...
# get all jira bugs:
# ./find_all_bugs.py --jira-api-server https://jira.HOST.TLD > all-bugs.log
# stats
./retrieve_commits_and_issues.py --with-files --output issues-and-files.log ./
./correlate_files_per_issue.py issues-and-files.log --count-files-per-issue -i all-bugs.log | sort > files-affected-by-time-with-issue-only-bugs.dat
./plot.py files-affected-by-time-with-issue-only-bugs.dat
./retrieve_commits_and_issues.py --with-files-and-sizes --output issues-and-files.log ./
./correlate_files_per_issue.py issues-and-files.log --sum-filesizes-per-issue | sort > sum-filesize-by-time-with-issue.dat
./plot.py sum-filesize-by-time-with-issue.dat
./retrieve_commits_and_issues.py --with-files-and-sizes --output issues-and-files.log ./
./correlate_files_per_issue.py --file-connections issues-and-files.log --debug --output-edgelist all-issues-edgelist-max300.csv --output-nodelist all-issues-nodelist-max300.csv
Analyze the CSVs with graph software like Gephi.
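Before importing, a quick look at the first lines can confirm that the files contain what you expect (optional):
head all-issues-nodelist-max300.csv all-issues-edgelist-max300.csv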
Example with MODULE_FOO: expect a runtime of a few hours on a codebase of about one million lines.
This needs ripgrep in addition to the other dependencies.
./retrieve_commits_and_issues.py --with-files-and-sizes --output issues-and-files.log ./
./correlate_files_per_issue.py --file-connections issues-and-files.log --debug --output-edgelist all-issues-edgelist-max300.csv --output-nodelist all-issues-nodelist-max300.csv
# Select the nodes that belong to MODULE_FOO:
grep MODULE_FOO all-issues-nodelist-max300.csv > all-issues-nodelist-max300-foo.csv
# Extract their node IDs (first column):
cut -d " " -f 1 all-issues-nodelist-max300-foo.csv > foo-nodeids-raw.txt
# Keep only the edges that mention one of these node IDs as a whole word:
time grep -wf foo-nodeids-raw.txt all-issues-edgelist-max300.csv | tee all-issues-edgelist-max300-with-foo.csv
# Turn the IDs into patterns anchored at the line start (source column):
sed "s/^/^/" foo-nodeids-raw.txt > foo-nodeids-first.txt
# Turn the IDs into patterns surrounded by spaces (target column):
sed "s/^/ /" foo-nodeids-raw.txt | sed "s/$/ /" > foo-nodeids-second.txt
# Keep edges whose target is a foo node:
time rg -f foo-nodeids-second.txt all-issues-edgelist-max300-with-foo.csv | tee all-issues-edgelist-max300-to-foo.csv
# Of those, keep edges whose source is also a foo node:
time rg -f foo-nodeids-first.txt all-issues-edgelist-max300-to-foo.csv | tee all-issues-edgelist-max300-from-foo.csv
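As an optional plausibility check, each filtering step should shrink the edge list:
wc -l all-issues-edgelist-max300.csv all-issues-edgelist-max300-with-foo.csv all-issues-edgelist-max300-to-foo.csv all-issues-edgelist-max300-from-foo.csv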
Now import all-issues-nodelist-max300-foo.csv and all-issues-edgelist-max300-from-foo.csv into Gephi.
To analyze only a subset, just edit the logfile from retrieve_commits_and_issues.py and select the lines you want. It is ordered by time, newest issues first.
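For example, to restrict a run to the newest entries (the cut-off of 1000 lines is arbitrary, and head may truncate the oldest selected issue):
head -n 1000 issues-and-files.log > issues-and-files-recent.log
./correlate_files_per_issue.py issues-and-files-recent.log --count-files-per-issue | sort > files-recent.dat
./plot.py files-recent.dat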