Tool for analyzing code lineage
At a previous job, an audit of software intellectual property rights sparked an effort to determine the lineage of each file in our repositories. Unfortunately, the codebase had lived in many homes over the years, including:
- Concurrent Versions System (CVS)
- Revision Control System (RCS)
- Subversion (SVN)
- Microsoft Visual SourceSafe (VSS)
- Simple file/folder organizations
Since each version control system has its own peculiarities, a great deal of effort was expended scraping and normalizing history information from each. While tools for individual repositories (typically converters) do exist, many are poorly maintained or bug-ridden. This project aims to provide a uniform interface for reading from (and eventually writing to) such version control systems.
Goals:
- Support reading from and writing to Mercurial (Hg), Git, and SVN
- Support scraping file history into a common format
- Support basic commit analysis to simplify commit actions
- Detect moves/copies where the repository does not record them directly
- Detect compound changes, such as a move/copy followed by a modification
- Support proprietary repositories such as VSS and Perforce
- Support "legacy" repository types such as CVS and RCS
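To make the move-detection goals above more concrete, here is a minimal Python sketch of one way such analysis could work. It is not this project's actual API: the `FileChange` record, the `detect_moves` function, and the 0.6 similarity threshold are all hypothetical names and values chosen for illustration. The sketch pairs deleted and added files within one commit, first by identical content hash (a plain move) and then by line-level similarity (a move followed by a modification). Copy detection is left out because it would also need the set of files that survive the commit.

```python
import difflib
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FileChange:
    """One file-level event in a commit (hypothetical common format)."""
    action: str     # "add" or "delete"; other actions are ignored below
    path: str
    content: bytes  # bytes that were added, or the last bytes before deletion


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _similarity(a: bytes, b: bytes) -> float:
    # Compare line by line; autojunk is disabled so long files are not skewed.
    matcher = difflib.SequenceMatcher(
        None, a.splitlines(), b.splitlines(), autojunk=False
    )
    return matcher.ratio()


def detect_moves(
    changes: List[FileChange], threshold: float = 0.6
) -> List[Tuple[str, str, str]]:
    """Return (kind, old_path, new_path) tuples for one commit's changes."""
    deletes = [c for c in changes if c.action == "delete"]
    adds = [c for c in changes if c.action == "add"]
    results: List[Tuple[str, str, str]] = []

    # Pass 1: a delete and an add with identical content form a plain move.
    by_digest: Dict[str, List[FileChange]] = {}
    for d in deletes:
        by_digest.setdefault(_digest(d.content), []).append(d)
    unmatched_adds: List[FileChange] = []
    for a in adds:
        candidates = by_digest.get(_digest(a.content))
        if candidates:
            results.append(("move", candidates.pop(0).path, a.path))
        else:
            unmatched_adds.append(a)

    # Pass 2: pair each leftover add with its most similar leftover delete.
    remaining = [d for group in by_digest.values() for d in group]
    for a in unmatched_adds:
        best = max(
            remaining,
            key=lambda d: _similarity(d.content, a.content),
            default=None,
        )
        if best is not None and _similarity(best.content, a.content) >= threshold:
            remaining.remove(best)
            results.append(("move+modify", best.path, a.path))
    return results
```

A system that records only adds and deletes would feed each commit's changes through `detect_moves` after scraping. The threshold is a tuning knob: lower values catch heavier edits but risk pairing unrelated files.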