rvsandeep/git-monitor

GitHub as a Growth Monitor

As part of my Insight project, I built a data pipeline that monitors the GitHub code ecosystem and detects projects that depend on vulnerable packages. DevOps or SRE teams could use it to find out when packages they depend on have security vulnerabilities.

Table of Contents

  1. Introduction
  2. Approach
  3. Project Structure
  4. Environment
  5. Run Instructions

Introduction

Git-Monitor is a platform that lets organizations use GitHub data to track internal package versions along with their dependencies. Recent events such as the Equifax data breach, in which attackers exploited a vulnerable library used by Equifax to gain access to critical financial information of millions of people, underscore how important it is for organizations to monitor the third-party libraries their internal systems depend on.

Google Slides
Project Link

Approach

Project Structure

The directory structure of the repo is as follows:

      ├── README.md
      ├── execute.sh
      ├── Makefile
      ├── requirements.txt
      ├── src
      │   ├── main.py
      │   ├── credentials.py
      │   └── jobs
      │       ├── create_project_nodes.py
      │       ├── create_version_nodes.py
      │       ├── create_dependencies.py
      │       └── database_operations.py
      ├── models
      │   ├── project.py
      │   ├── language.py
      │   ├── license.py
      │   ├── platform.py
      │   ├── status.py
      │   └── version.py
      ├── tests
      ├── libs
      └── utils
          └── util.py

src/main.py is the main driver of the application.
src/credentials.py defines the Neo4j and AWS access credentials.
All the Spark jobs are placed in the src/jobs folder.
The /models folder hosts the different data models for Neo4j.
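
To make the idea behind the project, version, and dependency nodes concrete, here is a minimal in-memory sketch of how such a graph could answer "which projects are affected by a vulnerable package version?". It does not use Neo4j or Spark and is not the code in this repo. The `Version` and `DependencyGraph` names, the `affected_by` method, and the sample package names are all hypothetical and only illustrate the technique (a breadth-first walk over reverse dependency edges).

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """Hypothetical stand-in for a version node: one release of one project."""
    project: str
    number: str


class DependencyGraph:
    """Hypothetical in-memory stand-in for the dependency graph stored in Neo4j."""

    def __init__(self):
        # Maps a version to the set of versions that depend on it directly.
        self._dependents = defaultdict(set)

    def add_dependency(self, dependent, dependency):
        """Record that `dependent` depends directly on `dependency`."""
        self._dependents[dependency].add(dependent)

    def affected_by(self, vulnerable):
        """Return every version that depends, directly or transitively, on `vulnerable`."""
        seen = set()
        queue = deque([vulnerable])
        while queue:
            current = queue.popleft()
            # .get() avoids creating empty entries in the defaultdict.
            for dependent in self._dependents.get(current, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
        return seen


# Made-up sample data: service-b -> service-a -> lib-x
graph = DependencyGraph()
lib_x = Version("lib-x", "1.0.0")
service_a = Version("service-a", "2.1.0")
service_b = Version("service-b", "0.3.0")
graph.add_dependency(service_a, lib_x)
graph.add_dependency(service_b, service_a)

affected = graph.affected_by(lib_x)
print(sorted(v.project for v in affected))  # prints ['service-a', 'service-b']
```

In a graph database the same question would typically be expressed as a variable-length path query over the dependency relationships rather than an explicit queue.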

Environment

Instructions to run the code

Future Work

About

A Data Engineering project as part of my fellowship at Insight Data Science
