forked from nabinn/venus

VenUS: Venmo Usage Statistics; An automatic data processing pipeline for venmo transactions.


zhangshuo1996/venus

 
 


VenUS: Venmo Usage Statistics

An automatic data processing pipeline for venmo transactions.

Project Description

The aim of the project is to rank Venmo users daily by their activity, i.e. how often they transact and how much they spend.

The following four metrics are calculated every day:

  1. Number of times a user sends money
  2. Number of times a user receives money
  3. Number of times a pair of users take part in a transaction
  4. The amount a user spends per day (i.e. amount sent - amount received)
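
As a rough illustration of the four metrics, here is a minimal sketch in plain Python rather than Spark. The record layout `(sender, receiver, amount)`, the sample data, and the function name `daily_metrics` are assumptions made for this example and are not taken from the project code. It also assumes that a pair of users is counted regardless of who sends and who receives.

```python
from collections import Counter, defaultdict

# Hypothetical record layout: (sender, receiver, amount) for one day's transactions.
transactions = [
    ("alice", "bob", 20.0),
    ("bob", "alice", 5.0),
    ("alice", "carol", 7.5),
]

def daily_metrics(transactions):
    """Compute the four daily metrics from (sender, receiver, amount) tuples."""
    sent_count = Counter()          # metric 1: times a user sends money
    received_count = Counter()      # metric 2: times a user receives money
    pair_count = Counter()          # metric 3: times a pair of users transacts
    net_spent = defaultdict(float)  # metric 4: amount sent - amount received

    for sender, receiver, amount in transactions:
        sent_count[sender] += 1
        received_count[receiver] += 1
        # Assumption: (alice, bob) and (bob, alice) are the same pair.
        pair_count[tuple(sorted((sender, receiver)))] += 1
        net_spent[sender] += amount
        net_spent[receiver] -= amount

    return sent_count, received_count, pair_count, dict(net_spent)

sent, received, pairs, spent = daily_metrics(transactions)
```

With this sample data, `spent["alice"]` is 22.5 (27.5 sent minus 5.0 received) and `spent["bob"]` is -15.0, so a negative value means the user received more than they sent that day.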

Data Pipeline

The data resides in an S3 bucket. Spark reads the data and computes the aggregated results, which are then saved to a MySQL database. Users can query the database through a frontend implemented with Flask.
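
The save-then-query step might look roughly like the sketch below. To keep it self-contained it uses an in-memory SQLite database as a stand-in for MySQL. The table name `daily_send_count`, its columns, the sample rows, and the helper `top_senders` are all hypothetical and do not come from the project schema.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_send_count (day TEXT, user_id TEXT, times_sent INTEGER)"
)

# Hypothetical aggregated rows, shaped as a Spark job might produce them.
rows = [
    ("2018-01-01", "alice", 2),
    ("2018-01-01", "bob", 1),
    ("2018-01-01", "carol", 4),
]
conn.executemany("INSERT INTO daily_send_count VALUES (?, ?, ?)", rows)
conn.commit()

def top_senders(conn, day, limit=10):
    """Return (user_id, times_sent) for a given day, most active first."""
    cursor = conn.execute(
        "SELECT user_id, times_sent FROM daily_send_count "
        "WHERE day = ? ORDER BY times_sent DESC LIMIT ?",
        (day, limit),
    )
    return cursor.fetchall()

ranking = top_senders(conn, "2018-01-01", limit=2)
```

A frontend route could call a helper like `top_senders` and render the returned rows. The query uses bound parameters rather than string formatting so that user-supplied dates cannot alter the SQL.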

The entire process is automated with Airflow: whenever new data arrives in the bucket, Spark jobs are triggered and the results are saved to the database. The application also supports monitoring the workflow and notifies the user in case of failure.
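
The trigger-and-notify behaviour can be sketched in plain Python. This is not Airflow code and does not use the Airflow API. The function `process_new_files`, the callbacks `run_job` and `notify`, and the file names are hypothetical and only illustrate the control flow described above.

```python
def process_new_files(available_keys, processed_keys, run_job, notify):
    """Run the job once per unseen key; report failures instead of stopping."""
    failed = []
    for key in sorted(set(available_keys) - set(processed_keys)):
        try:
            run_job(key)
            processed_keys.add(key)  # mark as done only after success
        except Exception as exc:
            notify(f"job failed for {key}: {exc}")
            failed.append(key)
    return failed

def fake_job(key):
    # Stand-in for submitting a Spark job; fails on a deliberately bad file.
    if key.endswith("bad.json"):
        raise ValueError("malformed input")

alerts = []                      # collected notifications
processed = {"2018-01-01.json"}  # keys handled in an earlier run
failed = process_new_files(
    ["2018-01-01.json", "2018-01-02.json", "2018-01-03-bad.json"],
    processed,
    fake_job,
    alerts.append,
)
```

Because a key is recorded as processed only after its job succeeds, a failed file is picked up again on the next run.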

Links
