This project helps people decide which means of transportation to use to reach their destination.
Purpose: Help a traveler save time and money when traveling to a destination.
Common use cases:
- During peak hours, everyone wants to reach their destination on time, especially corporate employees and students. This project suggests a particular means of transportation (bike, train, or cab) for the trip.
- This project also helps them choose the most efficient means of transportation, with the lowest price and the shortest travel time.
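
To make the price-versus-time trade-off concrete, here is a minimal sketch of how a suggestion could be picked. It is an illustration, not the project's actual logic: the `Option` class, the `suggest` function, the `value_of_minute_usd` weight, and all fares and durations below are hypothetical and are not taken from the Citi Bike or cab datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    mode: str        # e.g. "bike", "train", "cab"
    cost_usd: float  # estimated fare for the trip (made-up value)
    minutes: float   # estimated travel time (made-up value)


def suggest(options, value_of_minute_usd=0.5):
    """Return the option with the lowest combined cost.

    Combined cost = fare + travel time * value_of_minute_usd.
    The weight is an assumption: it states how many dollars one
    minute of travel time is worth to the traveler.
    """
    if not options:
        raise ValueError("no options to compare")
    return min(
        options,
        key=lambda o: o.cost_usd + o.minutes * value_of_minute_usd,
    )


# Hypothetical estimates for a single trip.
options = [
    Option("bike", 4.0, 30.0),
    Option("train", 2.75, 40.0),
    Option("cab", 18.0, 20.0),
]
best = suggest(options)
print(best.mode)
```

Setting `value_of_minute_usd` to 0 ranks the options by price alone, while a larger weight favors the faster option. A single weighted score is only one way to combine the two criteria.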
I am using the dataset published by Citi Bike, a privately owned public bicycle-sharing system serving New York City, which can be accessed at the following link:
https://www.citibikenyc.com/system-data
For the cab data, I am using the trip record dataset at the following link:
http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
Since the data arrives as a real-time stream, I am using the following technologies:
- Spark Streaming
- Kafka
- PostgreSQL
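
As a rough picture of how these pieces could fit together, the sketch below mimics the flow in plain Python: trip events arrive as JSON messages (as they might from a Kafka topic), one micro-batch is aggregated (the kind of step a Spark Streaming job would run), and the result is a list of rows that could be written to a PostgreSQL table. It does not call any Spark, Kafka, or PostgreSQL API. The `summarize_batch` function and the `mode`, `minutes`, and `fare_usd` fields are hypothetical and do not reflect the real dataset schemas.

```python
import json
from collections import defaultdict


def summarize_batch(messages):
    """Aggregate one micro-batch of JSON trip events into per-mode rows.

    Each returned row is (mode, trip_count, avg_minutes, avg_fare_usd).
    """
    totals = defaultdict(lambda: [0, 0.0, 0.0])  # mode -> [count, minutes, fare]
    for raw in messages:
        event = json.loads(raw)
        entry = totals[event["mode"]]
        entry[0] += 1
        entry[1] += event["minutes"]
        entry[2] += event["fare_usd"]
    return [
        (mode, count, minutes / count, fare / count)
        for mode, (count, minutes, fare) in sorted(totals.items())
    ]


# Made-up events standing in for one micro-batch of stream messages.
batch = [
    json.dumps({"mode": "bike", "minutes": 20.0, "fare_usd": 4.0}),
    json.dumps({"mode": "cab", "minutes": 15.0, "fare_usd": 16.0}),
    json.dumps({"mode": "bike", "minutes": 30.0, "fare_usd": 4.0}),
]
rows = summarize_batch(batch)
print(rows)
```

In a real deployment, the list of messages would be replaced by a stream consumer and the returned rows by database inserts. Per-mode averages are only one example of a summary that might feed a suggestion.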
The following figure shows an overview of the project architecture: