Team Members:
- Yizhou Zhu - 1034676
- Shizhe Cai - 798125
- Haoyu Zhang - 976650
- Haowen Shen - 1070497
- Peng Cao - 798530
- React and Redux Frontend Design
- React for building UI components
- Redux for state management, fetching and storing data
- Data Visualisation
- Google Map API for showing data on the map
- AntD for UI design
- Recharts for drawing different charts
- Flask and Cloudant based server
- Designed following the MVC pattern
- Receive arguments from the Flask request
- Retrieve data and rearrange it into a frontend-friendly format
- Error handling and pattern matching of requests
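
The request handling described above could look roughly like the sketch below. It is an illustration only, not the project's actual code: the `/api/sentiment` route, the `area` argument, the validation pattern, the `fetch_rows` accessor and the `key`/`value` row shape are all assumptions made for the example.

```python
import re
from typing import Callable, Dict, List, Mapping

# Hypothetical rule: an area name contains letters, spaces and hyphens only.
AREA_PATTERN = re.compile(r"[A-Za-z][A-Za-z \-]*")


class BadRequest(ValueError):
    """Raised when a request argument is missing or malformed."""


def parse_area(args: Mapping[str, str]) -> str:
    """Validate the hypothetical `area` query argument."""
    area = (args.get("area") or "").strip()
    if not area:
        raise BadRequest("missing 'area' argument")
    if not AREA_PATTERN.fullmatch(area):
        raise BadRequest("malformed 'area' argument")
    return area


def to_frontend(rows: List[Dict]) -> List[Dict]:
    """Rearrange raw rows (assumed to carry `key` and `value`) for charting."""
    return [{"name": row["key"], "value": row["value"]} for row in rows]


def create_app(fetch_rows: Callable[[str], List[Dict]]):
    """Build the Flask app; `fetch_rows(area)` is a hypothetical data accessor."""
    # Imported here so the helpers above stay usable without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/api/sentiment")
    def sentiment():
        try:
            area = parse_area(request.args)
        except BadRequest as err:
            # Malformed requests get a 400 with a short explanation.
            return jsonify({"error": str(err)}), 400
        return jsonify(to_frontend(fetch_rows(area)))

    return app
```

Keeping validation and reshaping in plain functions means they can be tested without starting a server.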
- Gunicorn and Nginx
- Gunicorn multi-core processing
- Nginx load balancing and file cache
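
As an illustration of the multi-core setup, a Gunicorn configuration file (which is plain Python) might look like the sketch below. The bind address and the worker formula are assumptions, not the project's actual settings; `2 * cores + 1` is the starting point suggested in the Gunicorn documentation.

```python
# gunicorn.conf.py (hypothetical values)
import multiprocessing

# Listen on localhost only; Nginx is assumed to proxy requests to this port.
bind = "127.0.0.1:8000"

# One worker process per core, doubled, plus one.
workers = multiprocessing.cpu_count() * 2 + 1
```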
- Searching API
- GetOldTweets3, modified to handle rate limits
- Generate a matching shell script for each local government area in Victoria
- Use shell scripts to automate the spider
- Provide data to the Natural Language Processing section
- Retrieve the processed data and upload it to CouchDB
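
Generating one shell script per local government area could be sketched as follows. The `harvest.py` command and its `--near`, `--since` and `--until` flags are placeholders invented for this example; they stand in for whatever the modified harvester actually accepts.

```python
import shlex


def script_name(area: str) -> str:
    """File name for an area's script, e.g. 'greater_geelong.sh'."""
    return area.lower().replace(" ", "_") + ".sh"


def build_script(area: str, since: str, until: str) -> str:
    """Return the text of a shell script that harvests one area."""
    # Hypothetical command line; the real harvester's flags may differ.
    command = [
        "python3", "harvest.py",
        "--near", f"{area}, Victoria",
        "--since", since,
        "--until", until,
    ]
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return "#!/bin/sh\n" + shlex.join(command) + "\n"
```

Quoting through `shlex` keeps area names with spaces, such as "Greater Geelong", as a single shell argument.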
- Streaming API
- Stream all real-time tweets into CouchDB
- Keep a stream log
- Covid-19
- Customized scraper for the two websites
https://covid-19-au.com/
https://www.dhhs.vic.gov.au/media-hub-coronavirus-disease-covid-19
- Updated and modified on a daily basis (no automation)
- Sentiment Analysis
- Remove non-text tweets.
- Use NLTK VADER for sentiment intensity analysis.
- Produces a compound sentiment score and a negative/positive/neutral label based on a ±0.05 boundary.
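
The labelling rule above can be sketched as follows. Treating the boundary as inclusive (a score of exactly ±0.05 counts as positive or negative) follows the usual VADER convention and is an assumption here. The function names are illustrative, and `score_tweet` requires NLTK with the `vader_lexicon` resource downloaded.

```python
def label_from_compound(compound: float, boundary: float = 0.05) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= boundary:
        return "positive"
    if compound <= -boundary:
        return "negative"
    return "neutral"


def score_tweet(text: str) -> dict:
    """Score one tweet with NLTK's VADER analyzer.

    Needs `nltk` installed and `nltk.download("vader_lexicon")` run once.
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    compound = SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
    return {"compound": compound, "label": label_from_compound(compound)}
```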
- Topic Modelling
- Remove emails, newline characters and single quotes.
- Use the gensim module for further pre-processing.
- Use bigram and trigram models to capture multi-word phrases.
- Keep and lemmatize only nouns and adjectives.
- Latent Dirichlet Allocation (LDA) model for topic extraction.
- Word cloud to show the result with the top 10 most frequent words.
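
The first cleaning step and the final model fit could look roughly like this sketch. The regular expressions and the `num_topics` value are assumptions rather than the project's exact settings, the function names are illustrative, and `fit_lda` requires gensim. The bigram/trigram and lemmatization steps are left out for brevity.

```python
import re
from typing import List


def clean_text(text: str) -> str:
    """Remove emails, newline characters and single quotes from one tweet."""
    text = re.sub(r"\S*@\S*\s?", "", text)  # email-like tokens (this also drops @mentions)
    text = re.sub(r"\s+", " ", text)         # newlines and repeated whitespace
    text = re.sub(r"'", "", text)            # single quotes
    return text.strip()


def fit_lda(token_lists: List[List[str]], num_topics: int = 10):
    """Fit an LDA model on already tokenized documents (needs gensim)."""
    from gensim import corpora
    from gensim.models import LdaModel

    dictionary = corpora.Dictionary(token_lists)
    corpus = [dictionary.doc2bow(tokens) for tokens in token_lists]
    return LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics)
```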
- Set up Nectar
- Install environments
- Clone GitHub repository
- Deploy applications
Server 1: 172.26.130.162
- CouchDB Master Node
- front-end
- back-end
- server
- nginx
Server 2: 172.26.130.251
- CouchDB Worker Node
- Spark Cluster
Server 3: 172.26.132.37
- CouchDB Worker Node
- Data Harvester
Server 4: 172.26.132.136
- Test Ansible Script