Australian Census 2016 Twitter Sentiment Analysis, Topic Modelling and Geocoded Tweets

rjshanahan/Census_2016_Twitter_Insights

Australian Census 2016 Twitter Insights

Sentiment Analysis, Topic Modelling and Geocoded Tweets

What is the Census? According to the Australian Bureau of Statistics:
The Census of Population and Housing (Census) is Australia’s largest statistical collection undertaken by the Australian Bureau of Statistics (ABS). For more than 100 years, the Census has provided a snapshot of Australia, showing how our nation has changed over time, allowing us to plan for the future.

Analysis and the interactive webapp are bolted together as follows:

  • Twitter 'tweet' data is streamed from the Twitter Streaming API using Python and Tweepy
  • Streamed tweets are written to and stored in the NoSQL database MongoDB
  • Tweets are 'enriched' with Sentiment Analysis via the Aylien Python API
  • Tweets are geocoded from the user's location tag via the Google Maps API (where latitude/longitude info is not available from Twitter)
  • R is then used to produce the following:
    • Sentiment coloured map pins using maps from Leaflet for R and Mapbox
    • various Text Mining steps, largely resulting in a Topic Model visualised using Plotly
    • various Twitter insight visualisations using Plotly
  • all this is bolted together as an interactive visualisation using the amazeballs ShinyApps from RStudio
  • the MongoDB instance and the Python-hosting server are hosted on Amazon Web Services EC2
  • the Python Twitter Streaming API script is managed on the AWS EC2 instance by a cronjob
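
The enrichment and geocoding steps listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `resolve_location` and `enrich` are hypothetical names, and the `analyse` and `geocode` callables are stand-ins for the Aylien and Google Maps API calls, injected so the sketch runs without network access. It assumes tweets arrive as Twitter API v1.1 style dictionaries, where `coordinates` (when present) is a GeoJSON point in `[longitude, latitude]` order and `user.location` is free text.

```python
from typing import Callable, Optional, Tuple

LatLon = Tuple[float, float]


def resolve_location(tweet: dict,
                     geocode: Callable[[str], Optional[LatLon]]) -> Optional[LatLon]:
    """Return (lat, lon) for a tweet, or None if no location can be found.

    Prefers the coordinates supplied by Twitter; otherwise falls back to
    geocoding the user's free-text location tag.
    """
    point = tweet.get("coordinates")
    if point and point.get("type") == "Point":
        lon, lat = point["coordinates"]  # GeoJSON order is [longitude, latitude]
        return (lat, lon)

    location = (tweet.get("user") or {}).get("location")
    if location and location.strip():
        # Hypothetical geocoder: stands in for a Google Maps API lookup.
        return geocode(location.strip())
    return None


def enrich(tweet: dict,
           analyse: Callable[[str], dict],
           geocode: Callable[[str], Optional[LatLon]]) -> dict:
    """Build the document that would be stored for one tweet.

    `analyse` is a hypothetical stand-in for the sentiment call and is
    assumed to return a dict with 'polarity' and 'subjectivity' keys.
    """
    text = tweet.get("text", "")
    sentiment = analyse(text)
    latlon = resolve_location(tweet, geocode)
    return {
        "id_str": tweet.get("id_str"),
        "text": text,
        "polarity": sentiment.get("polarity"),
        "subjectivity": sentiment.get("subjectivity"),
        "lat": latlon[0] if latlon else None,
        "lon": latlon[1] if latlon else None,
    }


if __name__ == "__main__":
    # Stub services so the example runs offline.
    def fake_analyse(text):
        return {"polarity": "negative", "subjectivity": "subjective"}

    def fake_geocode(place):
        return (-34.93, 138.60) if place == "Adelaide" else None

    sample = {"id_str": "1", "text": "#censusfail", "coordinates": None,
              "user": {"location": "Adelaide"}}
    print(enrich(sample, fake_analyse, fake_geocode))
```

Passing the two services in as callables keeps the fallback logic testable on its own; in a real pipeline they would wrap the Aylien and Google Maps clients, and the returned dictionary would be inserted into MongoDB.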

#### The visualisation consists of three main tabs:

| ShinyApp Tab | Content |
| --- | --- |
| Geocoded Tweets | Map pins shaded by polarity or subjectivity |
| Topic Model | Topic Model showing topic development over time |
| Other Twitter-y Stuff | Various visualisations, including a wordcloud, top words and top users |

#### Definitions for Interactive Census Tweet Explorer text analytics components:

| Attribute | Description | Visualisation Use |
| --- | --- | --- |
| polarity | Natural language processing was used to determine the overall polarity of the tweet: was it positive, negative or neutral? Polarity can be considered an indicator of the emotional state being expressed in the tweet, such as angry, happy or indifferent. | colouring |
| subjectivity | Natural language processing was used to determine the overall subjectivity of the tweet: was it subjective or objective? This can be difficult, as a tweet may contain both subjective and objective terms. Subjectivity can be considered a measure of whether a statement is fact or opinion. | colouring |

Note: sentiment and subjectivity analysis was undertaken using the Python API from Aylien
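
To illustrate how these two attributes can drive pin colouring, here is a small sketch. The app itself does this in R with Leaflet; this Python version only shows the idea. The function name `pin_colour` and the specific colours are hypothetical choices, and the label sets are assumed to be the ones described in the table (positive/negative/neutral and subjective/objective).

```python
# Hypothetical palettes: the colours actually used by the app are not stated.
POLARITY_COLOURS = {"positive": "green", "neutral": "grey", "negative": "red"}
SUBJECTIVITY_COLOURS = {"subjective": "orange", "objective": "blue"}

PALETTES = {
    "polarity": POLARITY_COLOURS,
    "subjectivity": SUBJECTIVITY_COLOURS,
}


def pin_colour(doc: dict, attribute: str = "polarity",
               default: str = "lightgrey") -> str:
    """Pick a map pin colour for an enriched tweet document.

    `attribute` selects which text analytics component shades the pin.
    Tweets with a missing or unrecognised label get the `default` colour.
    """
    if attribute not in PALETTES:
        raise ValueError(f"unknown attribute: {attribute!r}")
    label = doc.get(attribute)
    return PALETTES[attribute].get(label, default)


if __name__ == "__main__":
    tweet_doc = {"polarity": "negative", "subjectivity": "subjective"}
    print(pin_colour(tweet_doc))                    # shade by polarity
    print(pin_colour(tweet_doc, "subjectivity"))    # shade by subjectivity
```

Keeping one palette per attribute means switching the map between polarity and subjectivity shading is just a change of the `attribute` argument.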
