rhadoop-examples

A repository of MapReduce applications built with RHadoop.

Wordcount

The classic MapReduce example, which needs no introduction: it counts the occurrences of each word in a text.
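As a rough illustration of the map and reduce steps behind a word count, here is a minimal sketch in plain Python. It is not the repository's code, which is written in R on top of RHadoop; the function names `map_words`, `reduce_counts`, and `wordcount` are hypothetical, and the tokenization rule is an assumption.

```python
import re
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    # Assumption: words are runs of lowercase letters and apostrophes.
    return [(word, 1) for word in re.findall(r"[a-z']+", line.lower())]


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def wordcount(lines):
    """Run the map step over every line, then reduce all emitted pairs."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return reduce_counts(pairs)
```

In a real Hadoop job the framework groups the emitted pairs by key between the two steps; here the grouping is folded into `reduce_counts` to keep the sketch short.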

wordcloud

Sentiment Analysis

Receives tweet messages as input and applies a sentiment analysis function to each message. The function was written by Jeffrey Breen and uses opinion lexicons based on a book by Bing Liu, among other materials.

Also take a look at the packages sentiment (archived in 2012) and qdap (function polarity).
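Lexicon-based scoring of this kind can be sketched as follows, assuming the score of a message is the number of positive-lexicon words minus the number of negative-lexicon words. This is a Python illustration and not the R function used in the repository; `sentiment_score` is a hypothetical name, and the two tiny word sets are placeholders for the real opinion lexicons.

```python
import re

# Placeholder lexicons (assumptions); real opinion lexicons hold thousands of words.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def sentiment_score(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive matches) - (negative matches) for one message."""
    # Lowercase the text and keep only alphabetic tokens.
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(word in positive for word in words)
    neg = sum(word in negative for word in words)
    return pos - neg
```

A score above zero suggests a positive message, below zero a negative one, and zero either a neutral message or one whose words are missing from the lexicons.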

To generate the dataset, configure the twitter-streaming.py script with your account information. To execute it, run the following command:

./twitter-streaming.py <keyword> <language> > <output_file>

The test sentiment-analysis-test.R requires tweet messages with geolocation information to plot the results on a map. To filter the dataset, use the command:

./tweet-filter.py <input_file> coordinates
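The idea of filtering on a field such as coordinates can be sketched as below. This is not the content of tweet-filter.py; it assumes the input holds one JSON object per line and that a tweet without geolocation has the field missing or set to null. The name `filter_tweets` is hypothetical.

```python
import json


def filter_tweets(lines, field):
    """Yield the lines whose JSON object has a non-null value for `field`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records
        if isinstance(tweet, dict) and tweet.get(field) is not None:
            yield line
```

Passing an open file object as `lines` and `"coordinates"` as `field` would keep only the geolocated tweets under these assumptions.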

sentiment-analysis-map

K-mers

Creates a histogram of the k-mers of a sequence in the FASTQ format. The test dataset is a sample from the E. coli bacterium; the full dataset can be downloaded from here.
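Counting k-mers can be sketched in a few lines of Python. This is an illustration and not the repository's R implementation; it assumes the simple four-line FASTQ layout (header, sequence, separator, quality) with unwrapped sequences, and the names `read_sequences` and `count_kmers` are hypothetical.

```python
from collections import Counter


def read_sequences(lines):
    """Yield the sequence line (second of every four lines) of each record."""
    for index, line in enumerate(lines):
        if index % 4 == 1:
            yield line.strip()


def count_kmers(lines, k):
    """Count every substring of length k across all sequences."""
    counts = Counter()
    for sequence in read_sequences(lines):
        # A sequence of length n contains n - k + 1 overlapping k-mers.
        for start in range(len(sequence) - k + 1):
            counts[sequence[start:start + k]] += 1
    return counts
```

The resulting counts are the raw material for a histogram, whether it is plotted per k-mer or as the number of distinct k-mers seen a given number of times.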

k-mers-histogram

Other applications

  • Inverted Index
  • Inverted Citations

License

MIT
