Skip to content

pfattahi/AIRLINE_RANKINGS

Repository files navigation

#Dataset files can ben found at http://www.sfu.ca/~alopes/732_project/dataset.zip #Report and results at http://www.sfu.ca/~vpalenge/index.html

#Instructions to run Daemon cd tweet_stream
nohup python3 tweet_stream2.py >out.txt &
The command is running and saving the twitters inside the folder 'airlines_tweets'!

#Copy the files to be processed hdfs dfs -copyFromLocal airlines_tweets
back to the main directory: cd ..
copying the training file: hdfs dfs -copyFromLocal airline.csv

#Executing spark-submit --master=yarn-client --executor-memory 6G --num-executors 15 --packages TargetHolding/pyspark-cassandra:0.3.5,com.databricks:spark-csv_2.11:1.5.0 --py-files svm.zip process_twitter.py airlines_tweets airline.csv vap

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages