Just an attempt to learn Kafka and its integration with Spark.
Python libraries required:
- kafka-python (version 0.9.4-dev was used for this project)
- requests
- uritools
- pyspark (comes bundled with Spark)
- simplejson (optional)
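
The pip-installable ones can typically be pulled in with a single command (the -dev build of kafka-python may need to be installed from its source repository rather than PyPI):

    pip install kafka-python requests uritools simplejson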
`<SPARK-HOME>` refers to the directory where you extracted Spark.
- Download the Spark Streaming Kafka assembly JAR: http://search.maven.org/remotecontent?filepath=org/apache/spark/spark-streaming-kafka-assembly_2.10/1.4.0/spark-streaming-kafka-assembly_2.10-1.4.0.jar
- Copy it to `<SPARK-HOME>/lib`
- Edit `<SPARK-HOME>/conf/spark-defaults.conf` (if it doesn't exist, create the file or rename `spark-defaults.conf.template` to `spark-defaults.conf` in the same directory) and add the following line to the end of the file:

      spark.driver.extraClassPath <SPARK-HOME>/lib/spark-streaming-kafka-assembly_2.10-1.4.0.jar
- e.g.:

      spark.driver.extraClassPath /usr/spark/lib/spark-streaming-kafka-assembly_2.10-1.4.0.jar
- Append the following line to your `~/.bashrc`:

      export SPARK_HOME=<SPARK-HOME>

- e.g.:

      export SPARK_HOME=/usr/spark
- Save and exit.
- OR, in your IDE, add this variable to your runtime environment variables.
- OR pass this path as an argument when you create the SparkContext (see the sketch below).
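
  A minimal sketch of that last option, reusing the example `/usr/spark` path from above (the app name is just a placeholder):

      from pyspark import SparkConf, SparkContext

      # Point the driver at the Kafka assembly JAR; the path below is the
      # example location used earlier, so adjust it to your install.
      conf = (SparkConf()
              .setAppName("kafka-spark-example")
              .set("spark.driver.extraClassPath",
                   "/usr/spark/lib/spark-streaming-kafka-assembly_2.10-1.4.0.jar"))
      sc = SparkContext(conf=conf)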
- Add the `pyspark` module (`<SPARK-HOME>/python`) to your `PYTHONPATH`. There are many ways to do it; I personally like this one: http://stackoverflow.com/a/12311321 (see the sketch below).
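
  One common way, assuming the `~/.bashrc` approach from the link above and the example paths used earlier (the py4j zip name varies by Spark release; check `<SPARK-HOME>/python/lib` for the exact file):

      export SPARK_HOME=/usr/spark
      export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH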
You can just run your program straight from the command line; no need to reach for spark-submit for your development code:

    python my_kafka_spark_example.py
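
For reference, a minimal sketch of what such a script might contain; the ZooKeeper address, consumer group, and topic are placeholder values, not names from this project:

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    # One SparkContext per application; process the stream in 10-second batches.
    sc = SparkContext(appName="kafka-spark-example")
    ssc = StreamingContext(sc, 10)

    # createStream connects through ZooKeeper ("localhost:2181" here) with a
    # consumer group and a dict mapping each topic to a consumer-thread count.
    stream = KafkaUtils.createStream(ssc, "localhost:2181", "my-group", {"my-topic": 1})

    # Each stream element is a (key, message) pair; keep the message and print it.
    stream.map(lambda kv: kv[1]).pprint()

    ssc.start()
    ssc.awaitTermination()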
:)