This project provides a scalable analytics platform for building custom analytics on Google Analytics data combined with your own data sources. It queries and extracts Google Analytics data through a Google Analytics plugin for Logstash. Transformations and mappings can be defined in the logstash.conf file, and data from other sources can be incorporated there as well. All the data is assembled in an Elasticsearch database under custom-named indices. Kibana is used to create visualizations, and the indices can be mixed and matched to produce custom visualizations that would not be possible with Google Analytics alone.
This platform can be deployed on AWS using ECS. Automation for shipping the containers is included.
Possible usage scenarios:
- Tying DevOps data with Google Analytics
- Tying social media campaigns on Twitter to website traffic from Google Analytics
- Creating custom visualizations in Kibana
The stack is based on the official Elasticsearch, Logstash, and Kibana Docker images. To set it up:
- Install Docker.
- Install Docker Compose.
- Clone this repository.
Start the ELK stack using docker-compose:
$ docker-compose up --build
You can also run it in the background (detached mode):
$ docker-compose up --build -d
Then access the Kibana UI at http://localhost:5601 in a web browser.
See: https://www.elastic.co/guide/en/kibana/current/setup.html#connect
You can also access Sense.
NOTE: To use Sense, you'll need to query the IP address associated with your network device instead of localhost.
By default, the stack exposes the following ports:
- 9200: Elasticsearch HTTP
- 9300: Elasticsearch TCP transport
- 5601: Kibana
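A quick way to confirm that the stack is listening on these ports is a small TCP probe. The sketch below is an illustration only: the `port_is_open` and `check_stack` helpers, the `STACK_PORTS` mapping, and the use of `localhost` as the host are assumptions and not part of this repository.

```python
import socket

# Ports taken from the list above; the host name is an assumption
# (a Docker setup where the ports are published on the local machine).
STACK_PORTS = {
    9200: "Elasticsearch HTTP",
    9300: "Elasticsearch TCP transport",
    5601: "Kibana",
}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_stack(host="localhost", ports=STACK_PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for port, name in ports.items()}


if __name__ == "__main__":
    for name, is_up in check_stack().items():
        print(f"{name}: {'open' if is_up else 'closed'}")
```

An open port only shows that something is accepting connections. It does not prove that Elasticsearch or Kibana has finished starting, so a closed port right after `docker-compose up` may just mean the service is still booting.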
NOTE: Configuration is not dynamically reloaded; you will need to restart the stack after any change to the configuration of a component.
The Kibana default configuration is stored in kibana/config/kibana.yml.
All dashboards and visualizations in Kibana can be exported as JSON objects. A few visualizations and dashboards are provided in the objects folder under Kibana. They can be imported into Kibana from the Settings -> Objects -> Import page. These dashboards work with the indices created and the data fetched by the default Logstash configuration provided.
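Before importing, it can help to see what an exported file contains. The sketch below counts saved objects per type. It assumes each export is a JSON array whose entries carry a `_type` field (such as `dashboard` or `visualization`), which is how Kibana 4 exports are typically laid out; check this against the actual files. The `kibana/objects` path and both helper names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_export(text):
    """Count saved objects per type in one exported JSON document."""
    objects = json.loads(text)
    # Assumption: the export is a JSON array of objects with a "_type" field.
    return Counter(obj.get("_type", "unknown") for obj in objects)


def summarize_folder(folder):
    """Return {file name: Counter of object types} for every *.json file."""
    return {
        path.name: summarize_export(path.read_text(encoding="utf-8"))
        for path in sorted(Path(folder).glob("*.json"))
    }


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the objects folder lives.
    for name, counts in summarize_folder("kibana/objects").items():
        print(name, dict(counts))
```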
The Logstash configuration is stored in logstash/config/logstash.conf.
logstash.conf is copied into the Docker container while the container is being built, so for changes to take effect, be sure to use the --build flag with docker-compose up.
The Logstash container uses the LS_HEAP_SIZE environment variable to determine how much memory is allocated to the JVM heap (defaults to 500m).
If you want to override the default, add the LS_HEAP_SIZE environment variable to the container in docker-compose.yml:
logstash:
  build: logstash/
  command: logstash -f /etc/logstash/conf.d/logstash.conf
  volumes:
    - ./logstash/config:/etc/logstash/conf.d
  ports:
    - "5000:5000"
  links:
    - elasticsearch
  environment:
    - LS_HEAP_SIZE=2048m
To add plugins to Logstash, you have to:
- Add a RUN statement to the logstash/Dockerfile (for example, RUN logstash-plugin install logstash-filter-json).
- Add the associated plugin configuration to the logstash/config/logstash.conf file.
The Elasticsearch container uses the configuration shipped with the image, which is not exposed by default.
If you want to override the default configuration, create a file elasticsearch/config/elasticsearch.yml and add your configuration to it.
Then map your configuration file into the container in docker-compose.yml. Update the elasticsearch container declaration to:
elasticsearch:
  build: elasticsearch/
  command: elasticsearch -Des.network.host=_non_loopback_
  ports:
    - "9200:9200"
  volumes:
    - ./elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
You can also specify the options you want to override directly in the command field:
elasticsearch:
  build: elasticsearch/
  command: elasticsearch -Des.network.host=_non_loopback_ -Des.cluster.name=my-cluster
  ports:
    - "9200:9200"
The data stored in Elasticsearch persists across container restarts but not after the container is removed.
To keep Elasticsearch data even after removing the Elasticsearch container, mount a volume from your Docker host. Update the elasticsearch container declaration to:
elasticsearch:
  build: elasticsearch/
  command: elasticsearch -Des.network.host=_non_loopback_
  ports:
    - "9200:9200"
  volumes:
    - /path/to/storage:/usr/share/elasticsearch/data
This will store Elasticsearch data inside /path/to/storage.
The deploy folder contains a Python script (build-tag-push.py) that builds, tags, and pushes all the Docker images to Docker Hub (images: https://hub.docker.com/r/harsha149/). Finally, the script creates a docker-compose-p:tag.yml for deployment to the production environment.
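The actual build-tag-push.py is not reproduced here. As a rough illustration of a build, tag, and push sequence, the sketch below only assembles the Docker commands and prints them by default. The `example` namespace, the `1.0` tag, and the `build_tag_push_commands` and `run_all` helpers are hypothetical; the build directories follow the `build:` entries shown earlier.

```python
import subprocess


def build_tag_push_commands(services, namespace, tag):
    """Return the docker commands to build, tag, and push each service.

    `services` maps a service name to its build directory.
    """
    commands = []
    for name, build_dir in services.items():
        local_image = f"{name}:{tag}"
        remote_image = f"{namespace}/{name}:{tag}"
        commands.append(["docker", "build", "-t", local_image, build_dir])
        commands.append(["docker", "tag", local_image, remote_image])
        commands.append(["docker", "push", remote_image])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical namespace and tag, for illustration only.
    plan = build_tag_push_commands(
        {"elasticsearch": "elasticsearch/", "logstash": "logstash/"},
        namespace="example",
        tag="1.0",
    )
    run_all(plan)  # dry run: prints the commands without executing them
```

Keeping command construction separate from execution makes the plan easy to inspect before anything is pushed to a registry.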
An ECS cluster can be created on AWS with the following command (this uses ecs-cli: http://docs.aws.amazon.com/cli/latest/reference/ecs/):
ecs-cli up --keypair harsha --capability-iam --size 1 --instance-type t2.medium
The containers can then be deployed with the following command:
ecs-cli compose --file docker-compose-p.yml up