Skip to content

irvcaza/datalake4os

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

datalake4os

Development architecture to test data lake concepts in the context of the generation and publication of statistical and geographic data.

Vision

img

How to run

docker-compose up -d

After a short wait, a jupyter notebook can be accesed in any web browser at http://[docker-ip]:8888 If you ar running docker localy then it should be http://localhost:8888 And the password is 'password' Open the notebook in the directory 'scripts', follow the instructions and play arround with the examples.

You can also acces to web interfaces for

  • Minio at port :9000 . User minio password minio123
  • Trino at port :8080 . Just write aniting and it will let you pass

to stop

docker-compose down

Note: If you are runnig it on windows and you receive an error ': No such file or directory' on hive-metastore. You have to modify line endings to linux style on the files of services/hive/bin those files are

  • services/hive/bin/entrypoint.sh
  • services/hive/bin/hivemetastore
  • services/hive/bin/hiveserver
  • services/hive/conf/hive-env.sh

One paritial way to avoid this is using.

git config --global core.autocrlf false

Inspired

This work is inspired by the repository: https://github.com/zhenik-poc/big-data-stack-practice

About

Development architecture to test data lake concepts in the context of the generation and publication of statistical and geographic data.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 63.4%
  • TypeScript 32.0%
  • SCSS 2.8%
  • Jupyter Notebook 0.7%
  • HCL 0.4%
  • Dockerfile 0.2%
  • Other 0.5%