-
Context:
Classroom lectures given at KBC, a financial institution in Belgium, from 2019-05-13 to 2019-05-15.
-
Objectives:
- Introduce good data engineering practices.
- Illustrate modular and easily testable data transformation pipelines using PySpark.
-
Audience:
Employees of KBC involved in writing (porting?) transformation pipelines. General knowledge level: junior to medior.
Participants were asked (by KBC personnel) to go through two online Python courses before attending.
-
Approach:
The lecturer first lays the foundations for Python development and gradually builds up to PySpark data pipelines. A high degree of participation is expected from the students: they will need to write code themselves and reason about topics, so that they can better retain the knowledge.
Course notes will be made available after the day sessions. Materials needed for the exercises will be provided directly through GitHub.
kdebaerdemaeker/pyspark_training