This repo contains some of the code I've written for Kaggle's Machine Learning competitions during my most active period (2010-2016).
During that time, I gained a Kaggle Competitions Grandmaster rating and was once ranked as the #4 competitor (99.99th percentile).
Unfortunately, the ranking algorithm incorporates a time decay, so my current rating has dropped since then.
For more information about my work on Kaggle, see: https://www.kaggle.com/chefele
These competitions were very educational and a lot of fun.
I need to fill in more details here about particularly interesting competitions. One that immediately comes to mind is the ASAP Essay Grading competition, which used NLP to grade student essays. That system was able to exceed human performance, which was particularly satisfying.
One of the great things about Kaggle is that it teaches you to analyze a dataset, find an effective algorithm, and get a prototype running extremely quickly.
However, most of the code here would require further polishing to be production-ready. You will certainly find some clever bits in it, but also some messy and inefficient bits that I have not cleaned up.
In total, for these competitions, I've written approximately:
- 45K lines of Python
- 26K lines of R
- 4K lines of Java
- 3K lines of bash
In addition, I've posted code as Kaggle 'kernels' (https://www.kaggle.com/chefele/kernels), as well as some well-received comments and analyses in online discussions (https://www.kaggle.com/chefele/discussion).