-
Daily fantasy sports are derived from the season long contests in which contestants draft teams and keep that team for the length of the season.
-
In daily fantasy contests, contestants select teams on a daily basis. After the last game has ended, dfs results are posted and winning contestants are paid out.
-
When selecting a team, a contestant is constrained by the salary cap (an allotment of draft funds). Each contestant has the same salary cap.
-
50/50: If a contestant's score is in the top 50% of all entries, the contestant wins the cost of the entry price.
- 50/50 is a safe contest that professional couch potatoes use to build and maintain a bankroll.
-
Tournament: A single contest in which only the top tier wins. Tiers can vary across different types of contests.
- The tournament strategy differs from safer contests because tournaments can have hundreds to tens of thousands of contestants so you have to find a way to separate yourself from the pack.
-
Predict daily NBA player performance in the form of fantasy points for each player using rolling averages of various player/team statistics and information.
- Fantasy scores are a weighted value of a player's box score for a given game.
Stat | Scoring |
---|---|
Point | +1 Pt |
Made 3pt Shot | +0.5 Pts |
Rebound | +1.25 Pts |
Assist | +1.5 Pts |
Steal | +2 Pts |
Block | +2 Pts |
Turnover | -0.5 Pts |
Double Double | +1.5 Pts |
Triple Double | +3 Pts |
-
Using those predictions, construct the optimal lineup through an automated process by selecting players to fit within the positional and salary cap constraints.
-
Compare the lineup chosen by my model to the actual scores along with other lineups chosen by higher scoring contestants.
-
Adjust the features and the models to improve predictive ability.
- Combined 4 different datasets from the My Sports Feeds data API along with historic odds information from Kaggle to create a singular predictive training dataset consisting of player statistics, team statistics, odds and daily fantasy scores.
-
Determined the optimal window size to predict daily player performance, i.e., how many of the previous performances should be used in determining a players performance today?
-
The figure below shows the optimal rolling average window length at 3 days when considering player statistics alone.
-
After dropping low scores and applying a grid search to determine the best parameters, RMSE came in at 8.989 for the GB Model.
-
The predictions below were generated by training a gradient boosting regression model with 10-fold cross validation.
-
After generating predictions, each observation was fed through a knapsack algortihm to determine the optimal lineup.
Lineup | Position | Predicted | Actual | Diff |
---|---|---|---|---|
Jeff Teague | PG | 30.34 | 24.7 | -5.7 |
Russell Westbrook | PG | 51.14 | 55.8 | +4.7 |
Avery Bradley | SG | 19.6 | 24.4 | +4.8 |
Garrett Temple | SG | 17.06 | 15.4 | -1.7 |
Stanley Johnson | SF | 20.6 | 10.9 | -9.7 |
LeBron James | SF | 51.71 | 54.9 | +3.2 |
Bobby Portis | PF | 29.06 | 15.8 | -13.3 |
Frank Kaminsky | PF | 16.91* | 4.7* | -12.2 |
Karl-Anthony Towns | C | 42.93 | 60.4 | +17.5 |
TOTAL | 264.0 | 262.3 | -1.7 |
-
Unfortunately, the lineup shown above was not a winner but was fairly accurate in determining overall score.
-
The Gradient Boosted model does well at finding players that usually do well but fell short in finding high value players at lower costs. This can be remedied with more feature engineering to account for injured players. By identifying injured players, we can find more opportunity and value in players that will benefit from replacing those injured players.
-
Consistent analysis of predictions to determine the successes and failures of the results.
-
Incorporate probability distributions to select riskier and safer lineups for different types of contests.
-
Automate the process from start to finish using a job scheduling utility in order to receive lineup information on a daily basis for multiple sports.