7 REGULARIZATION

Linear regression minimizes a loss function and fits a coefficient for each feature. Large coefficients can lead to overfitting, so regularization penalizes large coefficients.

1st type of regularization is ridge regression:
    loss function = OLS loss function + alpha * sum of squared coefficients

from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# X, y: feature matrix and target (assumed already loaded)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# normalize=True puts all variables on the same scale
# (removed in scikit-learn >= 1.2; use StandardScaler in a Pipeline instead)
ridge = Ridge(alpha=0.1, normalize=True)
ridge.fit(X_train, y_train)
ridge_pred = ridge.predict(X_test)

2nd type is lasso regression:
    loss function = OLS loss function + alpha * sum of absolute values of coefficients

Lasso can be used to select the important features of a dataset: it shrinks the coefficients of the less important features to exactly zero.

Classification reports and confusion matrices are great methods to quantitatively evaluate model performance.

8 LOGISTIC REGRESSION FOR BINARY CLASSIFICATION

If the predicted probability is greater than 0.5, the data point is labeled 1.
If the predicted probability is less than 0.5, the data point is labeled 0.

ROC curve (Receiver Operating Characteristic curve) is obtained by varying the threshold probability p.
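The lasso feature-selection behavior described above can be sketched with a small synthetic example. The data, the alpha value, and the variable names here are illustrative assumptions, not from the notes; only scikit-learn's `Lasso` is assumed:

```python
# Sketch of lasso-based feature selection (illustrative data, assumed alpha).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
# Only features 0 and 2 actually influence y; features 1, 3, 4 are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1)
lasso.fit(X, y)
# Coefficients of the unimportant features are shrunk to (near) zero,
# while the truly informative features keep large coefficients.
print(lasso.coef_)
```

Ridge, by contrast, only shrinks coefficients toward zero without setting them exactly to zero, which is why lasso is the one used for feature selection.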
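The 0.5-threshold labeling and the ROC curve can be sketched as follows. The dataset here is synthetic (`make_classification`) and the split parameters are illustrative assumptions; only standard scikit-learn calls are used:

```python
# Sketch of logistic regression thresholding and the ROC curve
# (synthetic data for illustration).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

# Probability of the positive class; the default label comes from
# thresholding this probability at p = 0.5.
y_prob = logreg.predict_proba(X_test)[:, 1]
y_pred = (y_prob > 0.5).astype(int)

# The ROC curve is traced out by sweeping the threshold p from 1 down to 0
# and recording the false positive rate vs. the true positive rate.
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
print(roc_auc_score(y_test, y_prob))
```

Each point on the curve corresponds to one choice of threshold; the area under it (AUC) summarizes performance across all thresholds.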