
Python - Support Vector Machine (SVM) Cheat Sheet (DRAFT) by

SVM model in Python

This is a draft cheat sheet. It is a work in progress and is not finished yet.

TO START

# IMPORT DATA LIBRARIES
import pandas as pd
import numpy as np

# IMPORT VIS LIBRARIES
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# IMPORT MODELLING LIBRARIES
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report,confusion_matrix

TRAIN MODEL

SPLIT DATASET
X = df[['col1','col2', etc.]]
create df features
y = df['col']
create df var to predict
X_train, X_test, y_train, y_test =
train_test_split(
  X,
  y,
  test_size=0.3)
split df into train and test sets
FIT THE MODEL
svc = SVC()
instantiate model
svc.fit(X_train,y_train)
train/fit the model
MAKE PREDICTIONS
pred = svc.predict(X_test)
EVALUATE MODEL
print(confusion_matrix(y_test,pred))
print(classification_report(y_test,pred))
 

GRID SEARCH EXPLANATION

Finding the right parameters (like which C or gamma values to use) is a tricky task! But luckily, we can be a little lazy and just try a bunch of combinations and see what works best! This idea of creating a 'grid' of parameters and just trying out all the possible combinations is called a Gridsearch. This method is common enough that Scikit-learn has the functionality built in with GridSearchCV (the CV stands for cross-validation). GridSearchCV takes a dictionary that describes the parameters that should be tried and a model to train. The grid of parameters is defined as a dictionary, where the keys are the parameter names and the values are the settings to be tested.
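
For intuition, here is a minimal hand-rolled sketch (not part of the original sheet) of what GridSearchCV automates: loop over every C/gamma combination, score each with cross-validation, and keep the best. It assumes the X_train and y_train created in the TRAIN MODEL section.

# loop over candidate parameters and keep the combination
# with the best mean cross-validation accuracy
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
best_score, best_params = 0, None
for C in [0.1, 1, 10, 100]:
    for gamma in [1, 0.1, 0.01, 0.001]:
        # mean 5-fold cross-validation accuracy for this combination
        score = cross_val_score(SVC(C=C, gamma=gamma), X_train, y_train, cv=5).mean()
        if score > best_score:
            best_score, best_params = score, {'C': C, 'gamma': gamma}
print(best_params, best_score)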
====================================================
C is the parameter for the soft margin cost function, which controls the influence of each individual support vector; this process involves trading error penalty for stability. C sets the cost of misclassification of training examples against the simplicity of the decision surface. A large C gives low bias and high variance: low bias because you penalize the cost of misclassification a lot. A small C gives you higher bias and lower variance.
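
A quick sketch (not in the original sheet) to see the C trade-off in practice: refit SVC with a small, a default, and a large C and compare train vs. test accuracy, assuming the X_train/X_test split from above.

# small C -> simpler boundary (more bias); large C -> fits training data harder (more variance)
for C in [0.01, 1, 100]:
    model = SVC(C=C, kernel='rbf')
    model.fit(X_train, y_train)
    print(C, model.score(X_train, y_train), model.score(X_test, y_test))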

Gamma is the parameter of a Gaussian kernel (to handle non-linear classification). Gamma controls the shape of the "peaks" where you raise the points. In scikit-learn gamma is the inverse of the kernel width, so a large gamma gives a pointed bump in the higher dimensions and a small gamma gives a softer, broader bump. A large gamma will therefore give you low bias and high variance, while a small gamma will give you higher bias and lower variance. You usually find the best C and gamma hyperparameters using Grid Search.
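
A similar sketch (again assuming the same train/test split) for gamma: compare a small, medium, and large gamma and watch the gap between train and test accuracy.

# small gamma -> broad, smooth peaks (more bias); large gamma -> narrow peaks (more variance)
for gamma in [0.001, 0.1, 10]:
    model = SVC(kernel='rbf', gamma=gamma)
    model.fit(X_train, y_train)
    print(gamma, model.score(X_train, y_train), model.score(X_test, y_test))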

The kernel decides the shape of the decision surface (the hyperplane in the transformed feature space) you will use to divide the points.
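
A short sketch (assuming the split above) to try a few kernels and compare test accuracy.

# 'linear' keeps a flat hyperplane; 'rbf' and 'poly' bend the decision surface
for kernel in ['linear', 'rbf', 'poly']:
    model = SVC(kernel=kernel)
    model.fit(X_train, y_train)
    print(kernel, model.score(X_test, y_test))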
====================================================
Refit an estimator using the best-found parameters on the whole dataset.

Verbose controls the verbosity: the higher, the more messages.

SVM parameters

GRID SEARCH

from sklearn.model_selection import GridSearchCV
import GridSearchCV
param_grid = {
'C': [0.1,1, 10, 100, 1000],
'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
'kernel': ['rbf']}
parameters, see info
grid = GridSearchCV(
SVC(),
param_grid,
refit=True,
verbose=3)
parameters, see info
grid.fit(X_train,y_train)
train/fit the grid-search model
grid.best_params_
show the best parameter combination found
grid.best_estimator_
show the estimator refit with the best parameters
grid_predictions = grid.predict(X_test)
predict with the tuned model
print(confusion_matrix(y_test,grid_predictions))
print(classification_report(y_test,grid_predictions))