I have been studying SVMs and wrote a demo in MATLAB (because I could not get a quadratic programming package to work properly in Python). For now it is simple and handles only linearly separable cases (nonlinear kernels will be implemented later). This is the first MATLAB program I have written outside of a MATLAB seminar I attended, so any feedback would be welcome.

    % simple SVM demo
    clc;
    clear all;

    % input data; y holds the labels for the data
    y = [ -1; -1; -1; -1; -1; 1; 1; 1; 1; 1; 1 ];
    data = [ 0 3; 1 2; 2 1; 3 3; 5 1; 5 5; 6 5; 6 7; 7 3; 8 6; 8 1; ];

    % separate positive and negative data points
    posDps = data(find(y == 1), :);
    negDps = data(find(y == -1), :);

    % space is the number of dimensions (n-space)
    % at this time, b/c being drawn in 2D, can only be 2 spaces
    space = size(data, 2);

    % n is the number of input variables
    n = length(data);
    [ y1, y2 ] = meshgrid(y, y);
    [ i, j ] = meshgrid(1:n, 1:n);

    % generate matrices for minimization
    P(i, j) = y1(i, j) .* y2(i, j) .* (data(i, :) * data(j, :)');
    q = -1 * ones(n, 1);

    % generate matrices for the inequality constraint (alpha >= 0)
    % can also use the LB parameter
    A = -1 * eye(n);
    b = zeros(n, 1);

    % generate matrices for the equality constraint
    Aeq = y';
    beq = [ 0 ];

    % use the quadprog package to generate alphas
    alpha = quadprog(P, q, A, b, Aeq, beq, [], [], [], optimoptions('quadprog', 'Display', 'off'));

    % compute w from the alphas
    w = (data' .* repmat(y', [space 1])) * alpha;

    % find the b parameter
    % (y_n)(x_n * w + b) = 1 for a support vector, so b = y_n - w * x_n
    threshold = 1e-5;
    svIndices = find(alpha > threshold);
    b = y(svIndices(1)) - data(svIndices(1), :) * w;

    % display the points
    figure;
    hold on;
    scatter(posDps(:, 1), posDps(:, 2));
    scatter(negDps(:, 1), negDps(:, 2));

    % draw the separating line (2D only for now)
    margin = 1;
    domain = (min(data(:, 1)) - margin):(max(data(:, 1)) + margin);
    plot(domain, (w(1) .* domain + b) / (-w(2)));
    % plot the gutters
    plot(domain, (-1 + w(1) .* domain + b) / (-w(2)), 'g:');
    plot(domain, (1 + w(1) .* domain + b) / (-w(2)), 'g:');

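As a cross-check on the minimization matrix in the code above, here is a minimal sketch that builds it in a single vectorized expression and compares it against an explicit double loop. It assumes that P(i, j) is meant to be y_i * y_j * (x_i . x_j), which is one reading of the code rather than a confirmed specification. The names Pvec and Ploop are made up for this illustration, and nothing here needs quadprog.

```matlab
% Same toy data as in the demo above.
y = [ -1; -1; -1; -1; -1; 1; 1; 1; 1; 1; 1 ];
data = [ 0 3; 1 2; 2 1; 3 3; 5 1; 5 5; 6 5; 6 7; 7 3; 8 6; 8 1 ];
n = size(data, 1);   % number of data points (rows)

% Vectorized form: (y * y') holds y_i * y_j, (data * data') holds x_i . x_j.
Pvec = (y * y') .* (data * data');

% Explicit double loop, used only to check the vectorized form.
Ploop = zeros(n);
for r = 1:n
    for c = 1:n
        Ploop(r, c) = y(r) * y(c) * (data(r, :) * data(c, :)');
    end
end

% All entries are products of small integers, so an exact comparison is safe here.
disp(isequal(Pvec, Ploop));
```

If the two agree, the one-line form could replace the meshgrid-based construction, but whether that is the preferred MATLAB idiom is part of what is being asked.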
General comments are welcome, but here are some specific questions I had in mind:
- For storing one-dimensional data (arrays), is there usually a preference between column and row vectors?
- Is there a simple way to handle nested matrices? I tried combining the labels and feature vectors like this:

      data = [ -1 [ 0 3 ];
               -1 [ 1 2 ];
               % ...
             ]

  and then using more indices (for example, data(1,2,2)), but it did not work.
- There are a lot of different ways to create matrices, and I do not know whether the ones I use are the most appropriate (for example, via literals, repmat(), etc.).
- What should be done about floating-point imprecision? For example, I had to set an arbitrary threshold value to find the non-zero values, because using find(A > 0) did not work. (The values were around 1e-12, which is greater than the value of eps(), so that function did not seem very useful.)
- Naming conventions: is there a standard for MATLAB? I do not think I have seen a consistent style.
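On the nested-matrix question above, a small sketch of what the bracket syntax appears to do. Square brackets in MATLAB concatenate, so the inner brackets are flattened into an ordinary 2-D matrix, which would explain why a third index fails. The cell array at the end is shown only as one possible alternative, not a recommendation, and the names combined, labels, features, and nested are made up for this illustration.

```matlab
% Square brackets concatenate, so the inner brackets do not create nesting.
combined = [ -1 [ 0 3 ]; -1 [ 1 2 ]; 1 [ 5 5 ] ];
disp(size(combined));             % a plain 3-by-3 numeric matrix

% Split it back into labels and features by column.
labels   = combined(:, 1);
features = combined(:, 2:end);

% A cell array does nest: each row holds a label and a feature vector.
nested = { -1, [ 0 3 ]; -1, [ 1 2 ]; 1, [ 5 5 ] };
secondFeature = nested{1, 2}(2);  % second feature of the first point
disp(secondFeature);
```

Keeping the labels and the features in two separate variables, as the demo does, avoids the column bookkeeping altogether.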
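On the imprecision question above, a minimal sketch of one common alternative to a fixed cutoff: scale the tolerance by the largest value in the vector. The alpha values below are invented to mimic solver output, and relTol = 1e-6 is an assumed value, not something taken from the demo.

```matlab
% Hypothetical solver output: two tiny values that should count as zero.
alpha = [ 0.25; 1e-12; 0.5; 3e-13; 0.25 ];

% eps with no argument is the spacing at 1.0 (about 2.2e-16), so 1e-12 passes.
naive = find(alpha > eps);

% Relative tolerance: scale the cutoff by the largest multiplier.
relTol = 1e-6;                          % assumed value, tune per problem
svIndices = find(alpha > relTol * max(alpha));

disp(naive');       % indices 1 2 3 4 5
disp(svIndices');   % indices 1 3 5
```

The cutoff is still a choice, but it now follows the scale of the data instead of being an absolute number.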