All-at-once optimization for CP tensor decomposition
We explain how to use cp_opt with the POBLANO toolbox.
Contents
Poblano Optimization Toolbox
Check that you have Poblano 1.1 installed. The output of your 'ver' command should look something like the following.
ver
---------------------------------------------------------------------------------------------------- MATLAB Version: 8.4.0.150421 (R2014b) MATLAB License Number: 192525 Operating System: Microsoft Windows 7 Enterprise Version 6.1 (Build 7601: Service Pack 1) Java Version: Java 1.7.0_11-b21 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode ---------------------------------------------------------------------------------------------------- MATLAB Version 8.4 (R2014b) Optimization Toolbox Version 7.1 (R2014b) Poblano Toolbox (Sandia National Labs) Version 1.1 Statistics Toolbox Version 9.1 (R2014b) Tensor Toolbox (Sandia National Labs) Version 2.6
Create an example problem. Here we have 10% noise.
R = 5; info = create_problem('Size', [50 40 30], 'Num_Factors', R, 'Noise', 0.10); X = info.Data; M_true= info.Soln;
Create initial guess using 'nvecs'
M_init = create_guess('Data', X, 'Num_Factors', R, ... 'Factor_Generator', 'nvecs');
Set up the optimization parameters
It's genearlly a good idea to consider the parameters of the optimization method. The default options may be either too stringent or not stringent enough. The most important options to consider are detailed here.
% Get the defaults ncg_opts = ncg('defaults'); % Tighten the stop tolerance (norm of gradient). This is often too large. ncg_opts.StopTol = 1.0e-6; % Tighten relative change in function value tolearnce. This is often too large. ncg_opts.RelFuncTol = 1.0e-20; % Increase the number of iterations. ncg_opts.MaxIters = 10^4; % Only display every 10th iteration ncg_opts.DisplayIters = 10; % Display the final set of options ncg_opts
ncg_opts = Display: 'iter' DisplayIters: 10 LineSearch_ftol: 1.0000e-04 LineSearch_gtol: 0.0100 LineSearch_initialstep: 1 LineSearch_maxfev: 20 LineSearch_method: 'more-thuente' LineSearch_stpmax: 1.0000e+15 LineSearch_stpmin: 1.0000e-15 LineSearch_xtol: 1.0000e-15 MaxFuncEvals: 10000 MaxIters: 10000 RelFuncTol: 1.0000e-20 RestartIters: 20 RestartNW: 0 RestartNWTol: 0.1000 StopTol: 1.0000e-06 TraceFunc: 0 TraceFuncEvals: 0 TraceGrad: 0 TraceGradNorm: 0 TraceRelFunc: 0 TraceX: 0 Update: 'PR'
Call the cp_opt method
Here is an example call to the cp_opt method. By default, each iteration prints the least squares fit function value (being minimized) and the norm of the gradient. The meaning of any line search warnings can be checked via doc cvsrch.
[M,~,output] = cp_opt(X, R, 'init', M_init, ... 'alg', 'ncg', 'alg_options', ncg_opts);
Iter FuncEvals F(X) ||G(X)||/N ------ --------- ---------------- ---------------- 0 1 59446.20284454 0.76018888 10 80 1154.53391434 1.87155134 20 131 583.85024670 0.02127047 30 157 583.79035838 0.00013732 40 177 583.79035282 0.00000162 42 181 583.79035282 0.00000071
Check the output
It's important to check the output of the optimization method. In particular, it's worthwhile to check the exit flag. A zero (0) indicates successful termination with the gradient smaller than the specified StopTol, and a three (3) indicates a successful termination where the change in function value is less than RelFuncTol. The meaning of any other flags can be checked via doc poblano_params.
exitflag = output.ExitFlag
exitflag = 0
The fit is the percentage of the data that is explained by the model. Because we have noise, we do not expect the fit to be perfect.
fit = output.Fit
fit = 99.0183
Evaluate the output
We can "score" the similarity of the model computed by CP and compare that with the truth. The score function on ktensor's gives a score in [0,1] with 1 indicating a perfect match. Because we have noise, we do not expect the fit to be perfect. See doc score for more details.
scr = score(M,M_true)
scr = 0.9984
Overfitting example
Consider the case where we don't know R in advance. We might guess too high. Here we show a case where we guess R+1 factors rather than R.
% Generate initial guess of the corret size M_plus_init = create_guess('Data', X, 'Num_Factors', R+1, ... 'Factor_Generator', 'nvecs');
% Loosen the stop tolerance (norm of gradient).
ncg_opts.StopTol = 1.0e-2;
% Run the algorithm [M_plus,~,output] = cp_opt(X, R+1, 'init', M_plus_init, ... 'alg', 'ncg', 'alg_options', ncg_opts); exitflag = output.ExitFlag fit = output.Fit
Iter FuncEvals F(X) ||G(X)||/N ------ --------- ---------------- ---------------- 0 1 59446.61102123 0.63350080 10 80 1154.62477976 1.55962386 20 131 583.89115606 0.01846313 21 134 583.85651979 0.00932713 exitflag = 0 fit = 99.0182
% Check the answer (1 is perfect)
scr = score(M_plus, M_true)
scr = 0.9983