Decomposition in multilinear rank-\((L_r,L_r,1)\) terms¶
The decomposition in multilinear rank-\((L_r,L_r,1)\) terms writes a third-order tensor as a sum of \(R\) terms of low multilinear rank [25][26][27][28][6]. Each of these terms can be written as the outer product, denoted by \(\op\), of a rank-\(L_r\) matrix and a vector. For a third-order tensor \(\ten{T}\), the decomposition is given by

\[\ten{T} = \sum_{r=1}^{R} \left(\mat{A}_r\mat{B}_r^{\mathsf{T}}\right) \op \vec{c}_r. \qquad (1)\]
The matrices \(\mat{A}_r\) and \(\mat{B}_r\) have \(L_r\) columns and have full column rank. A visual representation of this decomposition is shown in Figure 36. The decomposition in (1) is a special case of the polyadic decomposition (PD, see Canonical polyadic decomposition) and the block term decomposition (BTD, see Block term decomposition). Indeed, (1) can be seen as a CPD by writing it as a sum of \(\sum_r L_r\) rank-1 terms

\[\ten{T} = \sum_{r=1}^{R} \sum_{l=1}^{L_r} \vec{a}_{rl} \op \vec{b}_{rl} \op \vec{c}_r,\]

in which \(\vec{a}_{rl}\) and \(\vec{b}_{rl}\) denote the \(l\)th columns of \(\mat{A}_r\) and \(\mat{B}_r\), respectively.
Alternatively, (1) can be formulated as a BTD by writing it as the sum of \(R\) block terms. The \(r\)th term is formed by the tensor-matrix products of a core tensor \(\ten{S}_r\) of size \(L_r\times L_r \times 1\) and the matrices \(\mat{A}_r\), \(\mat{B}_r\) and vector \(\vec{c}_r\):

\[\ten{T} = \sum_{r=1}^{R} \ten{S}_r \tmprod_1 \mat{A}_r \tmprod_2 \mat{B}_r \tmprod_3 \vec{c}_r,\]
in which \(\tmprod_{n}\) denotes the tensor-matrix product in mode \(n\).
Both formats are supported in Tensorlab, and all methods starting with ll1
allow both the CPD format and the BTD format.
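As a small illustration of formula (1), the following sketch (assuming Tensorlab is on the path, and using the BTD format and the ll1gen routine discussed in the next section) builds a single rank-\((L,L,1)\) term explicitly as the outer product of \(\mat{A}\mat{B}^{\mathsf{T}}\) with \(\vec{c}\) and compares it with the generated tensor:
% Build one rank-(L,L,1) term explicitly and compare it with ll1gen.
L = 3;
A = randn(10, L); B = randn(11, L); c = randn(12, 1);
M = A*B.';                            % rank-L matrix A*B^T
T1 = reshape(M(:)*c.', [10 11 12]);   % outer product (A*B^T) o c
U1 = {{A, B, c, eye(L)}};             % BTD format: one term with an identity core
frob(T1 - ll1gen(U1))                 % zero up to machine precision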
Problem and tensor generation¶
We first show how pseudorandom factors for the decomposition in
rank-\((L_r,L_r,1)\) terms can be generated in the two formats using
ll1_rnd
and conclude with the generation of the full tensors from both
formats using ll1gen
and ful
.
Two different formats for the decomposition in multilinear
rank-\((L_r,L_r,1)\) terms can be used: the CPD and the BTD formulation. The
BTD formulation is the default format for most algorithms, and is represented by
a cell with \(R\) entries, each of which contains a cell with exactly four
entries: the matrices \(\mat{A}_r\) and \(\mat{B}_r\), the vector
\(\vec{c}_r\) and a core tensor \(\ten{S}_r\). All algorithms starting
with ll1
normalize this core tensor to the identity matrix. The command
ll1_rnd
generates pseudorandom factors for a decomposition in
multilinear rank-\((L_r,L_r,1)\). For example, consider a tensor of size
\(10\times11\times12\) that can be written as a sum of three terms with
multilinear ranks \((2,2,1)\), \((3,3,1)\) and \((4,4,1)\), hence
L = [2 3 4]
:
size_tens = [10 11 12];
L = [2 3 4];
U = ll1_rnd(size_tens, L);
U
U{1}
U{1}{4}
U =
{1x4 cell} {1x4 cell} {1x4 cell}
ans =
[10x2 double] [11x2 double] [12x1 double] [2x2 double]
ans =
1 0
0 1
The CPD format can be requested using the 'OutputFormat'
option:
options = struct;
options.OutputFormat = 'CPD';
U = ll1_rnd(size_tens, L, options);
% alternatively, using the new options syntax
U = ll1_rnd(size_tens, L, 'OutputFormat', 'CPD');
U
U =
[10x9 double] [11x9 double] [12x3 double]
The result U
is a cell with three factor matrices: the first two factor matrices have \(\sum_r L_r = 9\) columns and the third factor matrix has \(R = 3\) columns. When using the CPD format, the parameter L
is required for all algorithms, as it provides the necessary information on how the columns are grouped into terms.
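As a hypothetical illustration of this grouping (assuming consecutive columns belong to consecutive terms, as generated above), the factors of a single term can be extracted as follows:
% Columns 1:2 belong to term 1, columns 3:5 to term 2 and columns 6:9 to term 3.
offsets = cumsum([0 L]);
r = 2;
Ar = U{1}(:, offsets(r)+1:offsets(r+1));   % 10x3 factor A_2
Br = U{2}(:, offsets(r)+1:offsets(r+1));   % 11x3 factor B_2
cr = U{3}(:, r);                           % 12x1 vector c_2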
As for the corresponding CPD, BTD and LMLRA methods, additional options can be
provided using the old syntax with an options
structure, or as key-value
pairs:
size_tens = [10 11 12];
L = [2 3 4];
% Use rand to generate factors
U = ll1_rnd(size_tens, L, 'Real', @rand);
% Create complex factors
U = ll1_rnd(size_tens, L, 'Imag', @randn);
Instead of size_tens
, a full, sparse, incomplete or efficient representation
of a structured tensor T
can be given as input, i.e., ll1_rnd(T,L)
.
size_tens
is then determined using getsize(T)
. If the tensor T
(or
ful(T)
) contains complex entries, the generated factors U
are complex; otherwise, real factors are generated.
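For example, the following sketch generates complex factors because the given tensor is complex:
T = randn(10, 11, 12) + randn(10, 11, 12)*1i;   % complex dense tensor
U = ll1_rnd(T, [2 3 4]);                        % size determined via getsize(T); factors are complex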
It is possible to convert a decomposition in multilinear rank-\((L_r,L_r,1)\) terms from
the BTD format to the CPD format and vice versa using the ll1convert
command:
size_tens = [10 11 12];
L = [2 3 4];
Ubtd = ll1_rnd(size_tens, L, 'OutputFormat', 'btd');
Ucpd = ll1convert(Ubtd); % result in CPD format
Ubtd2 = ll1convert(Ucpd, L); % result again in BTD format (do not forget L)
Ucpd2 = ll1convert(Ucpd, L, 'OutputFormat', 'cpd') % explicitly request CPD format
To expand the factorized representation U
of the decomposition in
multilinear rank-\((L_r,L_r,1)\) terms to a full tensor, ll1gen
can be
used:
size_tens = [10 11 12];
L = [2 3 4];
Ubtd = ll1_rnd(size_tens, L, 'OutputFormat', 'btd');
T = ll1gen(Ubtd);
T = ll1gen(Ubtd, L); % L is optional for BTD format
Ucpd = ll1_rnd(size_tens, L, 'OutputFormat', 'cpd');
T = ll1gen(Ucpd, L); % L is required for CPD format
The ful
command can also be used, but it only accepts the BTD format. For
example, to generate a vector from the resulting tensor, the following command
can be used:
size_tens = [10 11 12];
L = [2 3 4];
Ubtd = ll1_rnd(size_tens, L);
T = ful(Ubtd); % generate full tensor
t = ful(Ubtd, 1, ':', 1); % generate first row vector
norm(t - T(1,:,1)); % is zero up to machine precision
If the factors U
are in the CPD format, the factors have to be converted to
the BTD format first:
Ucpd = ll1_rnd(size_tens, L, 'OutputFormat', 'cpd');
T = ful(ll1convert(Ucpd, L));
Computing the decomposition in multilinear rank-\((L_r,L_r,1)\) terms¶
To compute a decomposition in multilinear rank-\((L_r,L_r,1)\) terms, a
number of algorithms are available: a deterministic algorithm based on the
generalized eigenvalue decomposition ll1_gevd
, optimization-based algorithms
ll1_minf
and ll1_nls
, and the ll1
method. The latter is the principal method: it automatically handles compression, expansion, initialization, optimization and refinement, similarly to the cpd
method. The ll1
method is a good choice for many problems. The
optimization-based algorithms ll1_minf
, ll1_nls
and ll1
accept
dense, sparse, incomplete and efficient representations of structured tensors.
If efficient representations are used, and a solution accurate up to machine
precision is needed, make sure to read the note at the end of this chapter.
The principal method¶
The easiest way to compute a decomposition in multilinear
rank-\((L_r,L_r,1)\) terms of a dense, sparse, incomplete or structured
tensor \(\ten{T}\) with \(L=[L_1,L_2,\ldots,L_R]\) is to use the ll1
method:
size_tens = [10 11 12];
L = [2 3 4];
U = ll1_rnd(size_tens, L);
T = ll1gen(U);
Uhat = ll1(T, L);
Internally, ll1
performs a number of steps to provide a good initialization
to reduce the computational cost of the decomposition. First, the tensor is
compressed using a low multilinear rank approximation (see Multilinear singular value decomposition and low multilinear rank approximation)
if it is worthwhile. Then an initialization U0
is generated, e.g.,
using ll1_gevd
. Next ll1
executes an algorithm to compute the CPD
given the initialization, e.g., ll1_nls
. Finally the tensor is uncompressed
and the solution is refined (if compression was applied). Depending on the
kind of input tensor T
, additional substeps are performed. If T
is a
dense tensor, the command detectstructure
is used to detect if there is any
structure to be exploited during the decomposition (see ExploitStructure
option). If structure is detected, the dense tensor is converted to its
efficient representation. If T
is the efficient representation of a
sufficiently small structured tensor, T
is expanded to allow the use of
better compression and initialization algorithms (see ExpandLimit
option).
Instead of letting ll1(T,L)
compute an initialization, the initialization
can also be provided:
L = [2 3 4];
Ucpd = ll1_rnd([10 11 12], L, 'OutputFormat', 'cpd');
Ubtd = ll1_rnd([10 11 12], L, 'OutputFormat', 'btd');
Uhat = ll1(T, Ucpd, L);
Uhat = ll1(T, Ubtd, L);
Uhat = ll1(T, Ubtd); % hence L is optional for the BTD format
Note that ll1_gevd
is a deterministic algorithm, unless slices are mixed
randomly (see help ll1_gevd
). Hence, if this algorithm is used for
initialization in ll1
, the result is always the same for multiple runs for
the same \(\ten{T}\) and \(L\), and for the same options. Changing the
initialization method to, for instance, ll1_rnd
, is useful when different
initializations are needed. The following example uses the ll1_gevd
algorithm as initialization method for the first run, and uses a random
initialization (ll1_rnd
) for the other runs:
init = @ll1_gevd;
results = cell(1, 5);
for n = 1:5
    results{n} = ll1(T, L, 'Initialization', init);
    init = @ll1_rnd;
end
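The best of these runs can then be selected based on the residual, for example using frobll1res, which is discussed later in this chapter (a sketch):
% Select the run with the smallest Frobenius norm of the residual.
err = cellfun(@(Uh) frobll1res(T, Uh), results);
[~, best] = min(err);
Uhat = results{best};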
The next subsection explains different options of the ll1
routine.
Setting the options¶
The different steps in ll1
are customizable by supplying the method with an
options structure (see help ll1
for more information). Options can be
provided using an options
structure, or using key-value pairs, e.g.,
% Using options struct
options = struct;
options.Display = true; % Show progress on the command line
options.Initialization = @ll1_rnd; % Take ll1_rnd as initialization method
options.Complex = true; % Use complex initialization even for real tensor
ll1(T, L, options);
% Using key-value pairs
ll1(T, L, 'Display', 10, 'Initialization', @ll1_rnd, 'complex', true);
When key-value pairs are used, the options are case insensitive. The options
InitializationOptions
, AlgorithmOptions
and RefinementOptions
can be
used to pass options (as a structure) to the different algorithms. To set the
maximum number of iterations for the refinement step, for example, we can use
refinementOptions = struct;
refinementOptions.MaxIter = 100;
Uhat = ll1(T, L, 'Refinement', @ll1_minf, 'RefinementOptions', refinementOptions);
These structures are passed to the algorithms corresponding to the
initialization, algorithm or refinement step. For example, ll1
calls
ll1_minf
in the example above as ll1_minf(T, options.Initialization(T,L),
L, options.RefinementOptions)
in which options.Initialization
is the
default initialization method.
Since Tensorlab 3.0 some frequently used options can be passed automatically.
For example ll1(T, L, 'Display', 10)
automatically passes the
Display = 10
option to AlgorithmOptions
and RefinementOptions
, unless these
option structures define the Display
option. Similarly, the MaxIter
,
TolX
, TolFun
and CGMaxIter
options are passed automatically.
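For example, the following call forwards these options to both the algorithm and the refinement step (the numerical values are only illustrative):
Uhat = ll1(T, L, 'Display', 10, 'MaxIter', 500, 'TolFun', 1e-12);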
Viewing the algorithm output¶
Each step may also output additional information specific to that step. For
instance, most algorithms to decompose a tensor into multilinear
rank-\((L_r,L_r,1)\) terms such as ll1_nls
keep track of the number of
iterations and objective function value. To obtain this information, capture the
second output argument of ll1
:
L = [2 3];
U = ll1_rnd([10 11 12], L);
T = ll1gen(U);
[Uhat,output] = ll1(T,L);
and inspect its fields, for example by plotting the objective function value:
semilogy(0:output.Algorithm.iterations,sqrt(2*output.Algorithm.fval));
xlabel('iteration');
ylabel('frob(ll1res(T,U))');
grid on;
The output
structure has at least four fields: Compression
,
Initialization
, Algorithm
and Refinement
. Each field can be
inspected to see which algorithm is called, e.g., in the example above,
output.Compression.Name
is mlsvd_rsi
, or what the result is, e.g., in
the example above the compression ratio is output.Compression.ratio = 0.0379
.
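For instance, the following sketch inspects some of these fields (assuming the field names shown in the example above):
output.Compression.Name        % name of the compression algorithm, e.g., mlsvd_rsi
output.Compression.ratio       % compression ratio, e.g., 0.0379
fieldnames(output.Algorithm)   % fields returned by the algorithm step, e.g., iterations and fval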
If the Display
option is set to true
or to a number greater than zero, a
number of messages are printed, depending on the steps taken to provide the
result. As an example, consider the following code:
U = ll1_rnd([10 11 12], [3 2]);
T = noisy(ll1gen(U), 40);
Uhat = ll1(T, [3 2], 'Display', 1);
The code above prints the following messages:
Step 1: Compression is mlsvd_rsi (compression ratio 0.0378788 < 0.5)...
relative error = 0.00955528.
Step 2: Initialization is ll1_gevd (sum(sum(L) <= size(T)) >= 2)...
relative error = 0.00160962.
Step 3: Algorithm is ll1_nls on the compressed tensor... iterations = 3,
relative error = 0.000923906.
Step 4: Refinement is ll1_nls on the uncompressed tensor:
| Step 4a: Detecting structure... no structure detected.
| Step 4b: ll1_nls... iterations = 2, relative error = 0.00959971.
The output shows that the tensor is first compressed using mlsvd_rsi
. The
compression causes an approximation error. The ll1_gevd
algorithm then computes a
first factorization of the compressed tensor using the GEVD method. This result
is used as initial solution for the ll1_nls
optimization routine. As can be
seen, the relative error is decreased in three iterations. In the refinement
step, the original, uncompressed tensor T
is decomposed, starting from the
result of step 3. The refinement step consists of several substeps. In step 4a,
the algorithm tries to detect if structure is present in the tensor. If
structure is detected, this is exploited in the refinement algorithm. In this
case, no structure is detected and the original tensor is used. Refinement is
only performed if compression is used in step 1. If no compression is performed,
the structure detection takes place in step 3.
If the efficient representation of a structured tensor is used, additional substeps can be performed. For example:
U = ll1_rnd([10 11 12], [3 2]);
Uhat = ll1(U, [3 2], 'Display', 1); % use U as structured tensor
Step 1: Compression is mlsvd_rsi (compression ratio 0.0378788 < 0.5):
| Step 1a: Structured tensor expanded to full tensor
(prod(getsize(T)) <= 1e+06).
| Step 1b: mlsvd_rsi on full tensor... relative error = 8.65928e-16.
Step 2: Initialization is ll1_gevd (sum(sum(L) <= size(T)) >= 2)...
relative error = 1.96691e-15.
Step 3: Algorithm is ll1_nls on the compressed tensor... iterations = 1,
relative error = 1.73667e-16.
Step 4: Refinement skipped. (relative error <= 1e-08)
As can be seen from the output, an additional preprocessing step (step 1a) is
introduced. If a structured tensor is small enough, i.e., if its number of
elements is smaller than ExpandLimit
(default: \(10^6\)), the efficient
representation is expanded to a full tensor for the compression and/or
initialization steps as this allows the use of better algorithms. If the number
of entries in the tensor is larger than ExpandLimit
, expanding the tensor is
too expensive, and other algorithms are selected.
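If desired, this threshold can be adjusted. For example, the following sketch (assuming ExpandLimit can be passed as an option to ll1) lowers the limit so that the structured tensor above is no longer expanded:
U = ll1_rnd([10 11 12], [3 2]);
Uhat = ll1(U, [3 2], 'Display', 1, 'ExpandLimit', 100);   % 10*11*12 > 100, so no expansion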
Computing the error¶
The residual between a tensor T
and its decomposition in multilinear
rank-\((L_r,L_r,1)\) terms given by Uhat
can be computed with
res = ll1res(T, Uhat); % Uhat in BTD format
res = ll1res(T, Uhat, L); % Uhat in CPD or BTD format
If the tensor is dense, then the result is equivalent to ll1gen(Uhat)-T
or
ll1gen(Uhat,L)-T
. The relative error between the tensor and its approximation in multilinear rank-\((L_r,L_r,1)\) terms can then be computed as
relerr = frob(ll1res(T,Uhat))/frob(T); % Uhat in BTD format
relerr = frob(ll1res(T,Uhat,L))/frob(T); % Uhat in CPD or BTD format
In this approach, efficient representations of structured tensors are expanded
to full tensors using ful(T)
. The methods frobll1res(T,Ubtd)
and
frobll1res(T,Ucpd,L)
compute the Frobenius norm of the residual without
expanding representations of large structured tensors (if
prod(getsize(T)) > ExpandLimit
, see help frobll1res
for more
information). Note that a loss of accuracy can occur if the expanded tensor
That = ll1gen(Uhat,L)
is almost equal to T
in relative sense (see note at
the end of this chapter).
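A sketch of computing the residual norm with frobll1res without expanding a structured tensor, reusing pseudorandom BTD factors as the efficient representation:
% Frobenius norm of the residual without expanding the structured tensor U.
L = [2 3 4];
U = ll1_rnd([10 11 12], L);
Uhat = ll1(U, L);             % decompose the efficient (BTD) representation directly
res = frobll1res(U, Uhat)     % small, since U can be decomposed exactly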
Overview of algorithms¶
The ll1
method uses a variety of algorithms to compute the decomposition
in a multi-step initialization sequence. These algorithms can be called
separately as well. Below an overview of the different available algorithms is
given, as well as an overview of other convenient algorithms.
The following algorithms can be used to compute the decomposition in rank-\((L_r,L_r,1)\) terms of a full, sparse, incomplete or structured tensor.
ll1(T, L) computes the decomposition in multilinear rank-\((L_r,L_r,1)\) terms of a tensor T using a multistep approach.
ll1_nls(T, Ubtd) and ll1_nls(T, Ucpd, L) compute the decomposition in multilinear rank-\((L_r,L_r,1)\) terms of a tensor T using an NLS type algorithm with an initial guess in the BTD or CPD format.
ll1_minf(T, Ubtd) and ll1_minf(T, Ucpd, L) compute the decomposition in multilinear rank-\((L_r,L_r,1)\) terms of a tensor T using a nonlinear unconstrained optimization type algorithm with an initial guess in the BTD or CPD format.
ll1_gevd(T, L) computes the decomposition in multilinear rank-\((L_r,L_r,1)\) terms using the generalized eigenvalue decomposition [27]. ll1_gevd only accepts full and sparse tensors. For efficient representations of structured tensors ll1_gevd(ful(T), L) can be used. ll1_gevd often provides a good initialization for the optimization-based algorithms.
The following routines can be used to generate random tensors with structure, to generate full tensors and to compute (Frobenius norms of) residuals.
ll1_rnd(size_tens, L) generates random factors Ubtd or Ucpd for a decomposition in multilinear rank-\((L_r,L_r,1)\) terms.
ll1convert(U) and ll1convert(U, L) convert the BTD format to the CPD format and vice versa.
ll1gen(U) and ll1gen(U, L) generate the full tensor from the decomposition in multilinear rank-\((L_r,L_r,1)\) terms given in the BTD or CPD format.
ll1res(T, U) and ll1res(T, U, L) compute the residual ll1gen(U) - T.
frobll1res(T, U) and frobll1res(T, U, L) compute the Frobenius norm of the residual ll1gen(U) - T.
All algorithms that have factors U
as an output argument have an option
OutputFormat
which can be set to btd
or cpd
to get the desired
output format.
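For example (a sketch):
Uhat = ll1(T, L, 'OutputFormat', 'cpd');   % factors as a cell of three matrices
Uhat = ll1(T, L, 'OutputFormat', 'btd');   % factors as a cell of R cells {A_r, B_r, c_r, S_r}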
Note
When using structured tensors, the expected numerical precision is only half
the machine precision. This is due to the mathematical formulation
implemented to exploit the structure. In most cases, this is not a problem as
the maximal attainable precision is already limited due to noise and model
errors. Only in the very specific case in which a solution accurate up to machine precision can be found and is required does this become an issue. (This requires
TolFun
and TolX
to be eps^2
and eps
, respectively.) A
solution is to use the structured tensor T
to compute a solution Ures
accurate up to half the machine precision, and then refine this solution
using ll1_nls(ful(T), Ures, 'TolFun', eps^2, 'TolX', eps)
for a few
iterations. (It can be useful to set MaxIter
to 2
or 3
.) This
technique requires only very few expensive iterations involving the full
tensor.
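A sketch of this two-step approach:
% Step 1: compute a solution from the efficient representation T (accuracy about sqrt(eps)).
Ures = ll1(T, L);
% Step 2: refine with a few expensive iterations on the full tensor.
Uhat = ll1_nls(ful(T), Ures, 'TolFun', eps^2, 'TolX', eps, 'MaxIter', 3);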
Further reading¶
The algorithms outlined in this chapter can be used to compute unconstrained decompositions in multilinear rank-\((L_r,L_r,1)\) terms. In the structured data fusion (SDF) framework, constraints can be imposed as well, and coupled decompositions can be defined. See Structured data fusion for more details.