Structured data fusion¶
Structured data fusion or SDF [36] comprises a modeling language, a
syntax checker and a number of solvers to jointly factorize one or more
datasets, impose symmetry in the factorization, and optionally apply structure
on the different factors. Each data set — stored as a dense, sparse,
incomplete or structured tensor, cf. Section Datasets: dense, incomplete, sparse and structured — in a data
fusion problem can be factorized with a different tensor decomposition.
Currently, the user has the choice of the CPD, LL1, LMLRA and BTD models, as
well as L0, L1 and L2 regularization terms. Structure can be imposed on the
factors in a modular way and the user can choose from a library of predefined
structures such as nonnegativity, orthogonality, Hankel, Toeplitz, Vandermonde,
matrix inverse, and many more. See Contents.m for a complete list. By
selecting the right structures you can even compute classical matrix
decompositions such as the QR factorization, eigenvalue decomposition and
singular value decomposition.
This section gives a basic introduction to SDF by explaining how it works and how different models can be constructed. In Structured data fusion: examples specific examples are given. After that more advanced concepts are introduced in Structured data fusion: advanced concepts which enable the creation of more dynamical models. Structured data fusion: implementing a transformation explains how a new transformation can be defined to impose a new kind of structure on a factor. Finally, a more technical specification of the modeling language is given in Structured data fusion: specification of domain specific language.
In this section the following topics are handled:
- Introducing SDF: elements of a data fusion model (Elements of an SDF model)
- Using sdf_check to check the syntax (Checking and solving a model)
- Using different solvers (Checking and solving a model)
- Coupling datasets and setting weights (Imposing coupling and symmetry)
- Imposing symmetry (Imposing coupling and symmetry)
- Imposing structure on factors (Imposing structure on factors)
- Using constant factors and subfactors (Imposing structure on factors)
- The different factorization and regularization types (Factorization and regularization types)
- The accepted data types including full, sparse, incomplete and structured tensors (Factorization and regularization types)
Elements of an SDF model¶
The structured data fusion framework uses a domain specific language allowing one to express the coupling between datasets and the structure of different factors easily. Figure 39 illustrates the rationale behind the language by showing how an SDF model is built. Structure can be imposed on factors by transforming variables \(z\) into factors \(x\). The resulting factors can then be used in factorizations of different datasets. Coupling between the datasets is expressed by using one or more common factors in the different factorizations. Similarly, symmetry can be imposed by using a factor multiple times in the same factorization.
Hence, there are three important elements in an SDF model: variables, factors and factorizations. To compute a rank-\(R\) CPD of a third-order tensor \(\ten{T}\), for example, the following code can be used:
R = 5;
T = ful(cpd_rnd([10 11 12], R));
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,2), R);
model.variables.c = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
It is advised to write model = struct; before a model is constructed in order to be sure that no elements from a previous model are used. In this model, the value provided to a variable is the initial guess for that particular variable. The line model.factors.A = 'a'; simply states that a factor with the name A is created, and that it is equal to the variable a. The 'a' is called a (named) reference. Similarly, a (named) factorization tensor is created, which factorizes the tensor T in the data field as a cpd with factors A, B and C. A factorization can only use references to factors (unless implicit factors are enabled, see Structured data fusion: advanced concepts). Apart from the CPD, other factorizations can be used as well, e.g., btd, ll1, lmlra, regL0, regL1 and regL2 (see Factorization and regularization types).
Figure 39 shows a typical case of structured data fusion. Two datasets \(\ten{T}^{(1)}\) and \(\ten{T}^{(2)}\) are jointly factorized. Both datasets share the factor matrices \(x_2\) and \(x_3\). Each factor matrix has a different structure. Factor matrix \(x_1\) has a Toeplitz structure, \(x_2\) has orthonormal columns and \(x_3\) is nonnegative. In the SDF model, every structure is achieved by transforming some variables into factors. The first factor \(x_1\) is created by transforming a vector \(z_1\) into the Toeplitz matrix. A (vectorized) upper triangular matrix with the parameters for Householder reflectors \(z_2\) can be transformed into factor \(x_2\). Finally, squaring the variables in \(z_3\) results in a nonnegative factor \(x_3\).
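The coupling pattern of Figure 39 can be sketched in the SDF language as follows. This is an illustrative sketch, not code from the text: the sizes and the rank are chosen arbitrarily, and only the nonnegativity of the shared factor is imposed here (via struct_nonneg, which squares the variables); the Toeplitz and orthogonality structures of the figure would be imposed analogously with struct_toeplitz and struct_orth.

```matlab
% Sketch: two datasets jointly factorized, coupled through shared factors.
R = 3;
T1 = randn(10, 11, 12);          % first dataset (third-order tensor)
T2 = randn(11, 12);              % second dataset (a matrix)
model = struct;
model.variables.z1 = randn(10, R);
model.variables.z2 = randn(11, R);
model.variables.z3 = randn(12, R);
model.factors.X1 = 'z1';
model.factors.X2 = 'z2';
model.factors.X3 = {'z3', @struct_nonneg};        % x3 = z3.^2 is nonnegative
model.factorizations.first.data  = T1;
model.factorizations.first.cpd   = {'X1', 'X2', 'X3'};
model.factorizations.second.data = T2;
model.factorizations.second.cpd  = {'X2', 'X3'};  % shared factors couple the datasets
sdf_check(model, 'print');
```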
For an overview of the different transformations or structures that can be applied, we refer to help Contents.m. All functions starting with struct_ are transformations.
Checking and solving a model¶
Starting from Tensorlab 3.0 an overview of the factors and factorizations in an
SDF model can be printed, after interpretation by the language parser. This way,
the different transformations can be inspected, and the user can see if the
model is interpreted as he or she intended. The language parser, or syntax and
consistency checker, is called sdf_check. This method interprets the model
and converts it to the internal format. If a syntax error is found, e.g.,
because no variables are defined, or because a reference to a non-existing
factor is made, an informative error is thrown. For example:
R = 5;
T = ful(cpd_rnd([10 11 12], R));
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,2), R);
model.variables.c = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'D'};
sdf_check(model);
Error in line model.factorizations.tensor.cpd:
Factorization tensor (id = 1) depends on the unknown factor D
In many cases, some parts of the error message are underlined, indicating that this is clickable (only available in the Matlab Command Window). For example, clicking on model.factorizations.tensor.cpd indeed gives the content of the line containing the error:
ans =
'A' 'B' 'D'
If a consistency error is found, an error is thrown (using sdf_check(model)), or a message is printed in the model overview if the 'print' option is provided (sdf_check(model, 'print')). Examples of consistency errors are that the number of columns in the factor matrices is not the same in the case of a CPD, or that the size of the tensor generated by the factors does not match the size of the data. Consider the following model:
R = 5;
T = ful(cpd_rnd([10 11 12], R));
model = struct;
model.variables.a = randn(11, R);
model.variables.b = randn(10, R);
model.variables.c = randn(12, R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
sdf_check(model,'print');
Factor Size Comment
------------------------------------------------------------------------
A [11x5]
B [10x5]
C [12x5]
Factorization Type Factors Comments
------------------------------------------------------------------------
tensor (id = 1) cpd A,B,C Dimension mismatch
The comment for the factorization states Dimension mismatch. In the Matlab Command Window, this is a link and when clicked, the following error is shown:
Error in line model.factorizations.tensor.cpd:
Factorization tensor (id = 1): the size of the tensor generated by the
factors [11x10x12] should match the size of the data [10x11x12]
This is indeed the problem. For more complicated models, this often helps to locate possible errors. If sdf_check(model, 'print') does not give an error, the available solvers in Tensorlab should run without problems. If there is a syntax or consistency error, the solvers will not run, as the model is invalid (sdf_check is called internally as well).
When a model is valid, i.e., no syntax or consistency errors are found, it can be computed using one of the following solvers:
- sdf_minf or sdf_nls: general purpose solvers for SDF models.
- ccpd_minf or ccpd_nls: specialized solvers for symmetric and/or coupled CPDs only, in which no structure, constants or concatenations are allowed. These solvers are recommended when they can be used.
The solvers can be called as follows:
R = 5;
T = ful(cpd_rnd([10 11 12], R));
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,2), R);
model.variables.c = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
sdf_check(model, 'print'); % recommended
sol = sdf_nls(model);
The solution is a struct containing a variables field and a factors field with the results:
sol.variables
ans =
a: [10x5 double]
b: [11x5 double]
c: [12x5 double]
sol.factors
ans =
A: [10x5 double]
B: [11x5 double]
C: [12x5 double]
The ccpd_minf and ccpd_nls solvers use a different output format for the solution. As there are no transformations on the variables, only the factors are given as output as a cell, e.g., for the model above:
Ures = ccpd_nls(model)
Ures =
[10x5 double] [11x5 double] [12x5 double]
Similar to the other optimization routines, extra options can be provided, and extra information on the convergence of the algorithm can be inspected using:
% Using key-value pairs
[sol, output] = sdf_nls(model, 'Display', 10, 'MaxIter', 200);
% Using an options structure
options = struct;
options.Display = 10;
options.MaxIter = 200;
[sol, output] = sdf_nls(model, options);
The output structure contains convergence information, the number of iterations carried out, etc. It also contains the absolute and relative errors for each factorization in output.abserr and output.relerr. The errors are computed as the Frobenius norm of the difference between the input data and the tensor generated by the model, e.g., for the example above output.abserr = frob(T - cpdgen({sol.factors.A, sol.factors.B, sol.factors.C})).
Imposing coupling and symmetry¶
The data fusion part in Structured Data Fusion means that different datasets can be factorized or decomposed together while sharing factors or underlying variables. In this section a basic coupling between datasets is discussed, followed by a discussion on how different weights (or hyperparameters) can be set. Finally, symmetric decompositions are discussed.
Note
The following examples are meant to explain the different modeling techniques. Actually solving the models may not yield a good solution, as the examples do not necessarily reflect good numerical choices and the initialization strategies are not discussed.
Coupling datasets¶
Consider the following coupled tensor matrix factorization problem:
\[\min_{\mat{A},\mat{B},\mat{C},\mat{D}}\ \frac{1}{2}\norm{\ten{T} - \ten{M}_{\text{CPD}}(\mat{A},\mat{B},\mat{C})}_{\text{F}}^2 + \frac{1}{2}\norm{\mat{M} - \ten{M}_{\text{CPD}}(\mat{C},\mat{D})}_{\text{F}}^2.\]
Note that \(\ten{M}_{\text{CPD}}(\mat{C},\mat{D}) = \mat{C}\mat{D}^{\T}\). This can be modeled easily using SDF by defining two factorizations that have a factor in common:
T = randn(10,11,12);
M = randn(12,13);
R = 4;
model = struct;
model.variables.a = randn(size(T,1),R);
model.variables.b = randn(size(T,2),R);
model.variables.c = randn(size(T,3),R);
model.variables.d = randn(size(M,2),R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factors.D = 'd';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd = {'C', 'D'};
sdf_check(model, 'print');
As can be seen below, the two factorizations indeed share the common factor matrix \(\mat{C}\):
Factor Size Comment
------------------------------------------------------------------------
A [10x4]
B [11x4]
C [12x4]
D [13x4]
Factorization Type Factors Comments
------------------------------------------------------------------------
tensor (id = 1) cpd A,B,C
matrix (id = 2) cpd C,D
Absolute and relative weights¶
Often different weights are needed for the different factorization terms, i.e., the problem
\[\min_{\mat{A},\mat{B},\mat{C},\mat{D}}\ \frac{\lambda_1}{2}\norm{\ten{T} - \ten{M}_{\text{CPD}}(\mat{A},\mat{B},\mat{C})}_{\text{F}}^2 + \frac{\lambda_2}{2}\norm{\mat{M} - \ten{M}_{\text{CPD}}(\mat{C},\mat{D})}_{\text{F}}^2\]
is solved instead. Using SDF the weights \(\lambda_1\) and \(\lambda_2\) can be given as absolute weights or as relative weights. Absolute weights can be provided using the weight field:
% variables and factors omitted
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
model.factorizations.tensor.weight = lambda1;
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd = {'C', 'D'};
model.factorizations.matrix.weight = lambda2;
Note that the factor \(\frac{1}{2}\) in the objective function is present for implementation purposes. Therefore lambda1 is equal to \(\lambda_1\) and not \(\frac{\lambda_1}{2}\). To get a term \(\norm{\ten{T} - \ten{M}_{\text{CPD}}(\mat{A}, \mat{B}, \mat{C})}_{\text{F}}^2\), \(\lambda_1\) should be set to 2.
Often, it is easier to use relative weights. The relative weights \(\rho_1\) and \(\rho_2\) can be given as
% variables and factors omitted
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
model.factorizations.tensor.relweight = rho1;
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd = {'C', 'D'};
model.factorizations.matrix.relweight = rho2;
Relative weights \(\rho_i\) are automatically translated to absolute weights using the following expression:
\[\lambda_i = \frac{\rho_i}{N_i},\]
in which \(N_i\) is the number of known entries in tensor Ti, i.e., Ni = prod(getsize(Ti)), unless Ti is incomplete, in which case Ni = length(Ti.val). For regularization terms \(N_i = \sum_f N_i^{(f)}\) in which \(N_i^{(f)}\) is the number of entries in factor \(f\) after expansion. For example:
model = struct;
model.variables.a = rand(10, 5);
model.variables.b = rand(1, 5);
model.factors.A = 'a';
model.factors.B = {'b', @struct_diag};
model.factorizations.reg.regL1 = {'A', 'B'};
Factors A and B are regularized. Factor A has size \(10\times5\); therefore \(N_1^{(A)} = 50\). After expansion, factor B has size \(5\times5\) (see sdf_check(model, 'print')); hence \(N_1^{(B)} = 25\). Together we have \(N_1 = 75\).
Absolute and relative weights cannot be mixed, i.e., either all factorizations define a weight field (or no field), or all factorizations define a relweight field (or no field). Omitted (relative) weights are assumed to be 1. If neither absolute nor relative weights are given, relative weights equal to 1 are assumed.
Imposing symmetry¶
Finally, some special attention is given to decompositions involving symmetry. Symmetry can easily be imposed by using the same factor multiple times in a factorization. For example, consider a fourth-order tensor \(\ten{T}\) that is symmetric in the first two modes and in the last two modes:
\[\ten{T} = \ten{M}_{\text{CPD}}(\mat{A}, \mat{A}, \mat{B}, \mat{B}).\]
This can be implemented as:
R = 5;
% create symmetric tensor
U0 = cpd_rnd([10 11], R);
U0 = U0([1 1 2 2]);
T = cpdgen(U0);
% Create model
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factorizations.symm.data = T;
model.factorizations.symm.cpd = {'A', 'A', 'B', 'B'};
Often when imposing symmetry, especially symmetry in all modes, it is useful to give each rank-1 term a weight \(\alpha_r\):
\[\ten{T} \approx \ten{M}_{\text{CPD}}(\mat{A}, \mat{A}, \mat{B}, \mat{B}, \vec{\alpha}^{\T}),\]
in which the row vector \(\vec{\alpha}^{\T} = [\alpha_1, \ldots, \alpha_R]\) collects the weights.
This weight \(\alpha_r\) allows some extra freedom. For example, in the case
of a fourth-order tensor and a factorization
\(\ten{M}_{\text{CPD}}(\mat{A},\mat{A},\mat{A},\mat{A})\), the diagonal
entries of the result are always nonnegative, while it is perfectly possible to
have a fully symmetric tensor with negative diagonal entries. Another example
imposes orthogonality structure (struct_orth) on all factors \(\mat{A}\) in \(\ten{M}_{\text{CPD}}(\mat{A},\mat{A},\mat{A})\). As struct_orth normalizes each column to have norm one, each rank-1 term also has norm one, which can result in a bad fit. In these two examples, adding the weight \(\alpha_r\) is important to obtain a meaningful fit. Adding the weights \(\alpha_r\) can easily be done in Tensorlab by creating an additional factor matrix of size \(1\times R\), i.e., a row vector:
R = 5;
% create symmetric tensor
U0 = cpd_rnd([10 11], R);
U0 = U0([1 1 2 2]);
T = cpdgen(U0);
% Create model
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,3), R);
model.variables.c = randn(1, R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.symm.data = T;
model.factorizations.symm.cpd = {'A', 'A', 'B', 'B', 'C'};
This is a valid model, as can be verified using sdf_check(model, 'print'). The reason is that \(\ten{T}\) can be seen as a tensor of size \(10\times10\times11\times11\times1\).
Imposing structure on factors¶
Structure by transformation of variables¶
Using the SDF language, it is easy to create structured factors. The way to do so, is by transforming a variable into a factor using some transformation. Some examples:
- Nonnegativity can be imposed on a factor by squaring the underlying variables (struct_nonneg) [37].
- A factor matrix in which each column is a polynomial evaluated in some points can be modeled by the coefficients of these polynomials. These coefficients are the variables (struct_poly).
- Orthogonality can be imposed using Householder reflectors as underlying variables (struct_orth).
See help Contents.m to get an overview of all implemented transformations. All functions starting with struct_ are transformations. New transformations can be defined as well (see Structured data fusion: implementing a transformation).
In Tensorlab, structure can be imposed on factors as follows. For example, to impose nonnegativity on factors A and B the following model can be used:
M = rand(10,10);
model = struct;
model.variables.a = randn(10,3);
model.variables.b = randn(10,3);
model.factors.A = {'a', @struct_nonneg};
model.factors.B = {'b', @struct_nonneg};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd = {'A', 'B'};
sdf_check(model, 'print');
Factor Size Comment
------------------------------------------------------------------------
A [10x3]
B [10x3]
Factorization Type Factors Comments
------------------------------------------------------------------------
matrix (id = 1) cpd A,B
In the Matlab Command Window, A and B can be clicked to get the different transformations applied to the factors, e.g., for A:
Expansion of factor A (id = 1) (view):
(1,1) a [10x3] struct_nonneg Factor [10x3]
When clicking a and Factor, we see indeed that every number in a is squared:
ans =
0.1383 0.9000 -0.6235
-0.3784 -1.5312 -1.3501
0.1431 0.5046 -1.1622
1.6050 -0.8642 -0.9443
1.3491 -0.3766 -0.6712
-0.4484 0.7880 0.5767
0.1684 0.2982 -2.0858
-1.1205 -0.1637 0.2360
0.4007 0.6067 -0.7784
0.7391 1.6345 1.0996
ans =
0.0191 0.8101 0.3887
0.1432 2.3446 1.8228
0.0205 0.2546 1.3508
2.5761 0.7469 0.8917
1.8201 0.1418 0.4505
0.2011 0.6210 0.3326
0.0284 0.0889 4.3504
1.2555 0.0268 0.0557
0.1606 0.3681 0.6059
0.5462 2.6716 1.2090
For some transformations extra parameters are required. For example, to model a factor matrix in which each column is a polynomial, struct_poly can be used. struct_poly needs the evaluation points \(t_i\) for these polynomials as an extra parameter. In the example below, each column of factor A is a polynomial of maximal degree \(d\):
M = rand(11,10);
t = 0:0.1:1; % evaluation points
d = 4; % degree
R = 3; % rank
model = struct;
model.variables.c = randn(R,d+1); % note that c has R rows!
model.variables.b = randn(10,R);
model.factors.A = {'c', @(z,task) struct_poly(z,task,t)};
model.factors.B = {'b', @struct_nonneg};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd = {'A', 'B'};
sdf_check(model, 'print');
Factor Size Comment
------------------------------------------------------------------------
A [11x3]
B [10x3]
Factorization Type Factors Comments
------------------------------------------------------------------------
matrix (id = 1) cpd A,B
As can be seen, A is an \(11\times 3\) matrix. If A is clicked, the effect of the transformation can be seen:
Expansion of factor A (id = 1) (view):
(1,1) c [3x5] struct_poly Factor [11x3]
Factor can be clicked again to get the resulting factor. Note that when a link, e.g., Factor or view, is clicked, the result becomes available as ans. Hence, the resulting factor A can be plotted using
A = ans;
plot(t, A);
Finally, different transformations can be chained. For example:
M = rand(11,10);
t = 0:0.1:1; % evaluation points
d = 4; % degree
R = 3; % rank
model = struct;
model.variables.c = randn(R,d+1); % note that c has R rows!
model.variables.b = randn(10,R);
model.factors.A = {'c', @(z,task) struct_poly(z,task,t), @struct_nonneg};
model.factors.B = {'b', @struct_nonneg};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd = {'A', 'B'};
sdf_check(model, 'print');
Clicking A again allows us to view the individual transformations:
Expansion of factor A (id = 1) (view):
(1,1) c [3x5] struct_poly temp [11x3]
temp [11x3] struct_nonneg Factor [11x3]
Any number of transformations can be chained, as long as the final result is a valid factor:
M = rand(11,10);
t = 0:0.1:1; % evaluation points
d = 4; % degree
R = 5; % rank
diagonal = @struct_diag;
poly = @(z,task) struct_poly(z,task,t);
nonneg = @struct_nonneg;
negate = @(z,task) struct_matvec(z, task, [], -eye(R));
model = struct;
model.variables.c = randn(1,d+1); % c is a single row of coefficients
model.variables.b = randn(10,R);
model.factors.A = {'c', diagonal, poly, nonneg, negate};
model.factors.B = {'b', nonneg};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd = {'A', 'B'};
sdf_check(model, 'print');
This example creates a factor A in which each column \(r\) is a negative polynomial of the form \(-(t^{r-1})^2\). For readability, shorter names for the transformations have been created first.
Constant factors¶
Constant factors can be defined as follows. Consider the following CPD in which the second factor matrix is kept constant:
T = randn(10, 11, 12);
R = 5;
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.c = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = rand(size(T,2), R);
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
sdf_check(model, 'print');
Factor Size Comment
------------------------------------------------------------------------
A [10x5]
B [11x5] Constant
C [12x5]
Factorization Type Factors Comments
------------------------------------------------------------------------
tensor (id = 1) cpd A,B,C
The factor B is indeed recognized as constant. Constants follow the same rules as variables. This means that constants can be cells or numerical arrays of arbitrary dimensions (except a scalar, see Structured data fusion: specification of domain specific language), and that constants can be transformed.
Subfactors explains how partially constant factors can be constructed. Here, we only consider a special case in which specific entries are constant. As an example, let the strictly upper triangular part of one factor matrix, say A, be fixed to zero. The following initial value is created:
A = randn(10,5);
[i,j] = ndgrid(1:10,1:5);
A(j > i) = 0
A =
2.0684 0 0 0 0
0.1294 -0.1904 0 0 0
0.5001 -0.1355 -0.3705 0 0
1.9050 0.8140 -0.3259 -0.5112 0
0.5555 1.0007 1.6666 -0.4472 -1.8578
-0.4579 0.2320 1.2689 -1.6985 -0.7134
-0.1454 -0.0285 -1.0390 -0.8552 0.7025
-0.2466 -0.9286 -0.8979 1.6053 -0.0543
-0.3119 -1.6001 0.7471 1.5946 -0.2169
1.0205 -0.7063 -1.3862 1.1280 -1.6644
To keep the strictly upper triangular part constant, the transformation struct_const can be used with the additional mask argument (see help struct_const). This mask is true except for the entries in the strictly upper triangular part:
mask = true(size(A));
mask(j > i) = false;
The model can now be created as:
T = randn(10, 11, 12);
model = struct;
model.variables.a = A;
model.variables.b = randn(11, 5);
model.variables.c = randn(12, 5);
model.factors.A = {'a', @(z,task) struct_const(z,task,mask)};
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
The tensor decomposition is now computed in the same way as in the unconstrained case, with the exception that the derivatives of the entries in the strictly upper triangular part of A are explicitly set to zero. The entries in the strictly upper triangular part therefore stay the same as in the initial value.
Subfactors¶
In the previous examples, we have assumed that every factor consists of a single subfactor. Factors may however consist of an arbitrary number of subfactors. Consider the following example in which one factor has a block-diagonal structure:
T = randn(10, 11, 12);
model = struct;
model.variables.a11 = randn(5,2);
model.variables.a22 = randn(5,3);
model.variables.b = randn(11,5);
model.variables.c = randn(12,5);
model.factors.A = {'a11', zeros(5,3); zeros(5,2), 'a22'};
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd = {'A', 'B', 'C'};
sdf_check(model, 'print');
Factor Size Comment
------------------------------------------------------------------------
A [10x5] Partially constant
B [11x5]
C [12x5]
Factorization Type Factors Comments
------------------------------------------------------------------------
tensor (id = 1) cpd A,B,C
The factor A is partially constant as the off-diagonal blocks do not depend on any variable. When clicking on A in the Matlab Command Window, an overview for this factor matrix is printed:
Expansion of factor A (id = 1) with structure [2x2] (view):
(1,1) a11 [5x2]
(2,1) (Constant) [5x2]
(1,2) (Constant) [5x3]
(2,2) a22 [5x3]
This overview correctly states that A consists of 2x2 subfactors. Each entry in this \(2\times2\) structure can be accessed individually as before. It is also possible to impose structure on a subfactor, e.g., nonnegativity can be imposed on block a11 as follows:
model.factors.A = { {'a11', @struct_nonneg}, zeros(5,3);
zeros(5,2), 'a22' };
Running sdf_check(model, 'print') again, and clicking on A, gives the expected output:
Expansion of factor A (id = 1) with structure [2x2] (view):
(1,1) a11 [5x2] struct_nonneg Factor [5x2]
(2,1) (Constant) [5x2]
(1,2) (Constant) [5x3]
(2,2) a22 [5x3]
In the previous examples, all factors are actually matrices or vectors. A factor can be a tensor as well, and it is also possible to concatenate subfactors along modes higher than two, for example:
T = randn(10, 11, 12);
model = struct;
model.variables.u = randn(10,2);
model.variables.v = randn(11,3);
model.variables.w = randn(12,3);
model.variables.s1 = randn(2,3,2);
model.variables.s2 = randn(2,3,1);
model.factors.U = 'u';
model.factors.V = 'v';
model.factors.W = 'w';
model.factors.S = reshape({'s1', 's2'}, 1, 1, 2);
model.factorizations.tensor.data = T;
model.factorizations.tensor.lmlra = {'U','V','W','S'};
sdf_check(model, 'print');
In the expansion of S (click on S), we see that the parts s1 and s2 are indeed concatenated in the third mode (recall that matrix s2 is a \(2\times3\times1\) tensor):
Expansion of factor S (id = 4) with structure [1x1x2] (view):
(1,1,1) s1 [2x3x2]
(1,1,2) s2 [2x3]
Factorization and regularization types¶
Currently, the SDF language supports four different factorization types (CPD, BTD, LMLRA and the decomposition in multilinear rank-\((L_r,L_r,1)\) terms) and three regularization types (L0, L1 and L2 regularization). The different models and factorization types are explained in their respective sections: Canonical polyadic decomposition, Block term decomposition, Multilinear singular value decomposition and low multilinear rank approximation and Decomposition in multilinear rank- terms. In the remaining part of this section we specify the format for the different factorization types:
CPD: a (canonical) polyadic decomposition of a tensor \(\ten{T}\) with factor matrices \(\mat{A}\), \(\mat{B}\), \(\mat{C}\), ... can be computed using
model.factorizations.mycpd.data = T;
model.factorizations.mycpd.cpd = {'A', 'B', 'C', ...};
In the case \(\ten{T}\) is an incomplete tensor, specialized routines can be used if the sdf_nls solver is used. To activate these specialized routines use
model.factorizations.mycpd.data = T;
model.factorizations.mycpd.cpdi = {'A', 'B', 'C', ...};
LMLRA: a low multilinear rank approximation of a tensor \(\ten{T}\) with factor matrices \(\mat{A}\), \(\mat{B}\), \(\mat{C}\), ... and core tensor \(\ten{S}\) can be computed using
model.factorizations.mylmlra.data = T;
model.factorizations.mylmlra.lmlra = {'A', 'B', 'C', ..., 'S'};
% or
model.factorizations.mylmlra.lmlra = {{'A', 'B', 'C', ..., 'S'}};
BTD: an \(R\)-term block term decomposition of a tensor \(\ten{T}\) with factor matrices \(\mat{A}_r\), \(\mat{B}_r\), \(\mat{C}_r\), ... and core tensors \(\ten{S}_r\) for \(r = 1,\ldots,R\) can be computed using
model.factorizations.mybtd.data = T;
model.factorizations.mybtd.btd = {{'A1', 'B1', 'C1', ..., 'S1'}, ..., {'AR', 'BR', 'CR', ..., 'SR'}};
LL1: a decomposition in multilinear rank-\((L_r,L_r,1)\) terms of a third-order tensor \(\ten{T}\) with factor matrices \(\mat{A}\), \(\mat{B}\), \(\mat{C}\) and L = [L1 L2 ... LR] can be computed using
model.factorizations.myll1.data = T;
model.factorizations.myll1.ll1 = {'A', 'B', 'C'};
model.factorizations.myll1.L = L;
Note that the L field is obligatory in this case. Only the CPD format for a decomposition in multilinear rank-\((L_r,L_r,1)\) terms is accepted, see Decomposition in multilinear rank- terms.
In all cases above, both the data field and the field that specifies the factorization type are obligatory.
Three regularization types, namely, L1 and L2 regularization (available since Tensorlab 2.0) and L0 regularization (available since Tensorlab 3.0), are currently available:
L1 regularization: a term \(\frac{1}{2}\|\vectorize(\mat{A})\|_1\) or \(\frac{1}{2}(\|\vectorize(\mat{A})\|_1 + \|\vectorize(\ten{B})\|_1 + ...)\) is added to factors, which can be matrices or tensors, using
model.factorizations.myreg.regL1 = 'A';
% or
model.factorizations.myreg.regL1 = {'A', 'B', ...};
All factors mentioned in a single regularization term get the same weight (see Absolute and relative weights). If different factors should have different weights, multiple regularization terms have to be added, e.g.,
% A and B have the same weight
model.factorizations.myreg1.regL1 = {'A', 'B'};
model.factorizations.myreg1.relweight = 1;
% C gets a different weight
model.factorizations.myreg2.regL1 = 'C';
model.factorizations.myreg2.relweight = 10;
By default, a term is regularized towards zero. A term \(\frac{1}{2}(\|\vec{0} - \vectorize(\mat{A})\|_1 + \|\vec{1} - \vectorize(\mat{B})\|_1)\) can be added using
model.factorizations.nonzeroreg.data = {zeros(size(A)), ones(size(B))};
model.factorizations.nonzeroreg.regL1 = {'A', 'B'};
L2 regularization: a term \(\frac{1}{2}\|\mat{A}\|_{\text{F}}^2\) or \(\frac{1}{2}(\|\mat{A}\|_{\text{F}}^2 + \|\ten{B}\|_{\text{F}}^2 + ...)\) is added to factors, which can be matrices or tensors, using
model.factorizations.myreg.regL2 = 'A';
% or
model.factorizations.myreg.regL2 = {'A', 'B', ...};
The same remarks as for L1 regularization hold for L2 regularization as well.
L0 regularization: a term \(\|\vectorize(\mat{A})\|_{0}\) or \((\|\vectorize(\mat{A})\|_{0} + \|\vectorize(\ten{B})\|_{0} + ...)\) is added to factors, which can be matrices or tensors, using
model.factorizations.myreg.regL0 = 'A';
% or
model.factorizations.myreg.regL0 = {'A', 'B', ...};
For computational reasons, a smoothed variant of the L0 norm is used: \(\|\vectorize(\mat{A})\|_0\) is approximated by \(\sum_{i=1}^{I} 1 - \text{exp}(-\frac{a_i^2}{\sigma^2})\) with \(I\) the number of entries in \(\mat{A}\). The parameter \(\sigma\) can be a scalar constant or a function of the iteration k and the cell of regularized factors x. The value can be set using the sigma field:
model.factorizations.myreg.regL0 = 'A';
model.factorizations.myreg.sigma = 0.01; % (the default), or
model.factorizations.myreg.sigma = @(k,x) 0.01*frob(x{1}); % x is cell of factors
The same remarks as for L1 regularization hold for L0 regularization as well.
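As an illustration (not from the original text), the smoothed L0 value can be evaluated directly for a given matrix and \(\sigma\):

```matlab
% Evaluate the smoothed L0 norm sum_i (1 - exp(-a_i^2/sigma^2)) for a matrix.
A = [0 0 3; 0 -2 0];
sigma = 0.01;
smoothL0 = sum(1 - exp(-A(:).^2 / sigma^2))
% For sigma much smaller than the nonzero entries, each zero entry
% contributes 0 and each nonzero entry contributes almost 1, so the
% value is close to nnz(A) = 2, the exact number of nonzeros.
```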
Similar to the other optimization-based decomposition algorithms, the SDF algorithms sdf_minf, sdf_nls, ccpd_minf and ccpd_nls accept full, sparse, incomplete and structured tensors as data in the data field.
Structured tensors can, for example, be given in the LMLRA format, the tensor
train format, the Hankel format or the Loewner format. See Datasets: dense, incomplete, sparse and structured for
more information. (Structured tensors are not to be confused with structure
related to factors.) For structured tensors the maximal achievable numerical
accuracy is only half the machine precision due to the way the structure is
exploited by the algorithm. This is usually not a problem as noise, inexact
models, regularization or chosen tolerances limit the attainable accuracy. If
accuracy up to machine precision is required, the accuracy of the result sol of the decomposition of the structured tensor(s) can be improved using an additional refinement step. This refinement step uses sol.variables as the initial guess for the same problem, but all structured tensors are replaced by their expanded form. For example, if T is a structured tensor, we replace
model.factorizations.fac.data = T;
by
model.variables = sol.variables;
model.factorizations.fac.data = ful(T);
Then we run the solver again using the updated model. It is often useful to limit the maximum number of iterations, e.g., by setting MaxIter to 3 or 5. In case a high accuracy is required, High accuracy for NLS algorithms may be of interest.
Further reading¶
The remaining sections on structured data fusion handle other, more advanced
concepts of the SDF language. More concrete examples and useful tricks can be
found in Structured data fusion: examples. One section in the advanced chapter
(Structured data fusion: advanced concepts) is particularly useful:
Using indexed notations explains how the indexed notation can be used to
simplify models. This also allows the auto-wrapping feature to be used to its
fullest potential, and it makes it easier to create more dynamical models.
(Auto-wrapping basically means that most braces {} are added automatically in unambiguous cases.)