Structured data fusion


Figure 39: Schematic of structured data fusion. The vector \(z_1\), upper triangular matrix \(z_2\) and full matrix \(z_3\) are transformed into a Toeplitz, orthogonal and nonnegative matrix, respectively. The resulting factors are then used to jointly factorize two coupled data sets.

Structured data fusion or SDF [36] comprises a modeling language, a syntax checker and a number of solvers to jointly factorize one or more datasets, impose symmetry in the factorizations, and optionally impose structure on the different factors. Each dataset — stored as a dense, sparse, incomplete or structured tensor, cf. Section Datasets: dense, incomplete, sparse and structured — in a data fusion problem can be factorized with a different tensor decomposition. Currently, the user can choose from the CPD, LL1, LMLRA and BTD models, as well as L0, L1 and L2 regularization terms. Structure can be imposed on the factors in a modular way, and the user can choose from a library of predefined structures such as nonnegativity, orthogonality, Hankel, Toeplitz, Vandermonde, matrix inverse, and many more. See Contents.m for a complete list. By selecting the right structures, you can even compute classical matrix decompositions such as the QR factorization, eigenvalue decomposition and singular value decomposition.

This section gives a basic introduction to SDF by explaining how it works and how different models can be constructed. Specific examples are given in Structured data fusion: examples. After that, more advanced concepts, which enable the creation of more dynamic models, are introduced in Structured data fusion: advanced concepts. Structured data fusion: implementing a transformation explains how a new transformation can be defined to impose a new kind of structure on a factor. Finally, a more technical specification of the modeling language is given in Structured data fusion: specification of domain specific language.

Elements of an SDF model

The structured data fusion framework uses a domain specific language allowing one to express the coupling between datasets and the structure of different factors easily. Figure 39 illustrates the rationale behind the language by showing how an SDF model is built. Structure can be imposed on factors by transforming variables \(z\) into factors \(x\). The resulting factors can then be used in factorizations of different datasets. Coupling between the datasets is expressed by using one or more common factors in the different factorizations. Similarly, symmetry can be imposed by using a factor multiple times in the same factorization.

Hence, there are three important elements in an SDF model: variables, factors and factorizations. To compute a rank-\(R\) CPD of a third-order tensor \(\ten{T}\) for example, the following code can be used:

R = 5;
T = ful(cpd_rnd([10 11 12], R));
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,2), R);
model.variables.c = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};

It is advised to write model = struct; before a model is constructed, in order to be sure that no elements from a previous model are used. In this model, the value provided to a variable is the initial guess for that particular variable. The line model.factors.A simply states that a factor with the name A is created, and that it is equal to the variable a. The 'a' is called a (named) reference. Similarly, a (named) factorization tensor is created, which factorizes the tensor T given in the data field as a cpd with factors A, B and C. A factorization can only use references to factors (unless implicit factors are enabled, see Structured data fusion: advanced concepts). Apart from the CPD, other factorizations can be used as well, e.g., btd, ll1, lmlra, regL0, regL1 and regL2 (see Factorization and regularization types).

Figure 39 shows a typical case of structured data fusion. Two datasets \(\ten{T}^{(1)}\) and \(\ten{T}^{(2)}\) are jointly factorized. Both datasets share the factor matrices \(x_2\) and \(x_3\). Each factor matrix has a different structure: \(x_1\) has a Toeplitz structure, \(x_2\) has orthonormal columns and \(x_3\) is nonnegative. In the SDF model, every structure is achieved by transforming some variables into factors. The first factor \(x_1\) is created by transforming a vector \(z_1\) into a Toeplitz matrix. The orthogonal factor \(x_2\) is obtained by transforming \(z_2\), a (vectorized) upper triangular matrix containing the parameters of Householder reflectors. Finally, squaring the variables in \(z_3\) results in a nonnegative factor \(x_3\).

For an overview of the different transformations or structures that can be applied, we refer to help Contents.m. All functions starting with struct_ are transformations.
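As an illustration, the coupled model of Figure 39 can be written in the SDF language as follows. This is only a skeleton with arbitrarily chosen dimensions: for brevity, only the nonnegativity of \(x_3\) is imposed; the Toeplitz and orthogonality structure of \(x_1\) and \(x_2\) would be imposed analogously with struct_toeplitz and struct_orth (see their help text for the parametrization of the underlying variables they expect).

% Skeleton of the Figure 39 model (dimensions chosen arbitrarily).
T1 = randn(10, 11, 12);   % first dataset
T2 = randn(11, 12);       % second dataset, coupled through x2 and x3
R  = 4;
model = struct;
model.variables.z1 = randn(10, R);
model.variables.z2 = randn(11, R);
model.variables.z3 = randn(12, R);
model.factors.x1 = 'z1';                    % Toeplitz in Figure 39
model.factors.x2 = 'z2';                    % orthogonal in Figure 39
model.factors.x3 = {'z3', @struct_nonneg};  % nonnegative
model.factorizations.first.data  = T1;
model.factorizations.first.cpd   = {'x1', 'x2', 'x3'};
model.factorizations.second.data = T2;
model.factorizations.second.cpd  = {'x2', 'x3'};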

Checking and solving a model

Starting from Tensorlab 3.0, an overview of the factors and factorizations in an SDF model can be printed after interpretation by the language parser. This way, the different transformations can be inspected, and the user can verify whether the model is interpreted as intended. The language parser, or syntax and consistency checker, is called sdf_check. This method interprets the model and converts it to the internal format. If a syntax error is found, e.g., because no variables are defined or because a reference to a non-existing factor is made, an informative error is thrown. For example:

R = 5;
T = ful(cpd_rnd([10 11 12], R));
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,2), R);
model.variables.c = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'D'};

sdf_check(model);
Error in line model.factorizations.tensor.cpd:

    Factorization tensor (id = 1) depends on the unknown factor D

In many cases, some parts of the error message are underlined, indicating that this is clickable (only available in the Matlab Command Window). For example, clicking on model.factorizations.tensor.cpd indeed gives the content of the line containing the error:

ans =
 'A'    'B'    'D'

If a consistency error is found, an error is thrown (using sdf_check(model)), or a message is printed in the model overview if the 'print' option is provided (sdf_check(model, 'print')). Examples of consistency errors are that the number of columns in the factor matrices is not the same in the case of a CPD, or that the size of the tensor generated by the factors does not match the size of the data. Consider the following model:

R = 5;
T = ful(cpd_rnd([10 11 12], R));
model = struct;
model.variables.a = randn(11, R);
model.variables.b = randn(10, R);
model.variables.c = randn(12, R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};

sdf_check(model,'print');
Factor               Size                 Comment
------------------------------------------------------------------------
A                    [11x5]
B                    [10x5]
C                    [12x5]

Factorization        Type       Factors              Comments
------------------------------------------------------------------------
tensor (id = 1)      cpd        A,B,C                Dimension mismatch

The comment for the factorization states Dimension mismatch. In the Matlab Command Window, this is a link and when clicked, the following error is shown:

Error in line model.factorizations.tensor.cpd:

    Factorization tensor (id = 1): the size of the tensor generated by the
    factors [11x10x12] should match the size of the data [10x11x12]

This is indeed the problem. For more complicated models, this often helps to locate errors. If sdf_check(model, 'print') does not give an error, the available solvers in Tensorlab should run without problems. If there is a syntax or consistency error, the solvers will not run, as the model is invalid (sdf_check is called internally as well).

When a model is valid, i.e., no syntax or consistency errors are found, it can be computed using one of the following solvers:

  • sdf_minf or sdf_nls: a general purpose solver for SDF models.
  • ccpd_minf or ccpd_nls: a specialized solver for symmetric and/or coupled CPDs only, in which no structure, constants or concatenations are allowed. These solvers are recommended when they can be used.

The solvers can be called as follows:

R = 5;
T = ful(cpd_rnd([10 11 12], R));
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,2), R);
model.variables.c = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};

sdf_check(model, 'print'); % recommended

sol = sdf_nls(model);

The solution is a struct containing a variables field and a factors field with the results:

sol.variables
ans =
    a: [10x5 double]
    b: [11x5 double]
    c: [12x5 double]
sol.factors
ans =
    A: [10x5 double]
    B: [11x5 double]
    C: [12x5 double]

The ccpd_minf and ccpd_nls solvers use a different output format for the solution. As there are no transformations on the variables, only the factors are returned as a cell array, e.g., for the model above:

Ures = ccpd_nls(model)
Ures =
 [10x5 double]    [11x5 double]    [12x5 double]

Similar to the other optimization routines, extra options can be provided, and extra information on the convergence of the algorithm can be inspected using:

% Using key-value pairs
[sol, output] = sdf_nls(model, 'Display', 10, 'MaxIter', 200);

% Using an options structure
options = struct;
options.Display = 10;
options.MaxIter = 200;
[sol, output] = sdf_nls(model, options);

The output contains convergence information such as the number of iterations carried out. It also contains the absolute and relative errors for each factorization in output.abserr and output.relerr. The errors are computed as the Frobenius norm of the difference between the input data and the tensor generated by the model; for the example above, output.abserr = frob(T - cpdgen({sol.factors.A, sol.factors.B, sol.factors.C})).
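These errors can also be recomputed manually. For the CPD example above, the following sketch verifies the reported values:

[sol, output] = sdf_nls(model);
% Recompute the errors of the factorization by hand:
res    = frob(T - cpdgen({sol.factors.A, sol.factors.B, sol.factors.C}));
abserr = res;            % should match output.abserr
relerr = res / frob(T);  % should match output.relerr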

Imposing coupling and symmetry

The data fusion part in Structured Data Fusion means that different datasets can be factorized or decomposed together while sharing factors or underlying variables. In this section a basic coupling between datasets is discussed, followed by a discussion on how different weights (or hyperparameters) can be set. Finally, symmetric decompositions are discussed.

Note

The following examples are meant to explain the different modeling techniques. Actually solving the models may not yield a good solution, as the examples do not necessarily reflect good numerical choices and the initialization strategies are not discussed.

Coupling datasets

Consider the following coupled tensor-matrix factorization problem:

\[\begin{align} \min_{\mat{A},\mat{B},\mat{C},\mat{D}} \frac{1}{2} \norm{\ten{T} - \ten{M}_{\text{CPD}}(\mat{A},\mat{B},\mat{C})}_{\text{F}}^2 + \frac{1}{2} \norm{\mat{M} - \ten{M}_{\text{CPD}}(\mat{C},\mat{D})}_{\text{F}}^2 \end{align}\]

Note that \(\ten{M}_{\text{CPD}}(\mat{C},\mat{D}) = \mat{C}\mat{D}^{\T}\). This can be modeled easily using SDF by defining two factorizations that have a factor in common:

T = randn(10,11,12);
M = randn(12,13);
R = 4;
model = struct;
model.variables.a = randn(size(T,1),R);
model.variables.b = randn(size(T,2),R);
model.variables.c = randn(size(T,3),R);
model.variables.d = randn(size(M,2),R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factors.D = 'd';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd  = {'C', 'D'};

sdf_check(model, 'print');

As can be seen below, the two factorizations indeed share the common factor matrix \(\mat{C}\):

Factor               Size                 Comment
------------------------------------------------------------------------
A                    [10x4]
B                    [11x4]
C                    [12x4]
D                    [13x4]

Factorization        Type       Factors              Comments
------------------------------------------------------------------------
tensor (id = 1)      cpd        A,B,C
matrix (id = 2)      cpd        C,D

Absolute and relative weights

Often different weights are needed for the different factorization terms, i.e., the problem

\[\begin{align} \min_{\mat{A},\mat{B},\mat{C},\mat{D}} \frac{\lambda_1}{2} \norm{\ten{T} - \ten{M}_{\text{CPD}}(\mat{A},\mat{B},\mat{C})}_{\text{F}}^2 + \frac{\lambda_2}{2} \norm{\mat{M} - \ten{M}_{\text{CPD}}(\mat{C},\mat{D})}_{\text{F}}^2 \end{align}\]

is solved instead. Using SDF the weights \(\lambda_1\) and \(\lambda_2\) can be given as absolute weights and as relative weights. Absolute weights can be provided using the weight field:

% variables and factors omitted
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};
model.factorizations.tensor.weight = lambda1;
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd  = {'C', 'D'};
model.factorizations.matrix.weight = lambda2;

Note that the factor \(\frac{1}{2}\) in the objective function is present for implementation purposes. Therefore, lambda1 is equal to \(\lambda_1\) and not \(\frac{\lambda_1}{2}\). To get a term \(\norm{\ten{T} - \ten{M}_{\text{CPD}}(\mat{A}, \mat{B}, \mat{C})}_{\text{F}}^2\), \(\lambda_1\) should be set to 2.
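For instance, to minimize the unscaled terms \(\norm{\ten{T} - \ten{M}_{\text{CPD}}(\mat{A},\mat{B},\mat{C})}_{\text{F}}^2\) and \(\norm{\mat{M} - \ten{M}_{\text{CPD}}(\mat{C},\mat{D})}_{\text{F}}^2\), both weights can be set to 2:

% Weights of 2 cancel the built-in factor 1/2 in both terms:
model.factorizations.tensor.weight = 2;
model.factorizations.matrix.weight = 2;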

Often, it is easier to use relative weights. The relative weights \(\rho_1\) and \(\rho_2\) can be given as

% variables and factors omitted
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};
model.factorizations.tensor.relweight = rho1;
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd  = {'C', 'D'};
model.factorizations.matrix.relweight = rho2;

Relative weights \(\rho_i\) are automatically translated to absolute weights using the following expression:

\[\lambda_i = 2 \frac{\rho_i}{\sum_j \rho_j} \frac{1}{N_i}\]

in which \(N_i\) is the number of known entries in tensor Ti, i.e., Ni = prod(getsize(Ti)), unless Ti is incomplete, in which case Ni = length(Ti.val). For regularization terms, \(N_i = \sum_f N_i^{(f)}\), in which \(N_i^{(f)}\) is the number of entries in factor \(f\) after expansion. For example:

model = struct;
model.variables.a = rand(10, 5);
model.variables.b = rand(1, 5);
model.factors.A = 'a';
model.factors.B = {'b', @struct_diag};
model.factorizations.reg.regL1 = {'A', 'B'};

Factors A and B are regularized. Factor A has size \(10\times5\). Therefore, \(N_1^{(A)} = 50\). After expansion, factor B has size \(5\times5\) (see sdf_check(model, 'print')). Hence, \(N_1^{(B)} = 25\). Together we have \(N_1 = 75\).
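As a check, the conversion formula can be reproduced by hand for this model. Assuming this regularization term is the only factorization in the model, \(\rho_1/\sum_j \rho_j = 1\):

% N_1 = numel(A) + numel(expanded B) = 50 + 25 = 75
N1 = 10*5 + 5*5;
lambda1 = 2 * 1/N1   % = 2/75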

Absolute and relative weights cannot be mixed, i.e., either all factorizations define a weight field (or none), or all factorizations define a relweight field (or none). Omitted (relative) weights are assumed to be 1. If neither absolute nor relative weights are given, relative weights equal to 1 are assumed.

Imposing symmetry

Finally, some special attention is given to decompositions involving symmetry. Symmetry can easily be imposed by using the same factor multiple times in a factorization. For example, consider a fourth-order tensor \(\ten{T}\) that is symmetric in the first two modes and in the last two modes:

\[\begin{align} \min_{\mat{A},\mat{B}} \frac{1}{2} \norm{\ten{T} - \ten{M}_{\text{CPD}}(\mat{A},\mat{A},\mat{B},\mat{B})}_{\text{F}}^2 \end{align}\]

This can be implemented as:

R  = 5;
% create symmetric tensor
U0 = cpd_rnd([10 11], R);
U0 = U0([1 1 2 2]);
T  = cpdgen(U0);
% Create model
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factorizations.symm.data = T;
model.factorizations.symm.cpd  = {'A', 'A', 'B', 'B'};

Often when imposing symmetry, especially symmetry in all modes, it is useful to give each rank-1 term a weight \(\alpha_r\):

\[\ten{T} \approx \sum_{r=1}^{R} \alpha_r \vec{u}^{(1)}_r \op \vec{u}^{(2)}_r \op \cdots \op \vec{u}^{(N)}_r.\]

This weight \(\alpha_r\) allows some extra freedom. For example, in the case of a fourth-order tensor and a factorization \(\ten{M}_{\text{CPD}}(\mat{A},\mat{A},\mat{A},\mat{A})\), the diagonal entries of the result are always nonnegative, while it is perfectly possible to have a fully symmetric tensor with negative diagonal entries. Another example imposes orthogonality structure (struct_orth) on all factors \(\mat{A}\) in \(\ten{M}_{\text{CPD}}(\mat{A},\mat{A},\mat{A})\). As struct_orth normalizes each column to have norm one, each rank-1 term also has norm one, which can result in a bad fit. In these two examples, adding the weight \(\alpha_r\) is important to obtain a meaningful fit. Adding the weights \(\alpha_r\) can easily be done in Tensorlab by creating an additional factor matrix of size \(1\times R\), i.e., a row vector:

R  = 5;
% create symmetric tensor
U0 = cpd_rnd([10 11], R);
U0 = U0([1 1 2 2]);
T  = cpdgen(U0);
% Create model
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.b = randn(size(T,3), R);
model.variables.c = randn(1, R);
model.factors.A = 'a';
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.symm.data = T;
model.factorizations.symm.cpd  = {'A', 'A', 'B', 'B', 'C'};

This is a valid model, as can be verified using sdf_check(model, 'print'). The reason is that \(\ten{T}\) can be seen as a tensor of size \(10\times10\times11\times11\times1\).
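After solving the model, the weights \(\alpha_r\) are simply the entries of the \(1\times R\) factor C:

sol = sdf_nls(model);
alpha = sol.factors.C;   % 1 x R row vector with the weights alpha_r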

Imposing structure on factors

Structure by transformation of variables

Using the SDF language, it is easy to create structured factors. The way to do so is by transforming a variable into a factor using some transformation. Some examples:

  • Nonnegativity can be imposed on a factor by squaring underlying variables (struct_nonneg) [37].
  • A factor matrix in which each column is a polynomial evaluated in some points can be modeled by the coefficients of these polynomials. These coefficients are the variables (struct_poly).
  • Orthogonality can be imposed using Householder reflectors as underlying variables (struct_orth).

See help Contents.m to get an overview of all implemented transformations. All functions starting with struct_ are transformations. New transformations can be defined as well (see Structured data fusion: implementing a transformation).

In Tensorlab, structure can be imposed on factors as follows. For example, to impose nonnegativity on factors A and B the following model can be used:

M = rand(10,10);
model = struct;
model.variables.a = randn(10,3);
model.variables.b = randn(10,3);
model.factors.A = {'a', @struct_nonneg};
model.factors.B = {'b', @struct_nonneg};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd  = {'A', 'B'};

sdf_check(model, 'print');
Factor               Size                 Comment
------------------------------------------------------------------------
A                    [10x3]
B                    [10x3]

Factorization        Type       Factors              Comments
------------------------------------------------------------------------
matrix (id = 1)      cpd        A,B

In the Matlab Command Window, A and B can be clicked to get the different transformations applied to the factors, e.g. for A:

Expansion of factor A (id = 1) (view):
  (1,1)  a               [10x3]     struct_nonneg        Factor          [10x3]

When clicking a and Factor we see indeed that every number in a is squared:

ans =
 0.1383    0.9000   -0.6235
-0.3784   -1.5312   -1.3501
 0.1431    0.5046   -1.1622
 1.6050   -0.8642   -0.9443
 1.3491   -0.3766   -0.6712
-0.4484    0.7880    0.5767
 0.1684    0.2982   -2.0858
-1.1205   -0.1637    0.2360
 0.4007    0.6067   -0.7784
 0.7391    1.6345    1.0996
ans =
 0.0191    0.8101    0.3887
 0.1432    2.3446    1.8228
 0.0205    0.2546    1.3508
 2.5761    0.7469    0.8917
 1.8201    0.1418    0.4505
 0.2011    0.6210    0.3326
 0.0284    0.0889    4.3504
 1.2555    0.0268    0.0557
 0.1606    0.3681    0.6059
 0.5462    2.6716    1.2090

For some transformations, extra parameters are required. For example, to model a factor matrix in which each column is a polynomial, struct_poly can be used. struct_poly needs the evaluation points \(t_i\) for these polynomials as an extra parameter. In the example below, each column of factor A is a polynomial of maximal degree \(d\):

M = rand(11,10);
t = 0:0.1:1; % evaluation points
d = 4;       % degree
R = 3;       % rank
model = struct;
model.variables.c = randn(R,d+1); % note that c has R rows!
model.variables.b = randn(10,R);
model.factors.A = {'c', @(z,task) struct_poly(z,task,t)};
model.factors.B = {'b', @struct_nonneg};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd  = {'A', 'B'};

sdf_check(model, 'print');
Factor               Size                 Comment
------------------------------------------------------------------------
A                    [11x3]
B                    [10x3]

Factorization        Type       Factors              Comments
------------------------------------------------------------------------
matrix (id = 1)      cpd        A,B

As can be seen, A is an \(11\times 3\) matrix. If A is clicked, the effect of the transformation can be seen:

Expansion of factor A (id = 1) (view):
  (1,1)  c               [3x5]      struct_poly          Factor          [11x3]

Factor can be clicked again to get the resulting factor. Note that when a link, e.g., Factor or view, is clicked, the result becomes available as ans. Hence, the resulting factor A can be plotted using

A = ans;
plot(t, A);

Finally, different transformations can be chained. For example:

M = rand(11,10);
t = 0:0.1:1; % evaluation points
d = 4;       % degree
R = 3;       % rank
model = struct;
model.variables.c = randn(R,d+1); % note that c has R rows!
model.variables.b = randn(10,R);
model.factors.A = {'c', @(z,task) struct_poly(z,task,t), @struct_nonneg};
model.factors.B = {'b', @struct_nonneg};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd  = {'A', 'B'};

sdf_check(model, 'print');

Clicking A again allows us to view the individual transformations:

Expansion of factor A (id = 1) (view):
  (1,1)  c               [3x5]      struct_poly          temp            [11x3]
         temp            [11x3]     struct_nonneg        Factor          [11x3]

Any number of transformations can be chained, as long as the final result is a valid factor:

M = rand(11,10);
t = 0:0.1:1; % evaluation points
d = 4;       % degree
R = 5;       % rank

diagonal = @struct_diag;
poly     = @(z,task) struct_poly(z,task,t);
nonneg   = @struct_nonneg;
negate   = @(z,task) struct_matvec(z, task, [], -eye(R));

model = struct;
model.variables.c = randn(1,d+1); % c has 1 row; struct_diag expands it to (d+1)x(d+1)
model.variables.b = randn(10,R);
model.factors.A = {'c', diagonal, poly, nonneg, negate};
model.factors.B = {'b', nonneg};
model.factorizations.matrix.data = M;
model.factorizations.matrix.cpd  = {'A', 'B'};

sdf_check(model, 'print');

This example creates a factor A in which each column \(r\) is a negative polynomial of the form \(-(t^{r-1})^2\). For readability, shorter names for the transformations have been defined first.

Constant factors

Constant factors can be defined as follows. Consider the following CPD in which the second factor matrix is kept constant:

T = randn(10, 11, 12);
R = 5;
model = struct;
model.variables.a = randn(size(T,1), R);
model.variables.c = randn(size(T,3), R);
model.factors.A = 'a';
model.factors.B = rand(size(T,2), R);
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};

sdf_check(model, 'print');
Factor               Size                 Comment
------------------------------------------------------------------------
A                    [10x5]
B                    [11x5]               Constant
C                    [12x5]

Factorization        Type       Factors              Comments
------------------------------------------------------------------------
tensor (id = 1)      cpd        A,B,C

The factor B is indeed recognized as constant. Constants follow the same rules as variables. This means that constants can be cells or numerical arrays of arbitrary dimensions (except a scalar, see Structured data fusion: specification of domain specific language), and that constants can be transformed.
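As constants follow the same rules as variables, a constant factor can also be transformed. For example, a constant nonnegative factor could be written as:

% B is the entrywise square of a fixed matrix and is never updated:
model.factors.B = {rand(size(T,2), R), @struct_nonneg};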

Subfactors explains how partially constant factors can be constructed. Here, we only consider a special case in which specific entries are constant. As an example, let the strictly upper triangular part of one factor matrix, say A, be fixed to zero. The following initial value is created:

A = randn(10,5);
[i,j] = ndgrid(1:10,1:5);
A(j > i) = 0
A =
 2.0684         0         0         0         0
 0.1294   -0.1904         0         0         0
 0.5001   -0.1355   -0.3705         0         0
 1.9050    0.8140   -0.3259   -0.5112         0
 0.5555    1.0007    1.6666   -0.4472   -1.8578
-0.4579    0.2320    1.2689   -1.6985   -0.7134
-0.1454   -0.0285   -1.0390   -0.8552    0.7025
-0.2466   -0.9286   -0.8979    1.6053   -0.0543
-0.3119   -1.6001    0.7471    1.5946   -0.2169
 1.0205   -0.7063   -1.3862    1.1280   -1.6644

To keep the strictly upper triangular part constant, the transformation struct_const can be used with an additional mask argument (see help struct_const). This mask is true everywhere except for the entries in the strictly upper triangular part:

mask = true(size(A));
mask(j > i) = false;

The model can now be created as:

T = randn(10, 11, 12);
model = struct;
model.variables.a = A;
model.variables.b = randn(11, 5);
model.variables.c = randn(12, 5);
model.factors.A = {'a', @(z,task) struct_const(z,task,mask)};
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};

The tensor decomposition is now computed in the same way as in the unconstrained case, with the exception that the derivatives of the entries in the strictly upper triangular part of A are explicitly set to zero. The entries in the strictly upper triangular part therefore stay the same as in the initial value.
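After solving the model, one can verify that the constant entries indeed kept their initial value of zero, e.g., using the index grids i and j from above:

sol = sdf_nls(model);
% The strictly upper triangular part of A should still be zero:
max(abs(sol.factors.A(j > i)))   % expected to be 0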

Subfactors

In the previous examples, we have assumed that every factor consists of a single subfactor. Factors may, however, consist of an arbitrary number of subfactors. Consider the following example, in which one factor has a block-diagonal structure:

T = randn(10, 11, 12);
model = struct;
model.variables.a11 = randn(5,2);
model.variables.a22 = randn(5,3);
model.variables.b   = randn(11,5);
model.variables.c   = randn(12,5);
model.factors.A = {'a11', zeros(5,3); zeros(5,2), 'a22'};
model.factors.B = 'b';
model.factors.C = 'c';
model.factorizations.tensor.data = T;
model.factorizations.tensor.cpd  = {'A', 'B', 'C'};

sdf_check(model, 'print');
Factor               Size                 Comment
------------------------------------------------------------------------
A                    [10x5]               Partially constant
B                    [11x5]
C                    [12x5]

Factorization        Type       Factors              Comments
------------------------------------------------------------------------
tensor (id = 1)      cpd        A,B,C

The factor A is partially constant as the off-diagonal blocks do not depend on any variable. When clicking on A in the Matlab Command Window, an overview for this factor matrix is printed:

Expansion of factor A (id = 1) with structure [2x2] (view):
  (1,1)  a11             [5x2]
  (2,1)  (Constant)      [5x2]
  (1,2)  (Constant)      [5x3]
  (2,2)  a22             [5x3]

This overview correctly states that A consists of a \(2\times2\) grid of subfactors. Each entry in this \(2\times2\) structure can be accessed individually as before. It is also possible to impose structure on a subfactor, e.g., nonnegativity can be imposed on block a11 as follows:

model.factors.A = { {'a11', @struct_nonneg}, zeros(5,3);
                    zeros(5,2),              'a22'       };

Running sdf_check(model, 'print') again, and clicking on A gives the expected output:

Expansion of factor A (id = 1) with structure [2x2] (view):
  (1,1)  a11             [5x2]      struct_nonneg        Factor          [5x2]
  (2,1)  (Constant)      [5x2]
  (1,2)  (Constant)      [5x3]
  (2,2)  a22             [5x3]

In the previous examples, all factors are actually matrices or vectors. A factor can be a tensor as well, and it is also possible to concatenate subfactors along modes higher than two, for example:

T = randn(10, 11, 12);
model = struct;
model.variables.u  = randn(10,2);
model.variables.v  = randn(11,3);
model.variables.w  = randn(12,3);
model.variables.s1 = randn(2,3,2);
model.variables.s2 = randn(2,3,1);
model.factors.U = 'u';
model.factors.V = 'v';
model.factors.W = 'w';
model.factors.S = reshape({'s1', 's2'}, 1, 1, 2);
model.factorizations.tensor.data  = T;
model.factorizations.tensor.lmlra = {'U','V','W','S'};

sdf_check(model, 'print');

In the expansion of S (click on S), we see that the parts s1 and s2 are indeed concatenated in the third mode (recall that matrix s2 is a \(2\times3\times1\) tensor):

Expansion of factor S (id = 4) with structure [1x1x2] (view):
  (1,1,1)  s1              [2x3x2]
  (1,1,2)  s2              [2x3]

Factorization and regularization types

Currently, the SDF language supports four factorization types (CPD, BTD, LMLRA and the decomposition in multilinear rank-\((L_r,L_r,1)\) terms) and three regularization types (L0, L1 and L2 regularization). The different models and factorization types are explained in their respective sections: Canonical polyadic decomposition, Block term decomposition, Multilinear singular value decomposition and low multilinear rank approximation and Decomposition in multilinear rank-\((L_r,L_r,1)\) terms. In the remainder of this section, we specify the format for the different factorization types:

  • CPD: a (canonical) polyadic decomposition of a tensor \(\ten{T}\) with factor matrices \(\mat{A}\), \(\mat{B}\), \(\mat{C}\), ... can be computed using

    model.factorizations.mycpd.data = T;
    model.factorizations.mycpd.cpd  = {'A', 'B', 'C', ...};
    

    In case \(\ten{T}\) is an incomplete tensor, specialized routines can be used with the sdf_nls solver. To activate these specialized routines, use

    model.factorizations.mycpd.data = T;
    model.factorizations.mycpd.cpdi = {'A', 'B', 'C', ...};
    
  • LMLRA: a low multilinear rank approximation of a tensor \(\ten{T}\) with factor matrices \(\mat{A}\), \(\mat{B}\), \(\mat{C}\), ... and core tensor \(\ten{S}\) can be computed using

    model.factorizations.mylmlra.data   = T;
    model.factorizations.mylmlra.lmlra  = {'A', 'B', 'C', ..., 'S'};
    % or
    model.factorizations.mylmlra.lmlra  = {{'A', 'B', 'C', ..., 'S'}};
    
  • BTD: an \(R\)-term block term decomposition of a tensor \(\ten{T}\) with factor matrices \(\mat{A}_r\), \(\mat{B}_r\), \(\mat{C}_r\), ... and core tensors \(\ten{S}_r\) for \(r = 1,\ldots,R\) can be computed using

    model.factorizations.mybtd.data = T;
    model.factorizations.mybtd.btd  = {{'A1', 'B1', 'C1', ..., 'S1'}, ...,
                                       {'AR', 'BR', 'CR', ..., 'SR'}};
    
  • LL1: a decomposition in multilinear rank-\((L_r,L_r,1)\) terms of a third-order tensor \(\ten{T}\) with factor matrices \(\mat{A}\), \(\mat{B}\), \(\mat{C}\) and L = [L1 L2 ... LR] can be computed using

    model.factorizations.myll1.data = T;
    model.factorizations.myll1.ll1  = {'A', 'B', 'C'};
    model.factorizations.myll1.L    = L;
    

    Note that the L field is obligatory in this case. Only the CPD format for a decomposition in multilinear rank-\((L_r,L_r,1)\) terms is accepted, see Decomposition in multilinear rank-\((L_r,L_r,1)\) terms.

In all cases above, both the data and the field that specifies the factorization type are obligatory.

Three regularization types are currently available, namely L1 and L2 regularization (available since Tensorlab 2.0) and L0 regularization (available since Tensorlab 3.0):

  • L1 regularization: a term \(\frac{1}{2}\|\vectorize(\mat{A})\|_1\) or \(\frac{1}{2}(\|\vectorize(\mat{A})\|_1 + \|\vectorize(\ten{B})\|_1 + ...)\) is added to factors, which can be matrices or tensors, using

    model.factorizations.myreg.regL1 = 'A';
    % or
    model.factorizations.myreg.regL1 = {'A', 'B', ...};
    

    All factors mentioned in a single regularization term get the same weight (see Absolute and relative weights). If different factors should have different weights, multiple regularization terms have to be added, e.g.,

    % A and B have the same weight
    model.factorizations.myreg1.regL1 = {'A', 'B'};
    model.factorizations.myreg1.relweight = 1;
    % C gets a different weight
    model.factorizations.myreg2.regL1 = 'C';
    model.factorizations.myreg2.relweight = 10;
    

    By default, a term is regularized towards zero. A term \(\frac{1}{2}(\|\vec{0} - \vectorize(\mat{A})\|_1 + \|\vec{1} - \vectorize(\mat{B})\|_1)\) can be added using

    model.factorizations.nonzeroreg.data = {zeros(size(A)), ones(size(B))};
    model.factorizations.nonzeroreg.regL1 = {'A', 'B'};
    
  • L2 regularization: a term \(\frac{1}{2}\|\mat{A}\|_{\text{F}}^2\) or \(\frac{1}{2}(\|\mat{A}\|_{\text{F}}^2 + \|\ten{B}\|_{\text{F}}^2 + ...)\) is added to factors, which can be matrices or tensors, using

    model.factorizations.myreg.regL2 = 'A';
    % or
    model.factorizations.myreg.regL2 = {'A', 'B', ...};
    

    The same remarks as for L1 regularization hold for L2 regularization as well.

  • L0 regularization: a term \(\|\vectorize(\mat{A})\|_{0}\) or \((\|\vectorize(\mat{A})\|_{0} + \|\vectorize(\ten{B})\|_{0} + ...)\) is added to factors, which can be matrices or tensors, using

    model.factorizations.myreg.regL0 = 'A';
    % or
    model.factorizations.myreg.regL0 = {'A', 'B', ...};
    

    For computational reasons, a smoothed variant of the L0 norm is used: \(\|\vectorize(\mat{A})\|_0\) is approximated by \(\sum_{i=1}^{I} 1 - \text{exp}(-\frac{a_i^2}{\sigma^2})\) with \(I\) the number of entries in \(\mat{A}\). The parameter \(\sigma\) can be a scalar constant or a function of the iteration k and the cell of regularized factors x. The value can be set using the sigma field:

    model.factorizations.myreg.regL0 = 'A';
    model.factorizations.myreg.sigma = 0.01; % (the default), or
    model.factorizations.myreg.sigma = @(k,x) 0.01*frob(x{1}); % x is cell of factors
    

    The same remarks as for L1 regularization hold for L0 regularization as well.

Similar to the other optimization-based decomposition algorithms, the SDF algorithms sdf_minf, sdf_nls, ccpd_minf and ccpd_nls accept full, sparse, incomplete and structured tensors in the data field. Structured tensors can, for example, be given in the LMLRA format, the tensor train format, the Hankel format or the Loewner format. See Datasets: dense, incomplete, sparse and structured for more information. (Structured tensors are not to be confused with structure imposed on factors.) For structured tensors, the maximal achievable numerical accuracy is only half the machine precision due to the way the structure is exploited by the algorithm. This is usually not a problem, as noise, inexact models, regularization or the chosen tolerances limit the attainable accuracy.

If accuracy up to machine precision is required, the accuracy of the result sol of the decomposition of the structured tensor(s) can be improved using an additional refinement step. This refinement step uses sol.variables as initial guess for the same problem, but with all structured tensors replaced by their expanded form. For example, if T is a structured tensor, we replace

model.factorizations.fac.data = T;

by

model.variables = sol.variables;
model.factorizations.fac.data = ful(T);

Then we run the solver again using the updated model. It is often useful to limit the maximum number of iterations, e.g., by setting MaxIter to 3 or 5. In case high accuracy is required, the section High accuracy for NLS algorithms may be of interest.
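Putting this together, the refinement step can be sketched as follows, assuming a model with a single factorization fac of a structured tensor T:

% Step 1: decompose using the efficient structured representation.
model.factorizations.fac.data = T;        % T is structured
sol = sdf_nls(model);
% Step 2: warm-start from the solution and refine on the expanded
% tensor for a few iterations to attain full numerical accuracy.
model.variables = sol.variables;
model.factorizations.fac.data = ful(T);
sol = sdf_nls(model, 'MaxIter', 5);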

Further reading

The remaining sections on structured data fusion cover other, more advanced concepts of the SDF language. More concrete examples and useful tricks can be found in Structured data fusion: examples. One section in the advanced chapter (Structured data fusion: advanced concepts) is particularly useful: Using indexed notations explains how indexed notation can be used to simplify models. This also allows the auto-wrapping feature to be used to its fullest potential, and it makes it easier to create more dynamic models. (Auto-wrapping basically means that most braces {} are added automatically in unambiguous cases.)