Computing Tucker via the HOSVD

Copyright 2022 National Technology & Engineering Solutions of Sandia,
LLC (NTESS). Under the terms of Contract DE-NA0003525 with NTESS, the
U.S. Government retains certain rights in this software.

Higher-order Singular Value Decomposition (HOSVD) and Sequentially-truncated HOSVD (ST-HOSVD)

The HOSVD computes a Tucker decomposition of a tensor via a simple process. For each mode \(k\), it computes the \(R_k\) leading left singular vectors of the mode-\(k\) unfolding of the tensor and stores them as the factor matrix \(U_k\). It then computes the ttm of the original tensor with all the factor matrices to yield the core of shape \((R_1, R_2, \ldots, R_d)\). The core and factor matrices are used to form the ttensor. The values of \(R_k\) that lead to a good approximation can be computed automatically to meet a specified error tolerance; this is recommended and is the default in our code.

The ST-HOSVD is an improvement on the HOSVD that performs the TTM in each mode before moving on to the next mode. This has the advantage of shrinking the tensor at each step, reducing the cost of subsequent computations. ST-HOSVD is the default in the hosvd code.

  • L. R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika, 31:279-311, 1966, http://dx.doi.org/10.1007/BF02289464

  • L. D. Lathauwer, B. D. Moor and J. Vandewalle, A multilinear singular value decomposition, SIAM J. Matrix Analysis and Applications, 21(4):1253-1278, 2000, http://dx.doi.org/10.1137/S0895479896305696

  • N. Vannieuwenhoven, R. Vandebril and K. Meerbergen, A New Truncation Strategy for the Higher-Order Singular Value Decomposition, SIAM J. Scientific Computing, 34(2):A1027-A1052, 2012, http://dx.doi.org/10.1137/110836067

import pyttb as ttb
import numpy as np

eps_machine = np.finfo(float).eps  # gets machine epsilon for floats, will be used later
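
To make the procedure described above concrete, here is a minimal sketch of the classic truncated HOSVD with prescribed ranks, using only numpy operations on the tensor's data plus the pyttb constructors that appear elsewhere in this notebook. The helper name hosvd_sketch is ours, purely for illustration; the library routine ttb.hosvd (used throughout below) additionally selects the ranks automatically from an error tolerance and applies the ST-HOSVD variant by default.

def hosvd_sketch(X, ranks):
    # Illustrative truncated HOSVD with prescribed ranks (not the library routine).
    factors = []
    for k in range(X.ndims):
        # Mode-k unfolding of the original tensor: shape (n_k, prod of other dims)
        Xk = np.moveaxis(X.data, k, 0).reshape(X.shape[k], -1)
        # The R_k leading left singular vectors become the factor matrix U_k
        Uk, _, _ = np.linalg.svd(Xk, full_matrices=False)
        factors.append(Uk[:, : ranks[k]])
    # Core = X times U_k^T in every mode, yielding shape (R_1, ..., R_d)
    core = X.data.copy()
    for k in range(X.ndims):
        core = np.moveaxis(
            np.tensordot(factors[k].T, np.moveaxis(core, k, 0), axes=1), 0, k
        )
    return ttb.ttensor(ttb.tensor(core), factors)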

Simple example of usage

# Create a random tensor with shape (10, 20, 30) built from a core with shape (2, 3, 4)
def generate_sample_tensor():
    np.random.seed(0)
    noise = 0.01
    core = ttb.tensor(np.random.rand(2, 3, 4), shape=(2, 3, 4))  # The core tensor.
    U = [
        np.random.rand(10, 2),
        np.random.rand(20, 3),
        np.random.rand(30, 4),
    ]  # The factor matrices.
    Soln = ttb.ttensor(core, U)  # Create the solution ttensor.
    Z = Soln.full()
    Rdm = ttb.tenrand((10, 20, 30))

    Data = Z + noise * Z.norm() * Rdm / Rdm.norm()

    return Soln, Data
S, X = generate_sample_tensor()
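
As a quick sanity check (this snippet is our addition, not part of the original example), the data tensor differs from the noise-free solution by exactly the 1% relative noise injected in generate_sample_tensor, which is why the 0.01 tolerance used next is achievable.

# Relative difference between the data and the noise-free solution; should equal
# the `noise` level (0.01) used in generate_sample_tensor above.
print(f"Noise level: {(X - S.full()).norm() / S.full().norm():.4f}")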

# Compute HOSVD with desired relative error = 0.01
T = ttb.hosvd(input_tensor=X, tol=0.01)

# Check shape of core
coreshape = T.core.shape
print(f"Shape of core: {coreshape}")

# Check relative error
relerr = (X - T.double()).norm() / X.norm()
print(f"Relative error: {relerr}")
WARNING:root:Selected no copy, but input data isn't F ordered so must copy.
Computing HOSVD...

Shape of core: (2, 3, 4)
||X-T||/||X|| =  0.00508066 <= 0.010000 (tol)
Shape of core: (2, 3, 4)
Relative error: 0.005080663859987105

Generate a core with different accuracies for different shapes

We will create a core tensor that is nearly block diagonal. The blocks decrease exponentially in norm, with the idea that we can pick off one block at a time as we increase the prescribed accuracy of the HOSVD. To do this, we define and use a function tenrandblk().

def tenrandblk(verbose=True):
    np.random.seed(0)
    # Block shapes (need not be cubic). Number of rows is the number
    # of levels and number of columns is the order of the tensor.
    bsz = np.array([[3, 2, 1], [2, 2, 2], [2, 3, 4]])

    # Squared norm of each block. Must be length L and sum to <= 1
    bns = np.array([0.9, 0.09, 0.009])

    # Find the end of each block
    bend = np.cumsum(bsz, 0)
    # Extract shape: L = # levels (rows), D = # dimensions (columns)
    L, D = np.shape(bsz)
    # Final shape
    gsz = bend[-1, :]

    ## Create tensor
    # Figure out norm of off-block-diagonal
    dltnrmsqr = 1 - np.sum(bns)
    # Create pattern for off-block-diagonal to be modified as we go
    dltpattern = np.ones(tuple(gsz))
    # Create tensor to fill in
    G = ttb.tenzeros(tuple(gsz))

    # Create random entries to use
    A = np.sign(np.random.randn(*gsz))
    B = 0.1 * np.random.rand(*gsz) + 0.9
    Grnd = ttb.tensor(
        np.sign(np.random.randn(*gsz)) * (0.1 * np.random.rand(*gsz) + 0.9)
    )

    # Loop through and create blocks
    for i in range(L):
        # Figure out ith block pattern
        blkrange = []
        for k in range(D):
            if i == 0:
                blkrange.append(np.arange(bend[i, k]))
            else:
                blkrange.append(np.arange(bend[i - 1, k], bend[i, k]))

        # Create pattern that has ones for the block
        pattern = np.zeros(tuple(gsz))
        ix = np.ix_(blkrange[2], blkrange[0], blkrange[1])
        pattern[ix] = 1

        # Zero out block in the off-diagonal pattern
        dltpattern[ix] = 0

        # Extract the random entries for this block and rescale to the prescribed norm
        block = Grnd * pattern
        sse = (block**2).collapse()
        block *= np.sqrt(bns[i] / sse)

        # Add to main tensor
        G += block

        # Verbose output
        if verbose:
            print(
                f"Created block with shape {tuple(bsz[i,:])} with norm ({block.norm()})^2 = {block.norm()**2}"
            )

    if dltnrmsqr > 0:
        # final pattern
        block = Grnd * dltpattern
        sse = (block**2).collapse()
        block *= np.sqrt(dltnrmsqr / sse)
        G += block
        if verbose:
            print(
                f"Created tensor with shape {tuple(gsz)} with off-block-diagonal norm ({block.norm()})^2 = {block.norm()**2}"
            )

    return G, bsz, bns
# Create core tensor with given block structure and norm 1
G, bsz, bns = tenrandblk()
Created block with shape (np.int64(3), np.int64(2), np.int64(1)) with norm (0.9486832980505139)^2 = 0.9000000000000001
Created block with shape (np.int64(2), np.int64(2), np.int64(2)) with norm (0.3)^2 = 0.09
Created block with shape (np.int64(2), np.int64(3), np.int64(4)) with norm (0.09486832980505137)^2 = 0.009
Created tensor with shape (np.int64(7), np.int64(7), np.int64(7)) with off-block-diagonal norm (0.031622776601683805)^2 = 0.0010000000000000007
print(f"Shape of G: {G.shape}")
Shape of G: (7, 7, 7)

Generate data tensor with core described above

We take the core G and embed it into a larger tensor X by using orthogonal transformations. The true (multilinear) rank of this tensor is equal to the shape of G.

from scipy.stats import ortho_group


def mat_rand_orth(N, seed=0):
    # Generate a random N x N real orthogonal matrix with a fixed seed.
    return ortho_group.rvs(N, random_state=seed)


# shape of X
xsz = np.array([20, 20, 20])

# Create orthogonal matrices
U = []
for k in np.arange(3):
    V = mat_rand_orth(xsz[k])[:, : G.shape[k]]
    U.append(V)

# Create X
X = ttb.ttensor(G, U).full()

# The norm should be unchanged
print(f"||X|| = {X.norm()}")
||X|| = 1.0
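
Each factor matrix has orthonormal columns, which is why the multilinear transform preserves the norm of G. A quick check (our addition, not from the original notebook):

# Each U[k] has orthonormal columns, so ||ttensor(G, U)|| = ||G|| = 1.
for k in range(3):
    print(np.allclose(U[k].T @ U[k], np.eye(G.shape[k])))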

Compute (full) HOSVD

We compute the ST-HOSVD using the hosvd method. We specify a tolerance close to machine precision. Ideally, it finds a core that is the same shape as G.

print("ST-HOSVD...")
T = ttb.hosvd(X, 2 * np.sqrt(eps_machine))
ST-HOSVD...
Computing HOSVD...

Shape of core: (7, 7, 7)
||X-T||/||X|| =  8.61086e-15 <= 0.000000 (tol)
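
Since the recovered core here has the same shape as G, the claim above can be captured with a simple assertion (an assumed follow-up check, not in the original notebook):

# At a tolerance near machine precision, ST-HOSVD recovers the true core shape.
assert T.core.shape == G.shape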

Compute low-rank HOSVD approximation

The norm squared of the first two blocks of G is 0.99, so specifying an error of 1e-2 should yield a core with shape \((4, 4, 3)\) (mileage may vary). However, the algorithm is conservative and may pick something larger. We can compensate by specifying a larger tolerance.
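
To see where the tolerances below come from (a small worked check of the reasoning above, our addition): keeping only the first two blocks discards roughly 1e-2 of the squared norm, so the corresponding relative error is its square root, which is why the tolerance is passed as np.sqrt(1e-2).

# Squared norm remaining after keeping the first two blocks (third block plus
# the off-block-diagonal part), and the matching relative error.
discarded = 1 - np.sum(bns[:2])
print(f"Discarded squared norm: {discarded:.3f} -> relative error: {np.sqrt(discarded):.3f}")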

# Using 1e-2 exactly is potentially too conservative...
print("Result with tol = sqrt(1e-2):")
T = ttb.hosvd(X, np.sqrt(1e-2), verbosity=1)

# But scaling by a small multiple (here ndims(X) = 3) usually works...
print("\nResult with tol = sqrt(3e-2):")
T = ttb.hosvd(X, np.sqrt(3e-2), verbosity=1)
Result with tol = sqrt(1e-2):
Computing HOSVD...

Shape of core: (5, 5, 5)
||X-T||/||X|| =  0.0753888 <= 0.100000 (tol)

Result with tol = sqrt(3e-2):
Computing HOSVD...

Shape of core: (3, 4, 4)
||X-T||/||X|| =  0.0987914 <= 0.173205 (tol)

Similarly, the norm squared of the first block of G is 0.9, so specifying an error of 1e-1 should result in a core with shape \((3, 2, 1)\).

# Using 1e-1 exactly is potentially too conservative...
print("Result with tol = sqrt(1e-1):")
T = ttb.hosvd(X, np.sqrt(1e-1), verbosity=1)

# But scaling by a small multiple (here ndims(X) = 3) usually works...
print("\nResult with tol = sqrt(3e-1):")
T = ttb.hosvd(X, np.sqrt(3e-1), verbosity=1)
Result with tol = sqrt(1e-1):
Computing HOSVD...

Shape of core: (2, 3, 3)
||X-T||/||X|| =  0.202337 <= 0.316228 (tol)

Result with tol = sqrt(3e-1):
Computing HOSVD...

Shape of core: (1, 2, 2)
||X-T||/||X|| =  0.316125 <= 0.547723 (tol)

Verbosity - Getting more or less information.

Setting the verbosity to zero suppresses all output. Cranking up the verbosity gives some insight into the decision-making process…
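
For reference (our own computation, mirroring the "eigenvalue sum threshold" line printed in the verbose output below), the per-mode cutoff is tol^2 ||X||^2 / d:

# With tol = sqrt(3e-1), ||X|| = 1, and d = 3 modes, the threshold is 0.3 / 3 = 0.1.
tol = np.sqrt(3e-1)
print(f"threshold = {tol**2 * X.norm()**2 / X.ndims:.4f}")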

Example 1

T = ttb.hosvd(X, tol=np.sqrt(3e-1), verbosity=10)
Computing HOSVD...

||X||^2 =  1
tol =  0.547723
eigenvalue sum threshold = tol^2 ||X||^2 / d =  0.1
Reverse cumulative sum of evals of Gram matrix:
 0:  1.0000 <-- Cutoff
 1:  0.0999
 2:  0.0322
 3:  0.0095
 4:  0.0055
 5:  0.0023
 6:  0.0004
 7: -0.0000
 8: -0.0000
 9: -0.0000
 10: -0.0000
 11: -0.0000
 12: -0.0000
 13: -0.0000
 14: -0.0000
 15: -0.0000
 16: -0.0000
 17: -0.0000
 18: -0.0000
 19: -0.0000
Reverse cumulative sum of evals of Gram matrix:
 0:  0.9001
 1:  0.3140 <-- Cutoff
 2:  0.0001
 3:  0.0000
 4:  0.0000
 5:  0.0000
 6:  0.0000
 7:  0.0000
 8:  0.0000
 9: -0.0000
 10: -0.0000
 11: -0.0000
 12: -0.0000
 13: -0.0000
 14: -0.0000
 15: -0.0000
 16: -0.0000
 17: -0.0000
 18: -0.0000
 19: -0.0000
Reverse cumulative sum of evals of Gram matrix:
 0:  0.9001
 1:  0.3139 <-- Cutoff
 2: -0.0000
 3: -0.0000
 4: -0.0000
 5: -0.0000
 6: -0.0000
 7: -0.0000
 8: -0.0000
 9: -0.0000
 10: -0.0000
 11: -0.0000
 12: -0.0000
 13: -0.0000
 14: -0.0000
 15: -0.0000
 16: -0.0000
 17: -0.0000
 18: -0.0000
 19: -0.0000
Shape of core: (1, 2, 2)
||X-T||/||X|| =  0.316125 <= 0.547723 (tol)

Example 2

T = ttb.hosvd(X, tol=3 * np.sqrt(eps_machine), verbosity=10)
Computing HOSVD...

||X||^2 =  1
tol =  4.47035e-08
eigenvalue sum threshold = tol^2 ||X||^2 / d =  6.66134e-16
Reverse cumulative sum of evals of Gram matrix:
 0:  1.0000
 1:  0.0999
 2:  0.0322
 3:  0.0095
 4:  0.0055
 5:  0.0023
 6:  0.0004 <-- Cutoff
 7: -0.0000
 8: -0.0000
 9: -0.0000
 10: -0.0000
 11: -0.0000
 12: -0.0000
 13: -0.0000
 14: -0.0000
 15: -0.0000
 16: -0.0000
 17: -0.0000
 18: -0.0000
 19: -0.0000
Reverse cumulative sum of evals of Gram matrix:
 0:  1.0000
 1:  0.4137
 2:  0.0996
 3:  0.0317
 4:  0.0093
 5:  0.0040
 6:  0.0001 <-- Cutoff
 7: -0.0000
 8: -0.0000
 9: -0.0000
 10: -0.0000
 11: -0.0000
 12: -0.0000
 13: -0.0000
 14: -0.0000
 15: -0.0000
 16: -0.0000
 17: -0.0000
 18: -0.0000
 19: -0.0000
Reverse cumulative sum of evals of Gram matrix:
 0:  1.0000
 1:  0.4137
 2:  0.0997
 3:  0.0322
 4:  0.0094
 5:  0.0052
 6:  0.0021 <-- Cutoff
 7:  0.0000
 8: -0.0000
 9: -0.0000
 10: -0.0000
 11: -0.0000
 12: -0.0000
 13: -0.0000
 14: -0.0000
 15: -0.0000
 16: -0.0000
 17: -0.0000
 18: -0.0000
 19: -0.0000
Shape of core: (7, 7, 7)
||X-T||/||X|| =  8.61086e-15 <= 0.000000 (tol)

Specify the ranks

If you know the ranks you want, you can specify them. But there is no guarantee that they will satisfy the specified tolerance; in that case, the method will issue a warning.

# Rank is okay
T = ttb.hosvd(X, tol=np.sqrt(3e-1), ranks=bsz[0, :].tolist())

# Rank is too small for the specified error
T = ttb.hosvd(X, tol=np.sqrt(3e-1), ranks=[1, 1, 1])

# But you can set the error to the tensor norm to make the warning go away
T = ttb.hosvd(X, tol=X.norm(), ranks=[1, 1, 1])
Computing HOSVD...

Shape of core: (4, 3, 2)
||X-T||/||X|| =  0.316018 <= 0.547723 (tol)
Computing HOSVD...

Shape of core: (2, 2, 2)
||X-T||/||X|| =  0.316111 <= 0.547723 (tol)
Computing HOSVD...

Shape of core: (2, 2, 2)
||X-T||/||X|| =  0.316111 <= 1.000000 (tol)

Specify the mode order

It’s also possible to specify the order of the modes. The default is np.arange(X.ndims).

reverse_dimorder = np.arange(X.ndims).tolist()[::-1]
T = ttb.hosvd(X, tol=np.sqrt(3e-1), dimorder=reverse_dimorder)
Computing HOSVD...

Shape of core: (1, 2, 2)
||X-T||/||X|| =  0.316129 <= 0.547723 (tol)

Generate bigger data tensor with core described above

Uses the same procedure as before, but now the shape is bigger.

# shape of X
xsz = np.array([100, 100, 100])

# Create orthogonal matrices
U = []
for k in np.arange(3):
    V = mat_rand_orth(xsz[k])[:, : G.shape[k]]
    U.append(V)

# Create X
Y = ttb.ttensor(G, U).full()

# The norm should be unchanged
print(f"||Y|| = {Y.norm()}")
||Y|| = 0.9999999999999998

ST-HOSVD compared to HOSVD

The answers are essentially the same for the sequentially-truncated HOSVD and the HOSVD…

print("ST-HOSVD...")
T = ttb.hosvd(Y, tol=2 * np.sqrt(eps_machine))
print("\nHOSVD...")
T = ttb.hosvd(Y, tol=2 * np.sqrt(eps_machine), sequential=False)
ST-HOSVD...
Computing HOSVD...
Shape of core: (7, 7, 7)
||X-T||/||X|| =  5.49618e-15 <= 0.000000 (tol)

HOSVD...
Computing HOSVD...
Shape of core: (7, 7, 7)
||X-T||/||X|| =  5.62518e-15 <= 0.000000 (tol)

But ST-HOSVD may be slightly faster than HOSVD for larger tensors.

print("Time for 5 runs of ST-HOSVD:")
%timeit -n5 T = ttb.hosvd(Y, tol=2 * np.sqrt(eps_machine), verbosity=0)
Time for 5 runs of ST-HOSVD:
114 ms ± 24.5 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)
print("\nTime for 5 runs of HOSVD:")
%timeit -n5 T = ttb.hosvd(Y, tol=2 * np.sqrt(eps_machine), sequential=False, verbosity=0)
Time for 5 runs of HOSVD:
187 ms ± 18.9 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)