Computing Tucker via the HOSVD
Copyright 2022 National Technology & Engineering Solutions of Sandia,
LLC (NTESS). Under the terms of Contract DE-NA0003525 with NTESS, the
U.S. Government retains certain rights in this software.
Higher-order Singular Value Decomposition (HOSVD) and Sequentially-truncated HOSVD (ST-HOSVD)
The HOSVD computes a Tucker decomposition of a tensor via a simple process. For each mode \(k\), it computes the \(R_k\) leading left singular vectors of the mode-\(k\) matrix unfolding and stores them as the factor matrix \(U_k\). Then it computes a ttm of the original tensor with the (transposed) factor matrices to yield the core with shape \((R_1, R_2, \ldots, R_d)\). The core and factor matrices are used to form the ttensor. The values of \(R_k\) that lead to a good approximation can be computed automatically to yield a specified error tolerance; this is recommended and is the default in our code. The ST-HOSVD is an improvement on the HOSVD that does a ttm in each mode before moving on to the next mode. This has the advantage of shrinking the tensor at each step and reducing subsequent computations. ST-HOSVD is the default in the hosvd code.
L. R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika, 31:279-311, 1966, http://dx.doi.org/10.1007/BF02289464
L. D. Lathauwer, B. D. Moor and J. Vandewalle, A multilinear singular value decomposition, SIAM J. Matrix Analysis and Applications, 21(4):1253-1278, 2000, http://dx.doi.org/10.1137/S0895479896305696
N. Vannieuwenhoven, R. Vandebril and K. Meerbergen, A New Truncation Strategy for the Higher-Order Singular Value Decomposition, SIAM J. Scientific Computing, 34(2):A1027-A1052, 2012, http://dx.doi.org/10.1137/110836067
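To make the steps concrete, here is a minimal numpy sketch of the classical (non-sequential) HOSVD described above. This is not pyttb's implementation, and hosvd_sketch is a hypothetical helper introduced only for illustration; it performs the per-mode truncated SVD followed by a ttm in every mode.
import numpy as np

def hosvd_sketch(X, ranks):
    """Illustrative HOSVD sketch (not pyttb): per-mode truncated SVD, then form the core."""
    d = X.ndim
    factors = []
    for k in range(d):
        # Mode-k unfolding: mode k becomes the rows.
        Xk = np.moveaxis(X, k, 0).reshape(X.shape[k], -1)
        # Keep the R_k leading left singular vectors as the factor matrix U_k.
        Uk, _, _ = np.linalg.svd(Xk, full_matrices=False)
        factors.append(Uk[:, : ranks[k]])
    # Core: multiply X by U_k^T in every mode (the "ttm" step).
    core = X
    for k in range(d):
        core = np.moveaxis(np.tensordot(factors[k].T, np.moveaxis(core, k, 0), axes=1), 0, k)
    return core, factors

# Example: a (10, 20, 30) array compressed to a (2, 3, 4) core.
core, factors = hosvd_sketch(np.random.rand(10, 20, 30), ranks=[2, 3, 4])
print(core.shape, [U.shape for U in factors])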
import pyttb as ttb
import numpy as np
eps_machine = np.finfo(float).eps # gets machine epsilon for floats, will be used later
Simple example of usage
# Create a noisy data tensor with shape (10, 20, 30) built from a core with shape (2, 3, 4)
def generate_sample_tensor():
np.random.seed(0)
noise = 0.01
core = ttb.tensor(np.random.rand(2, 3, 4), shape=(2, 3, 4)) # The core tensor.
U = [
np.random.rand(10, 2),
np.random.rand(20, 3),
np.random.rand(30, 4),
] # The factor matrices.
Soln = ttb.ttensor(core, U) # Create the solution ttensor.
Z = Soln.full()
Rdm = ttb.tenrand((10, 20, 30))
Data = Z + noise * Z.norm() * Rdm / Rdm.norm()
return Soln, Data
S, X = generate_sample_tensor()
# Compute HOSVD with desired relative error = 0.01
T = ttb.hosvd(input_tensor=X, tol=0.01)
# Check shape of core
coreshape = T.core.shape
print(f"Shape of core: {coreshape}")
# Check relative error
relerr = (X - T.double()).norm() / X.norm()
print(f"Relative error: {relerr}")
WARNING:root:Selected no copy, but input data isn't F ordered so must copy.
Computing HOSVD...
Shape of core: (2, 3, 4)
||X-T||/||X|| = 0.00508066 <= 0.010000 (tol)
Shape of core: (2, 3, 4)
Relative error: 0.005080663859987105
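The result T is itself a ttensor, so it can be handled like the solution object built above. As an illustrative follow-up check (the exact value depends on the random data), we can compare the recovered decomposition against the noise-free solution S; since the data carried about 1% noise, the two reconstructions should agree to roughly that level.
# Reconstruct both ttensors densely and compare them.
recovery_err = (S.full() - T.full()).norm() / S.full().norm()
print(f"||S - T|| / ||S|| = {recovery_err}")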
Generate a core with different accuracies for different shapes
We will create a core tensor that is nearly block diagonal. The blocks are exponentially decreasing in norm, with the idea that we can pick off one block at a time as we increase the prescribed accuracy of the HOSVD. To do this, we define and use a function tenrandblk().
def tenrandblk(verbose=True):
np.random.seed(0)
# Block shapes (need not be cubic). Number of rows is the number
# of levels and number of columns is the order of the tensor.
bsz = np.array([[3, 2, 1], [2, 2, 2], [2, 3, 4]])
# Squared norm of each block. Must be length L and sum to <= 1
bns = np.array([0.9, 0.09, 0.009])
# Find the end of each block
bend = np.cumsum(bsz, 0)
# Extract shape: D = # dimensions, L = # levels
    L, D = np.shape(bsz)
# Final shape
gsz = bend[-1, :]
## Create tensor
# Figure out norm of off-block-diagonal
dltnrmsqr = 1 - np.sum(bns)
# Create pattern for off-block-diagonal to be modified as we go
dltpattern = np.ones(tuple(gsz))
# Create tensor to fill in
G = ttb.tenzeros(tuple(gsz))
# Create random entries to use
A = np.sign(np.random.randn(*gsz))
B = 0.1 * np.random.rand(*gsz) + 0.9
Grnd = ttb.tensor(
np.sign(np.random.randn(*gsz)) * (0.1 * np.random.rand(*gsz) + 0.9)
)
# Loop through and create blocks
for i in range(L):
# Figure out ith block pattern
blkrange = []
for k in range(D):
if i == 0:
blkrange.append(np.arange(bend[i, k]))
else:
blkrange.append(np.arange(bend[i - 1, k], bend[i, k]))
# Create pattern that has ones for the block
pattern = np.zeros(tuple(gsz))
ix = np.ix_(blkrange[2], blkrange[0], blkrange[1])
pattern[ix] = 1
# Zero out block in the off-diagonal pattern
dltpattern[ix] = 0
# Randomly fill delta-pattern and rescale
block = Grnd * pattern
sse = (block**2).collapse()
block *= np.sqrt(bns[i] / sse)
# Add to main tensor
G += block
# Verbose output
if verbose:
print(
f"Created block with shape {tuple(bsz[i,:])} with norm ({block.norm()})^2 = {block.norm()**2}"
)
if dltnrmsqr > 0:
# final pattern
block = Grnd * dltpattern
sse = (block**2).collapse()
block *= np.sqrt(dltnrmsqr / sse)
G += block
if verbose:
print(
f"Created tensor with shape {tuple(gsz)} with off-block-diagonal norm ({block.norm()})^2 = {block.norm()**2}"
)
return G, bsz, bns
# Create core tensor with given block structure and norm 1
G, bsz, bns = tenrandblk()
Created block with shape (np.int64(3), np.int64(2), np.int64(1)) with norm (0.9486832980505139)^2 = 0.9000000000000001
Created block with shape (np.int64(2), np.int64(2), np.int64(2)) with norm (0.3)^2 = 0.09
Created block with shape (np.int64(2), np.int64(3), np.int64(4)) with norm (0.09486832980505137)^2 = 0.009
Created tensor with shape (np.int64(7), np.int64(7), np.int64(7)) with off-block-diagonal norm (0.031622776601683805)^2 = 0.0010000000000000007
print(f"Shape of G: {G.shape}")
Shape of G: (7, 7, 7)
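Since the squared block norms (0.9, 0.09, 0.009) and the off-block-diagonal mass (0.001) sum to one, G should have unit norm. A quick check (not part of the original construction):
# ||G||^2 should be approximately 1 by construction.
print(f"||G||^2 = {G.norm()**2}")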
Generate data tensor with core described above
We take the core G and embed it into a larger tensor X by using orthogonal transformations. The true rank of this tensor is equal to the shape of G.
from scipy.stats import ortho_group
def mat_rand_orth(N, seed=0):
    # Generates a random N x N orthogonal real matrix.
    return ortho_group.rvs(N, random_state=seed)
# shape of X
xsz = np.array([20, 20, 20])
# Create orthogonal matrices
U = []
for k in np.arange(3):
V = mat_rand_orth(xsz[k])[:, : G.shape[k]]
U.append(V)
# Create X
X = ttb.ttensor(G, U).full()
# The norm should be unchanged
print(f"||X|| = {X.norm()}")
||X|| = 1.0
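The embedding preserves the norm because each factor matrix has orthonormal columns. A small sanity check of that property (not part of the original construction):
# Each U[k] has orthonormal columns, so U[k].T @ U[k] should be the identity.
for k, Uk in enumerate(U):
    err = np.linalg.norm(Uk.T @ Uk - np.eye(Uk.shape[1]))
    print(f"mode {k}: ||U_k^T U_k - I|| = {err:.2e}")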
Compute (full) HOSVD
We compute the ST-HOSVD using the hosvd method. We specify a tolerance close to machine precision (here, a small multiple of the square root of machine epsilon). Ideally, it finds a core that is the same shape as G.
print("ST-HOSVD...")
T = ttb.hosvd(X, 2 * np.sqrt(eps_machine))
ST-HOSVD...
Computing HOSVD...
Shape of core: (7, 7, 7)
||X-T||/||X|| = 8.61086e-15 <= 0.000000 (tol)
Compute low-rank HOSVD approximation
The norm squared of the first two blocks of G is 0.99, so keeping just those blocks leaves a squared relative error of 1e-2; specifying a tolerance of np.sqrt(1e-2) should yield a core with shape \((4, 4, 3)\) (mileage may vary). However, the conservative nature of the algorithm means that it may pick something larger. We can compensate by specifying a larger tolerance.
# Using 1e-2 exactly is potentially too conservative...
print("Result with tol = sqrt(1e-2):")
T = ttb.hosvd(X, np.sqrt(1e-2), verbosity=1)
# But a small multiple (i.e., |ndims(X)|) usually works...
print("\nResult with tol = sqrt(3e-2):")
T = ttb.hosvd(X, np.sqrt(3e-2), verbosity=1)
Result with tol = sqrt(1e-2):
Computing HOSVD...
Shape of core: (5, 5, 5)
||X-T||/||X|| = 0.0753888 <= 0.100000 (tol)
Result with tol = sqrt(3e-2):
Computing HOSVD...
Shape of core: (3, 4, 4)
||X-T||/||X|| = 0.0987914 <= 0.173205 (tol)
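Why does a small multiple of the target error help? As the verbose output in the Verbosity section below shows, hosvd splits the squared error budget evenly over the modes: each mode may discard eigenvalue mass up to tol^2 ||X||^2 / d. With tol = np.sqrt(3e-2) and d = 3 that per-mode budget is exactly 1e-2, enough to drop the smallest block (squared norm 0.009), whereas tol = np.sqrt(1e-2) only allows about 0.0033 per mode. A quick computation of the budgets for the two runs above:
# Per-mode truncation budget used by hosvd (see the verbose output below):
# each mode may discard eigenvalue mass up to tol^2 * ||X||^2 / d.
d = X.ndims
for tol in (np.sqrt(1e-2), np.sqrt(3e-2)):
    budget = tol**2 * X.norm() ** 2 / d
    print(f"tol = {tol:.4f}: per-mode eigenvalue budget = {budget:.4f}")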
Similarly, the norm squared of the first block of G is 0.9, so keeping just that block leaves a squared relative error of 1e-1; specifying a tolerance of np.sqrt(1e-1) should result in a core with shape \((3, 2, 1)\).
# Using 1e-1 exactly is potentially too conservative...
print("Result with tol = sqrt(1e-1):")
T = ttb.hosvd(X, np.sqrt(1e-1), verbosity=1)
# But a small multiple (i.e., |ndims(X)|) usually works...
print("\nResult with tol = sqrt(3e-1):")
T = ttb.hosvd(X, np.sqrt(3e-1), verbosity=1)
Result with tol = sqrt(1e-1):
Computing HOSVD...
Shape of core: (2, 3, 3)
||X-T||/||X|| = 0.202337 <= 0.316228 (tol)
Result with tol = sqrt(3e-1):
Computing HOSVD...
Shape of core: (1, 2, 2)
||X-T||/||X|| = 0.316125 <= 0.547723 (tol)
Verbosity - Getting more or less information.
Setting the verbosity to zero suppresses all output. Cranking up the verbosity gives some insight into the decision-making process…
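For the quiet end of the scale, a minimal example of fully suppressed output (the core shape should match the earlier run at this tolerance):
# verbosity=0 suppresses the progress printout entirely; only the ttensor is returned.
T = ttb.hosvd(X, tol=np.sqrt(3e-1), verbosity=0)
print(f"Shape of core: {T.core.shape}")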
Example 1
T = ttb.hosvd(X, tol=np.sqrt(3e-1), verbosity=10)
Computing HOSVD...
||X||^2 = 1
tol = 0.547723
eigenvalue sum threshold = tol^2 ||X||^2 / d = 0.1
Reverse cumulative sum of evals of Gram matrix:
0: 1.0000 <-- Cutoff
1: 0.0999
2: 0.0322
3: 0.0095
4: 0.0055
5: 0.0023
6: 0.0004
7: -0.0000
8: -0.0000
9: -0.0000
10: -0.0000
11: -0.0000
12: -0.0000
13: -0.0000
14: -0.0000
15: -0.0000
16: -0.0000
17: -0.0000
18: -0.0000
19: -0.0000
Reverse cumulative sum of evals of Gram matrix:
0: 0.9001
1: 0.3140 <-- Cutoff
2: 0.0001
3: 0.0000
4: 0.0000
5: 0.0000
6: 0.0000
7: 0.0000
8: 0.0000
9: -0.0000
10: -0.0000
11: -0.0000
12: -0.0000
13: -0.0000
14: -0.0000
15: -0.0000
16: -0.0000
17: -0.0000
18: -0.0000
19: -0.0000
Reverse cumulative sum of evals of Gram matrix:
0: 0.9001
1: 0.3139 <-- Cutoff
2: -0.0000
3: -0.0000
4: -0.0000
5: -0.0000
6: -0.0000
7: -0.0000
8: -0.0000
9: -0.0000
10: -0.0000
11: -0.0000
12: -0.0000
13: -0.0000
14: -0.0000
15: -0.0000
16: -0.0000
17: -0.0000
18: -0.0000
19: -0.0000
Shape of core: (1, 2, 2)
||X-T||/||X|| = 0.316125 <= 0.547723 (tol)
Example 2
T = ttb.hosvd(X, tol=3 * np.sqrt(eps_machine), verbosity=10)
Computing HOSVD...
||X||^2 = 1
tol = 4.47035e-08
eigenvalue sum threshold = tol^2 ||X||^2 / d = 6.66134e-16
Reverse cumulative sum of evals of Gram matrix:
0: 1.0000
1: 0.0999
2: 0.0322
3: 0.0095
4: 0.0055
5: 0.0023
6: 0.0004 <-- Cutoff
7: -0.0000
8: -0.0000
9: -0.0000
10: -0.0000
11: -0.0000
12: -0.0000
13: -0.0000
14: -0.0000
15: -0.0000
16: -0.0000
17: -0.0000
18: -0.0000
19: -0.0000
Reverse cumulative sum of evals of Gram matrix:
0: 1.0000
1: 0.4137
2: 0.0996
3: 0.0317
4: 0.0093
5: 0.0040
6: 0.0001 <-- Cutoff
7: -0.0000
8: -0.0000
9: -0.0000
10: -0.0000
11: -0.0000
12: -0.0000
13: -0.0000
14: -0.0000
15: -0.0000
16: -0.0000
17: -0.0000
18: -0.0000
19: -0.0000
Reverse cumulative sum of evals of Gram matrix:
0: 1.0000
1: 0.4137
2: 0.0997
3: 0.0322
4: 0.0094
5: 0.0052
6: 0.0021 <-- Cutoff
7: 0.0000
8: -0.0000
9: -0.0000
10: -0.0000
11: -0.0000
12: -0.0000
13: -0.0000
14: -0.0000
15: -0.0000
16: -0.0000
17: -0.0000
18: -0.0000
19: -0.0000
Shape of core: (7, 7, 7)
||X-T||/||X|| = 8.61086e-15 <= 0.000000 (tol)
Specify the ranks
If you know the rank you want, you can specify it. But there’s no guarantee that it will satisfy the specified tolerance. In such cases, the method will throw a warning.
# Rank is okay
T = ttb.hosvd(X, tol=np.sqrt(3e-1), ranks=bsz[0, :].tolist())
# Rank is too small for the specified error
T = ttb.hosvd(X, tol=np.sqrt(3e-1), ranks=[1, 1, 1])
# But you can set the error to the tensor norm to make the warning go away
T = ttb.hosvd(X, tol=X.norm(), ranks=[1, 1, 1])
Computing HOSVD...
Shape of core: (4, 3, 2)
||X-T||/||X|| = 0.316018 <= 0.547723 (tol)
Computing HOSVD...
Shape of core: (2, 2, 2)
||X-T||/||X|| = 0.316111 <= 0.547723 (tol)
Computing HOSVD...
Shape of core: (2, 2, 2)
||X-T||/||X|| = 0.316111 <= 1.000000 (tol)
Specify the mode order
It’s also possible to specify the order of the modes. The default is np.arange(X.ndims).
reverse_dimorder = np.arange(X.ndims).tolist()[::-1]
T = ttb.hosvd(X, tol=np.sqrt(3e-1), dimorder=reverse_dimorder)
Computing HOSVD...
Shape of core: (1, 2, 2)
||X-T||/||X|| = 0.316129 <= 0.547723 (tol)
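Because ST-HOSVD truncates as it goes, the mode order can affect the intermediate tensor sizes and, occasionally, the selected ranks. A quick side-by-side of the default and reversed orders (illustrative only; the outcome depends on the data):
# Compare the default mode order with the reversed order used above.
for order in (np.arange(X.ndims).tolist(), reverse_dimorder):
    T = ttb.hosvd(X, tol=np.sqrt(3e-1), dimorder=order, verbosity=0)
    print(f"dimorder = {order}: core shape = {T.core.shape}")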
Generate bigger data tensor with core described above
Uses the same procedure as before, but now the shape is bigger.
# shape of X
xsz = np.array([100, 100, 100])
# Create orthogonal matrices
U = []
for k in np.arange(3):
V = mat_rand_orth(xsz[k])[:, : G.shape[k]]
U.append(V)
# Create X
Y = ttb.ttensor(G, U).full()
# The norm should be unchanged
print(f"||Y|| = {Y.norm()}")
||Y|| = 0.9999999999999998
ST-HOSVD compared to HOSVD
The answers are essentially the same for the sequentially-truncated HOSVD and the HOSVD…
print("ST-HOSVD...")
T = ttb.hosvd(Y, tol=2 * np.sqrt(eps_machine))
print("\nHOSVD...")
T = ttb.hosvd(Y, tol=2 * np.sqrt(eps_machine), sequential=False)
ST-HOSVD...
Computing HOSVD...
Shape of core: (7, 7, 7)
||X-T||/||X|| = 5.49618e-15 <= 0.000000 (tol)
HOSVD...
Computing HOSVD...
Shape of core: (7, 7, 7)
||X-T||/||X|| = 5.62518e-15 <= 0.000000 (tol)
But ST-HOSVD may be slightly faster than HOSVD for larger tensors.
print("Time for 5 runs of ST-HOSVD:")
%timeit -n5 T = ttb.hosvd(Y, tol=2 * np.sqrt(eps_machine), verbosity=0)
Time for 5 runs of ST-HOSVD:
114 ms ± 24.5 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)
print("\nTime for 5 runs of HOSVD:")
%timeit -n5 T = ttb.hosvd(Y, tol=2 * np.sqrt(eps_machine), sequential=False, verbosity=0)
Time for 5 runs of HOSVD:
187 ms ± 18.9 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)
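A rough way to see where the savings come from: as noted at the top, ST-HOSVD shrinks the tensor after each mode, so every later SVD works on a smaller unfolding. The sketch below is illustrative bookkeeping only, assuming the (7, 7, 7) core found above and the default mode order; it prints the tensor shape each variant factors at mode k.
# HOSVD always unfolds the full tensor, while ST-HOSVD unfolds a tensor
# whose already-processed modes have been truncated.
shape = list(Y.shape)
ranks = list(T.core.shape)  # (7, 7, 7) from the runs above
for k in range(len(shape)):
    st_shape = tuple(ranks[:k] + shape[k:])
    print(f"mode {k}: HOSVD works on {tuple(shape)}, ST-HOSVD works on {st_shape}")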