Singular Value Decomposition

We redesign the API of the SVD function of SciPy with wish.

Warning

This example requires having NumPy and SciPy installed.

Context

The SciPy library implements singular value decomposition (SVD), a matrix factorization that is instrumental in numerical analysis, machine learning and data science.

The SVD of a matrix A provides three matrices U, Sigma and V. The matrix S has the same shape as A and is diagonal (S[i,j] == 0 unless i == j). The matrices U and V are unitary. Refer for example to SVD on Wikipedia for more information.

In any case, to compute A from U, S and V, do:

>>> from numpy import dot
>>> Vh = V.conjugate().transpose()
>>> A = dot(U, dot(S, Vh))

Basic API

The svd function of SciPy works like this: you provide the matrix A and it computes a matrix U, a vector s and a matrix Vh1:

>>> from scipy.linalg import svd
>>> U, s, Vh = svd(A)

The default convention deviates a little from the definition of the SVD: the vector s is the diagonal of S instead of S itself and Vh corresponds to the conjugate transpose of V instead of V.

While there are good reasons for these conventions, I'd rather follow the principle of least astonishment. Thus, by default our SVD implementation returns the triple (U, S, V):

>>> from wish.examples import svd
>>> U, S, V = svd(A)

Say that our starting point is

def svd(A):
    U, S, V = ...
    return U, S, V

To enable the simplest version of selectable return values, we change the code to:

import wish

def svd(A, returns="U, S, V"):
    U, S, V = ... 
    return wish.grant(returns)

If you want to check the names that the user requires, do instead

def svd(A, returns="U, S, V"):
    wishlist = wish.make(returns)
    for name in wishlist:
        if name not in ["U", "S", "V"]:
             error = "{0!r} is not a valid return value"
             raise NameError(error.format(name))
    U, S, V = ... 
    return wishlist.grant()

Now, we believe that it should be possible to get the same output of SciPy. Tt's rather simple since we have enabled selectable return values:

>>> U, s, Vh = svd(A, returns="U, s, Vh")

The corresponding implementation is:

import numpy
import wish

def svd(A, returns="U, S, V"):
    wishlist = wish.make(returns)
    for name in wishlist:
        if name not in ["U", "S", "V", "s", "Vh"]:
             error = "{0!r} is not a valid return value"
             raise NameError(error.format(name))
    U, S, V = ... 
    if "s" in wishlist:
        s = numpy.diagonal(S)
    if "Vh" in wishlist:
        Vh = V.conjugate().transpose()
    return wishlist.grant()

Singular Values

We frequently need to compute only the singular values of a matrix. The scipy svd achieves this with an additional option compute_uv that can be set to False as describe in the documentation:

compute_uv: bool, optional.

Whether to compute also U and Vh in addition to s. Default is True.

It is used like this:

>>> from scipy.linalg import svd
>>> s = svd(M, compute_uv=False)

This interface is not totally satisfactory: while we easily understand that we won't compute -- and therefore won't return -- U and Vh, it is not totally obvious what is returned. And if in the first place you were only interested in the s array, you may not know what U and Vh are ... There is nothing here that cannot be solved by reading the documentation of svd, but still this is not optimal. The SciPy folks are apparently aware of this because they have implemented a helper function, svdvals, to avoid calling svd in this case:

svdvals(a, overwrite_a=False, check_finite=True)

Compute singular values of a matrix.

With svdvals, we can simply do

>>> from scipy.linalg import svdvals
>>> s = svdvals(A)

but of course now, you have another function to remember of. And while it is a one-liner, its full documentation essentially duplicates the one of svd.

With the wish version, the equivalent function call is simply

>>> from wish.examples import svd
>>> s = svd(A, returns="s")

We know what is returned and a no new function is necessary.

Now, if we have a simpler algorithm to provide only S, we may check the list of wishes and proceed accordingly:

import numpy
import wish

def svd(A, returns="U, S, V"):
    wishlist = wish.make(returns)
    for name in wishlist:
        if name not in ["U", "S", "V", "s", "Vh"]:
             error = "{0!r} is not a valid return value"
             raise NameError(error.format(name))

    if "U" in wishlist or "V" in wishlist or "Vh" in wishlist:
        U, S, V = ... 
    else:
        S = ...

    if "s" in wishlist:
        s = numpy.diagonal(S)
    if "Vh" in wishlist:
        Vh = V.conjugate().transpose()

    return wishlist.grant()

Reconstruction

Since SciPy svd returns the diagonal s of S instead of S itself, it comes with the helper function diagsvd to build S from the singular values:

diagsvd(s, M, N)

Construct the sigma matrix in SVD from singular values and size M, N.

It is used like this:

>>> U, s, Vh = svd(A)
>>> S = diagsvd(s, shape(U)[1], shape(Vh)[0])

The matrix S is typically used to reconstruct A:

>>> from numpy import dot
>>> A = dot(U, dot(S, Vh))

Of course with the wish-enabled version, the same effect is obtained with

>>> from wish.examples import svd
>>> U, S, Vh = wsvd(A, returns="U, S, Vh")
>>> A = dot(U, dot(S, Vh))

Now, if you don't want to reconstruct the matrix A but a version of it based on a different set of singular values -- for example to achieve some dimensionality reduction -- this is easy because NumPy already provides the appropriate functions. First, get the singular values with

>>> U, S, Vh, s = w_svd(A, returns="U, S, Vh, s")

then replace the vector of singular values s with a modified vector s2 and inject the result in S

>>> s2 = ...
>>> from numpy import fill_diagonal
>>> fill_diagonal(S, s)
>>> A2 = dot(U, dot(S, Vh))

The New API

The redesigned svd function now has arguably a simpler and more consistent interface:

  • The simplest and most common usage follows closely the domain conventions,

  • The "configurable return values" pattern is supported with the extra parameter returns

  • The ad hoc parameter compute_uv is not needed anymore.

  • Two helper functions svdvals and diagsvd are not needed anymore

The description of parameters and returned values is only slightly different2:

Parameters

  • A: (M, N) array_like

    Matrix to decompose.

  • ...

  • returns: string, optional

    Select the returned values, among U, S, s, V and Vh. Default is "U, S, V".

Returns

The selection of return values is configurable with the returns parameter.

  • U: ndarray

    Unitary matrix having left singular vectors as columns.

  • S : ndarray

    A matrix with the singular values of A, sorted in non-increasing order, in the main diagonal and zeros elsewhere.

  • V: ndarray

    Unitary matrix having right singular vectors as columns.

  • s: ndarray, not returned by default

    The singular values, sorted in non-increasing order.

  • Vh: ndarray, not returned by default

    Conjugate transpose of V


  1. For example, if

      >>> A = [[ 0.0, -2.0,  0.0],
      ...      [ 1.0,  0.0,  0.0]]
    

    then

      >>> U
      array([[1.,  0.],
            [ 0.,  1.]])
      >>> s 
      array([ 2.,  1.])
      >>> Vh 
      array([[ 0., -1.,  0.],
             [ 1.,  0.,  0.],
             [ 0.,  0.,  1.]])
    

  2. we have omitted any reference to the the parameters full_matrices, overwrite_a, check_finite and lapack_driver since the behavior of svd has not changed in this respect. Refer to the Scipy documentation for details.