Singular Value Decomposition
We redesign the API of the SVD function of SciPy with wish.
Context¶
The SciPy library implements singular value decomposition (SVD), a matrix factorization that is instrumental in numerical analysis, machine learning and data science.
The SVD of a matrix A provides three matrices U, Sigma
and V.
The matrix S has the same shape as A and is diagonal
(S[i,j] == 0 unless i == j). The matrices U and V are unitary.
Refer for example to SVD on Wikipedia
for more information.
In any case, to compute A from U, S and V, do:
>>> from numpy import dot
>>> Vh = V.conjugate().transpose()
>>> A = dot(U, dot(S, Vh))
Basic API¶
The svd function of SciPy works like this: you provide the matrix A
and it computes a matrix U, a vector s and a matrix Vh1:
>>> from scipy.linalg import svd
>>> U, s, Vh = svd(A)
The default convention deviates a little from the definition of the SVD:
the vector s is the diagonal of S instead of S itself
and Vh corresponds to the conjugate transpose of V instead of V.
While there are good reasons for these conventions,
I'd rather follow the principle of least astonishment.
Thus, by default our SVD
implementation returns the triple (U, S, V):
>>> from wish.examples import svd
>>> U, S, V = svd(A)
Say that our starting point is
def svd(A):
U, S, V = ...
return U, S, V
To enable the simplest version of selectable return values, we change the code to:
import wish
def svd(A, returns="U, S, V"):
U, S, V = ...
return wish.grant(returns)
If you want to check the names that the user requires, do instead
def svd(A, returns="U, S, V"):
wishlist = wish.make(returns)
for name in wishlist:
if name not in ["U", "S", "V"]:
error = "{0!r} is not a valid return value"
raise NameError(error.format(name))
U, S, V = ...
return wishlist.grant()
Now, we believe that it should be possible to get the same output of SciPy. Tt's rather simple since we have enabled selectable return values:
>>> U, s, Vh = svd(A, returns="U, s, Vh")
The corresponding implementation is:
import numpy
import wish
def svd(A, returns="U, S, V"):
wishlist = wish.make(returns)
for name in wishlist:
if name not in ["U", "S", "V", "s", "Vh"]:
error = "{0!r} is not a valid return value"
raise NameError(error.format(name))
U, S, V = ...
if "s" in wishlist:
s = numpy.diagonal(S)
if "Vh" in wishlist:
Vh = V.conjugate().transpose()
return wishlist.grant()
Singular Values¶
We frequently need to compute only the singular values of a matrix.
The scipy svd achieves this with an additional option compute_uv
that can be set to False as describe in the documentation:
compute_uv: bool, optional.
Whether to compute also U and Vh in addition to s.
Default is True.
It is used like this:
>>> from scipy.linalg import svd
>>> s = svd(M, compute_uv=False)
This interface is not totally satisfactory:
while we easily understand that we won't
compute -- and therefore won't return -- U and Vh,
it is not totally obvious what is returned.
And if in the first place you were only interested
in the s array, you may not know what U and Vh are ...
There is nothing here that cannot be solved by reading
the documentation of svd, but still this is not optimal.
The SciPy folks are apparently aware of this because they have implemented
a helper function, svdvals, to avoid calling svd in this case:
svdvals(a, overwrite_a=False, check_finite=True)
Compute singular values of a matrix.
With svdvals, we can simply do
>>> from scipy.linalg import svdvals
>>> s = svdvals(A)
but of course now, you have another function to remember of.
And while it is a one-liner, its full documentation essentially
duplicates the one of svd.
With the wish version, the equivalent function call is simply
>>> from wish.examples import svd
>>> s = svd(A, returns="s")
We know what is returned and a no new function is necessary.
Now, if we have a simpler algorithm to provide only S,
we may check the list of wishes and proceed accordingly:
import numpy
import wish
def svd(A, returns="U, S, V"):
wishlist = wish.make(returns)
for name in wishlist:
if name not in ["U", "S", "V", "s", "Vh"]:
error = "{0!r} is not a valid return value"
raise NameError(error.format(name))
if "U" in wishlist or "V" in wishlist or "Vh" in wishlist:
U, S, V = ...
else:
S = ...
if "s" in wishlist:
s = numpy.diagonal(S)
if "Vh" in wishlist:
Vh = V.conjugate().transpose()
return wishlist.grant()
Reconstruction¶
Since SciPy svd returns the diagonal s of S instead of S itself,
it comes with the helper function diagsvd to build
S from the singular values:
diagsvd(s, M, N)
Construct the sigma matrix in SVD from singular values and size M, N.
It is used like this:
>>> U, s, Vh = svd(A)
>>> S = diagsvd(s, shape(U)[1], shape(Vh)[0])
The matrix S is typically used to reconstruct A:
>>> from numpy import dot
>>> A = dot(U, dot(S, Vh))
Of course with the wish-enabled version, the same effect is obtained with
>>> from wish.examples import svd
>>> U, S, Vh = wsvd(A, returns="U, S, Vh")
>>> A = dot(U, dot(S, Vh))
Now, if you don't want to reconstruct the matrix A but a version of it
based on a different set of singular values -- for example to achieve
some dimensionality reduction
-- this is easy because NumPy already provides
the appropriate functions. First, get the singular values with
>>> U, S, Vh, s = w_svd(A, returns="U, S, Vh, s")
then replace the vector of singular values s
with a modified vector s2 and inject the result in S
>>> s2 = ...
>>> from numpy import fill_diagonal
>>> fill_diagonal(S, s)
>>> A2 = dot(U, dot(S, Vh))
The New API¶
The redesigned svd function now has arguably a simpler and more consistent
interface:
-
The simplest and most common usage follows closely the domain conventions,
-
The "configurable return values" pattern is supported with the extra parameter
returns -
The ad hoc parameter
compute_uvis not needed anymore. -
Two helper functions
svdvalsanddiagsvdare not needed anymore
The description of parameters and returned values is only slightly different2:
Parameters
-
A:(M, N)array_likeMatrix to decompose.
-
...
-
returns:string, optionalSelect the returned values, among
U,S,s,VandVh. Default is"U, S, V".
Returns
The selection of return values is configurable with the returns parameter.
-
U:ndarrayUnitary matrix having left singular vectors as columns.
-
S:ndarrayA matrix with the singular values of
A, sorted in non-increasing order, in the main diagonal and zeros elsewhere. -
V:ndarrayUnitary matrix having right singular vectors as columns.
-
s:ndarray, not returned by defaultThe singular values, sorted in non-increasing order.
-
Vh:ndarray, not returned by defaultConjugate transpose of
V
-
For example, if
>>> A = [[ 0.0, -2.0, 0.0], ... [ 1.0, 0.0, 0.0]]then
>>> U array([[1., 0.], [ 0., 1.]]) >>> s array([ 2., 1.]) >>> Vh array([[ 0., -1., 0.], [ 1., 0., 0.], [ 0., 0., 1.]]) -
we have omitted any reference to the the parameters
full_matrices,overwrite_a,check_finiteandlapack_driversince the behavior ofsvdhas not changed in this respect. Refer to the Scipy documentation for details. ↩