# Dual norm


In functional analysis, the dual norm is a measure of the "size" of continuous linear functionals defined on a normed space.

## Definition

Let ${\displaystyle X}$ and ${\displaystyle Y}$ be topological vector spaces, and ${\displaystyle L(X,Y)}$[1] be the collection of all bounded linear mappings (or operators) of ${\displaystyle X}$ into ${\displaystyle Y}$. In the case where ${\displaystyle X}$ and ${\displaystyle Y}$ are normed vector spaces, ${\displaystyle L(X,Y)}$ can be normed in a natural way.

When ${\displaystyle Y}$ is a scalar field (i.e. ${\displaystyle Y={\mathbb {C} }}$ or ${\displaystyle Y={\mathbb {R} }}$) so that ${\displaystyle L(X,Y)}$ is the dual space ${\displaystyle X^{*}}$ of ${\displaystyle X}$, the norm on ${\displaystyle L(X,Y)}$ defines a topology on ${\displaystyle X^{*}}$ which turns out to be stronger than its weak*-topology.

Theorem 1: Let ${\displaystyle X}$ and ${\displaystyle Y}$ be normed spaces, and associate to each ${\displaystyle f\in L(X,Y)}$ the number:

${\displaystyle \left\|f\right\|=\sup\{\left\|f(x)\right\|:x\in X,\left\|x\right\|\leq 1\}.}$

This definition of ${\displaystyle \|f\|}$ makes ${\displaystyle L(X,Y)}$ into a normed space. If ${\displaystyle Y}$ is a Banach space, so is ${\displaystyle L(X,Y)}$.[2]

Proof:

1. A subset of a normed space is bounded if and only if it lies in some multiple of the unit ball; thus ${\displaystyle \lVert f\rVert <\infty }$ for every ${\displaystyle f\in L(X,Y)}$.
If ${\displaystyle \alpha }$ is a scalar, then ${\displaystyle (\alpha f)(x)=\alpha \cdot fx}$, so that
${\displaystyle \|\alpha f\|=|\alpha |\|f\|}$
The triangle inequality in ${\displaystyle Y}$ shows that
${\displaystyle {\begin{aligned}\|(f_{1}+f_{2})x\|&=\|f_{1}x+f_{2}x\|\leq \|f_{1}x\|+\|f_{2}x\|\\&\leq (\|f_{1}\|+\|f_{2}\|)\|x\|\leq \|f_{1}\|+\|f_{2}\|\end{aligned}}}$
for every ${\displaystyle x\in X}$ with ${\displaystyle \|x\|\leq 1}$. Thus
${\displaystyle \|f_{1}+f_{2}\|\leq \|f_{1}\|+\|f_{2}\|}$
If ${\displaystyle f\neq 0}$, then ${\displaystyle fx\neq 0}$ for some ${\displaystyle x\in X}$; hence ${\displaystyle \|f\|>0}$. Thus, ${\displaystyle L(X,Y)}$ is a normed space.[3]
2. Assume now that ${\displaystyle Y}$ is complete, and that ${\displaystyle \{f_{n}\}}$ is a Cauchy sequence in ${\displaystyle L(X,Y)}$.
Since
${\displaystyle \|f_{n}x-f_{m}x\|\leq \|f_{n}-f_{m}\|\|x\|}$
and it is assumed that ${\displaystyle \|f_{n}-f_{m}\|\to 0}$ as n and m tend to ${\displaystyle \infty }$, ${\displaystyle \{f_{n}x\}}$ is a Cauchy sequence in ${\displaystyle Y}$ for every ${\displaystyle x\in X}$.
Hence
${\displaystyle fx=\lim _{n\to \infty }f_{n}x}$
exists. It is clear that ${\displaystyle f:X\to Y}$ is linear. If ${\displaystyle \varepsilon >0}$, then ${\displaystyle \|f_{n}-f_{m}\|\|x\|\leq \varepsilon \|x\|}$ for sufficiently large n and m. Letting n tend to infinity, it follows that
${\displaystyle \|fx-f_{m}x\|\leq \varepsilon \|x\|}$
for sufficiently large m.
Hence ${\displaystyle \|fx\|\leq (\|f_{m}\|+\varepsilon )\|x\|}$, so that ${\displaystyle f\in L(X,Y)}$ and ${\displaystyle \|f-f_{m}\|\leq \varepsilon }$.
Thus ${\displaystyle f_{m}\to f}$ in the norm of ${\displaystyle L(X,Y)}$. This establishes the completeness of ${\displaystyle L(X,Y)}$.[4]

Theorem 2: Now suppose ${\displaystyle B}$ is the closed unit ball of a normed space ${\displaystyle X}$. Define

${\displaystyle \|x^{*}\|=\sup\{|\langle {x,x^{*}}\rangle |:x\in B\}}$

for every ${\displaystyle x^{*}\in X^{*}}$.

(a) This norm makes ${\displaystyle X^{*}}$ into a Banach space.[5]
(b) Let ${\displaystyle B^{*}}$ be the closed unit ball of ${\displaystyle X^{*}}$. For every ${\displaystyle x\in X}$,
${\displaystyle \|x\|=\sup\{|\langle {x,x^{*}}\rangle |:x^{*}\in B^{*}\}.}$
Consequently, ${\displaystyle x^{*}\to \langle {x,x^{*}}\rangle }$ is a bounded linear functional on ${\displaystyle X^{*}}$, of norm ${\displaystyle \|x\|}$.
(c) ${\displaystyle B^{*}}$ is weak*-compact.
Proof:
Since ${\displaystyle L(X,Y)=X^{*}}$ when ${\displaystyle Y}$ is the scalar field, (a) is a corollary of Theorem 1.
Fix ${\displaystyle x\in X}$. There exists[6] ${\displaystyle y^{*}\in B^{*}}$ such that
${\displaystyle \langle {x,y^{*}}\rangle =\|x\|.}$
But
${\displaystyle |\langle {x,x^{*}}\rangle |\leq \|x\|\|x^{*}\|\leq \|x\|}$
for every ${\displaystyle x^{*}\in B^{*}}$. (b) follows from the above.
Since the open unit ball ${\displaystyle U}$ of ${\displaystyle X}$ is dense in ${\displaystyle B}$, the definition of ${\displaystyle \|x^{*}\|}$ shows that ${\displaystyle x^{*}\in B^{*}}$ if and only if ${\displaystyle |\langle {x,x^{*}}\rangle |\leq 1}$ for every ${\displaystyle x\in U}$.
The proof for (c)[7] now follows directly.[8]

### The second dual of a Banach space

The normed dual ${\displaystyle X^{*}}$ of a Banach space ${\displaystyle X}$ is also a Banach space, which means it has a normed dual, ${\displaystyle X^{**}}$, of its own.

By part (b) of Theorem 2, every ${\displaystyle x\in X}$ defines a unique ${\displaystyle \phi x\in X^{**}}$ by the equation

${\displaystyle \langle {x,x^{*}}\rangle =\langle {x^{*},\phi x}\rangle \;\;\;\;\;\;\;\;(x^{*}\in X^{*});}$

and

${\displaystyle \lVert \phi x\rVert =\lVert x\rVert \;\;\;\;\;\;\;\;(x\in X).}$

It follows from the first equation that ${\displaystyle \phi :X\to X^{**}}$ is linear, and from the second that ${\displaystyle \phi }$ is an isometry. Since ${\displaystyle X}$ is assumed to be complete, ${\displaystyle \phi (X)}$ is closed in ${\displaystyle X^{**}}$.

Thus, ${\displaystyle \phi }$ is an isometric isomorphism onto a closed subspace of ${\displaystyle X^{**}}$.[9]

The members of ${\displaystyle \phi (X)}$ are exactly the linear functionals on ${\displaystyle X^{*}}$ that are continuous with respect to its weak*-topology. Since the norm topology of ${\displaystyle X^{*}}$ is stronger, it may happen that ${\displaystyle \phi (X)}$ is a proper subspace of ${\displaystyle X^{**}}$.

However, there are many important spaces, such as the Lp spaces with ${\displaystyle 1<p<\infty }$, where ${\displaystyle \phi (X)=X^{**}}$; these are called reflexive.

It is stressed that, for ${\displaystyle X}$ to be reflexive, the existence of some isometric isomorphism ${\displaystyle \phi }$ of ${\displaystyle X}$ onto ${\displaystyle X^{**}}$ is not enough; it is crucial that ${\displaystyle \phi }$ satisfies the first equation in this section.[10]

### Mathematical Optimization

Let ${\displaystyle ||\cdot ||}$ be a norm on ${\displaystyle \mathbb {R} ^{n}}$. The associated dual norm, denoted ${\displaystyle \|\cdot \|_{*}}$, is defined as

${\displaystyle ||z||_{*}=\sup\{z^{\intercal }x\;|\;||x||\leq 1\}.}$

(This can be shown to be a norm.) The dual norm can be interpreted as the operator norm of ${\displaystyle z^{\intercal }}$, interpreted as a ${\displaystyle 1\times n}$ matrix, with the norm ${\displaystyle ||\cdot ||}$ on ${\displaystyle \mathbb {R} ^{n}}$, and the absolute value on ${\displaystyle \mathbb {R} }$:

${\displaystyle ||z||_{*}=\sup\{|z^{\intercal }x|\;|\;||x||\leq 1\}.}$

From the definition of dual norm we have the inequality

${\displaystyle z^{\intercal }x=||x||(z^{\intercal }{\frac {x}{||x||}})\leq \lVert x\rVert \lVert z\rVert _{*}}$

which holds for all x and z (for x = 0 both sides are zero).[11] The dual of the dual norm is the original norm: we have ${\displaystyle \lVert x\rVert _{**}=\lVert x\rVert }$ for all x. (This need not hold in infinite-dimensional vector spaces.)

The dual of the Euclidean norm is the Euclidean norm, since

${\displaystyle \sup\{z^{\intercal }x\;|\;\lVert x\rVert _{2}\leq 1\}=\lVert z\rVert _{2}.}$

(This follows from the Cauchy–Schwarz inequality; for nonzero z, the value of x that maximises ${\displaystyle z^{\intercal }x}$ over ${\displaystyle \lVert x\rVert _{2}\leq 1}$ is ${\displaystyle {\frac {z}{\lVert z\rVert _{2}}}}$.)
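The maximiser named above can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`dot`, `l2_norm`, `l2_dual_maximizer`) and the sample vector are illustrative assumptions, not part of any library.

```python
import math

def dot(z, x):
    """Inner product z^T x of two equal-length sequences."""
    return sum(zi * xi for zi, xi in zip(z, x))

def l2_norm(v):
    """Euclidean norm of v."""
    return math.sqrt(dot(v, v))

def l2_dual_maximizer(z):
    """Return x = z / ||z||_2, which maximises z^T x over ||x||_2 <= 1 (z nonzero)."""
    n = l2_norm(z)
    if n == 0:
        raise ValueError("z must be nonzero")
    return [zi / n for zi in z]

# Hypothetical sample vector with ||z||_2 = sqrt(9 + 16 + 144) = 13.
z = [3.0, -4.0, 12.0]
x_star = l2_dual_maximizer(z)
dual_value = dot(z, x_star)  # should agree with l2_norm(z) up to rounding
```

Any other feasible point, for instance a coordinate vector, gives a value no larger than `dual_value`, as the Cauchy–Schwarz inequality predicts.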

The dual of the ${\displaystyle \ell _{\infty }}$-norm is the ${\displaystyle \ell _{1}}$-norm:

${\displaystyle \sup\{z^{\intercal }x\;|\;\lVert x\rVert _{\infty }\leq 1\}=\sum _{i=1}^{n}|z_{i}|=\lVert z\rVert _{1},}$

and the dual of the ${\displaystyle \ell _{1}}$-norm is the ${\displaystyle \ell _{\infty }}$-norm.
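Both directions of this pairing can be verified by brute force in small dimensions, because a linear function attains its maximum over a polytope at a vertex. The sketch below enumerates the vertices of the two unit balls; the function names and the sample vector are assumptions made for illustration.

```python
from itertools import product

def dot(z, x):
    """Inner product z^T x of two equal-length sequences."""
    return sum(zi * xi for zi, xi in zip(z, x))

def sup_over_linf_ball(z):
    """sup of z^T x over the cube ||x||_inf <= 1, via its 2^n vertices (+/-1 entries)."""
    return max(dot(z, v) for v in product((-1.0, 1.0), repeat=len(z)))

def sup_over_l1_ball(z):
    """sup of z^T x over the ball ||x||_1 <= 1, via its 2n vertices +/- e_i."""
    n = len(z)
    best = float("-inf")
    for i in range(n):
        for sign in (-1.0, 1.0):
            e = [0.0] * n
            e[i] = sign
            best = max(best, dot(z, e))
    return best

# Hypothetical sample vector.
z = [2.0, -5.0, 0.5]
dual_of_linf = sup_over_linf_ball(z)  # equals sum of |z_i|, the l1-norm of z
dual_of_l1 = sup_over_l1_ball(z)      # equals max of |z_i|, the l-infinity norm of z
```

The vertex enumeration is exponential in n for the cube, so this is only a check for small examples, not a practical way to compute the norms.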

More generally, Hölder's inequality shows that the dual of the ${\displaystyle \ell _{p}}$-norm is the ${\displaystyle \ell _{q}}$-norm, where q satisfies ${\displaystyle {\frac {1}{p}}+{\frac {1}{q}}=1}$, i.e., ${\displaystyle q={\frac {p}{p-1}}.}$
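For 1 < p < ∞ the supremum is attained at the point where Hölder's inequality holds with equality, namely x with entries sign(z_i)|z_i|^(q-1) divided by ‖z‖_q^(q-1). The following sketch, under the assumption that z is nonzero and with illustrative helper names, builds that point and confirms it is feasible and attains the ℓ_q-norm of z.

```python
import math

def lp_norm(v, p):
    """The l_p norm of v for a finite exponent p >= 1."""
    return sum(abs(vi) ** p for vi in v) ** (1.0 / p)

def conjugate_exponent(p):
    """Return q with 1/p + 1/q = 1, for 1 < p < infinity."""
    if p <= 1.0:
        raise ValueError("this sketch only handles 1 < p < infinity")
    return p / (p - 1.0)

def holder_maximizer(z, p):
    """Point x with ||x||_p = 1 at which z^T x equals ||z||_q (z nonzero)."""
    q = conjugate_exponent(p)
    zq = lp_norm(z, q)
    if zq == 0.0:
        raise ValueError("z must be nonzero")
    scale = zq ** (q - 1.0)
    return [math.copysign(abs(zi) ** (q - 1.0), zi) / scale for zi in z]

# Hypothetical sample data.
z = [1.0, -2.0, 3.0]
p = 3.0
q = conjugate_exponent(p)  # 1.5
x_star = holder_maximizer(z, p)
attained = sum(zi * xi for zi, xi in zip(z, x_star))
```

Here `lp_norm(x_star, p)` should be 1 and `attained` should match `lp_norm(z, q)` up to floating-point rounding.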

As another example, consider the ${\displaystyle \ell _{2}}$- or spectral norm on ${\displaystyle \mathbb {R} ^{m\times n}}$. The associated dual norm is

${\displaystyle \lVert Z\rVert _{2*}=\sup\{\mathrm {\bf {tr}} (Z^{\intercal }X)|\;\lVert X\rVert _{2}\leq 1\},}$

which turns out to be the sum of the singular values,

${\displaystyle \lVert Z\rVert _{2*}=\sigma _{1}(Z)+\cdots +\sigma _{r}(Z)=\mathrm {\bf {tr}} \left((Z^{\intercal }Z)^{\frac {1}{2}}\right),}$

where ${\displaystyle r=\mathrm {\bf {rank}} \;Z}$. This norm is sometimes called the nuclear norm.[12]
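For real 2×2 matrices this can be illustrated without an SVD routine, since the singular values satisfy σ1² + σ2² = ‖Z‖_F² and σ1σ2 = |det Z|. The sketch below relies on that closed form and is restricted to the 2×2 case; all function names and the sample matrix are illustrative assumptions. It exhibits an orthogonal matrix X (so ‖X‖_2 = 1) for which tr(ZᵀX) equals the sum of the singular values. It does not by itself show that no feasible X does better; that is the content of the duality statement.

```python
import math

def singular_values_2x2(Z):
    """Singular values (s1 >= s2) of a real 2x2 matrix, in closed form."""
    (a, b), (c, d) = Z
    fro_sq = a * a + b * b + c * c + d * d  # s1^2 + s2^2
    det = a * d - b * c                     # |det| = s1 * s2
    s_sum = math.sqrt(fro_sq + 2.0 * abs(det))              # s1 + s2
    s_diff = math.sqrt(max(fro_sq - 2.0 * abs(det), 0.0))   # s1 - s2
    return (s_sum + s_diff) / 2.0, (s_sum - s_diff) / 2.0

def trace_inner(Z, X):
    """tr(Z^T X), i.e. the entrywise inner product of Z and X."""
    return sum(Z[i][j] * X[i][j] for i in range(2) for j in range(2))

def best_orthogonal(Z):
    """Orthogonal 2x2 matrix maximising tr(Z^T X): best rotation or best reflection."""
    (a, b), (c, d) = Z
    # Rotation [[cos, -sin], [sin, cos]] gives tr = (a + d) cos + (c - b) sin.
    t = math.atan2(c - b, a + d)
    rot = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
    # Reflection [[cos, sin], [sin, -cos]] gives tr = (a - d) cos + (b + c) sin.
    u = math.atan2(b + c, a - d)
    ref = [[math.cos(u), math.sin(u)], [math.sin(u), -math.cos(u)]]
    return max(rot, ref, key=lambda X: trace_inner(Z, X))

# Hypothetical sample matrix with singular values 3*sqrt(5) and sqrt(5).
Z = [[3.0, 0.0], [4.0, 5.0]]
s1, s2 = singular_values_2x2(Z)
nuclear = s1 + s2                  # sum of the singular values
X_star = best_orthogonal(Z)
attained = trace_inner(Z, X_star)  # should agree with nuclear up to rounding
```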

## Examples

### Dual norm for matrices

The Frobenius norm defined by
${\displaystyle \left\|A\right\|_{\text{F}}={\sqrt {\sum _{i=1}^{m}\sum _{j=1}^{n}\left|a_{ij}\right|^{2}}}={\sqrt {\operatorname {trace} (A^{{}^{*}}A)}}={\sqrt {\sum _{i=1}^{\min\{m,\,n\}}\sigma _{i}^{2}}}}$
is self-dual, i.e., its dual norm is ${\displaystyle \left\|\cdot \right\|'_{\text{F}}=\left\|\cdot \right\|_{\text{F}}}$.
The spectral norm, the special case ${\displaystyle p=2}$ of the induced norm, is the maximum singular value of a matrix, i.e.,
${\displaystyle \left\|A\right\|_{2}=\sigma _{\max }(A).}$
Its dual norm is the nuclear norm, defined by ${\displaystyle \|B\|'_{2}=\sum _{i}\sigma _{i}(B)}$ for any matrix ${\displaystyle B}$, where ${\displaystyle \sigma _{i}(B)}$ denote the singular values.

## Notes

1. ^ Each ${\displaystyle L(X,Y)}$ is a vector space, with the usual definitions of addition and scalar multiplication of functions; this only depends on the vector space structure of ${\displaystyle Y}$, not ${\displaystyle X}$.
2. ^ Rudin 1991, p. 92
3. ^ Rudin 1991, p. 93
4. ^ Rudin 1991, p. 93
5. ^ Aliprantis 2005, p. 230
   6.7 Definition: The norm dual ${\displaystyle X^{*}}$ of a normed space ${\displaystyle (X,||\cdot ||)}$ is the Banach space ${\displaystyle L(X,\mathbb {R} )}$. The operator norm on ${\displaystyle X^{*}}$ is also called the dual norm, also denoted ${\displaystyle ||\cdot ||}$. That is,
   ${\displaystyle ||x^{*}||=\sup _{||x||\leq 1}|\langle {x^{*},x}\rangle |=\sup _{||x||=1}|\langle {x^{*},x}\rangle |}$
   The dual space is indeed a Banach space by Theorem 6.6.
6. ^ Rudin 1991, Theorem 3.3 Corollary, p. 59
7. ^ Rudin 1991, Theorem 3.15 The Banach–Alaoglu theorem, p. 68
8. ^ Rudin 1991, p. 94
9. ^ Rudin 1991, Theorem 4.5 The second dual of a Banach space, p. 95
10. ^ Rudin 1991, p. 95
11. ^ This inequality is tight, in the following sense: for any x there is a z for which the inequality holds with equality. (Similarly, for any z there is an x that gives equality.)
12. ^