In functional analysis, the dual norm is a measure of the "size" of each continuous linear functional defined on a normed vector space.
Definition
Let $X$ be a normed vector space with norm $\|\cdot\|$ and let $X^{*}$ be the dual space. The dual norm of a continuous linear functional $f$ belonging to $X^{*}$ is defined to be the real number
$$\|f\| := \sup\{|f(x)| : x \in X, \|x\| \leq 1\}$$
where $\sup$ denotes the supremum.[1]
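The supremum in this definition can be approximated numerically in a small case. The following is a minimal sketch, assuming $X = \mathbb{R}^{2}$ and the functional $f(x) = 3x_{1} + 4x_{2}$; the helper name `approx_dual_norm` and the sampling scheme are illustrative choices, not taken from the cited sources. Because $|f|$ is positively homogeneous, it is enough to sample points of norm exactly 1.

```python
import math

def approx_dual_norm(z, norm, samples=3600):
    """Approximate sup{|z.x| : norm(x) <= 1} on R^2 by sampling directions."""
    best = 0.0
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        d = (math.cos(angle), math.sin(angle))
        scale = norm(d)                        # positive because d is nonzero
        x = (d[0] / scale, d[1] / scale)       # rescaled so that norm(x) == 1
        best = max(best, abs(z[0] * x[0] + z[1] * x[1]))
    return best

def l1(x):
    return abs(x[0]) + abs(x[1])

def l2(x):
    return math.hypot(x[0], x[1])

def linf(x):
    return max(abs(x[0]), abs(x[1]))

z = (3.0, 4.0)  # coefficients of the functional f(x) = 3*x1 + 4*x2
estimates = {name: approx_dual_norm(z, norm)
             for name, norm in (("l1", l1), ("l2", l2), ("linf", linf))}
print(estimates)
```

The same functional gets a different "size" depending on the norm chosen on $X$: the estimates should come out close to 4, 5 and 7 for the $\ell_{1}$, $\ell_{2}$ and $\ell_{\infty}$ norms respectively. Sampling only gives a lower bound on the supremum, so this is a sanity check rather than a proof.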
The map $f \mapsto \|f\|$ defines a norm on $X^{*}$. (See Theorems 1 and 2 below.)
The dual norm is a special case of the operator norm defined for each (bounded) linear map between normed vector spaces.
The topology on $X^{*}$ induced by $\|\cdot\|$ is at least as strong as (that is, finer than) the weak-* topology on $X^{*}$.
If the ground field of $X$ is complete then $X^{*}$ is a Banach space.
The double dual of a normed linear space
The double dual (or second dual) $X^{**}$ of $X$ is the dual of the normed vector space $X^{*}$.
There is a natural map $\varphi : X \to X^{**}$.
Indeed, for each $v$ in $X$ and each $w^{*}$ in $X^{*}$, define
$$\varphi(v)(w^{*}) := w^{*}(v).$$
The map $\varphi$ is linear, injective, and distance preserving.[2]
In particular, if $X$ is complete (i.e. a Banach space), then $\varphi$ is an isometry onto a closed subspace of $X^{**}$.[3]
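The natural map is simply "evaluation at a point", which can be made concrete with closures. The following is a minimal sketch, assuming $X = \mathbb{R}^{n}$ with functionals given by coefficient tuples; the names `functional` and `phi` are hypothetical and chosen for this illustration only.

```python
def functional(coeffs):
    """Return the linear functional w*(v) = sum_i coeffs[i] * v[i] on R^n."""
    def w_star(v):
        return sum(c * vi for c, vi in zip(coeffs, v))
    return w_star

def phi(v):
    """Canonical map: phi(v) is the element of X** that evaluates functionals at v."""
    def evaluate_at_v(w_star):
        return w_star(v)
    return evaluate_at_v

v = (1.0, -2.0, 0.5)
w_star = functional((2.0, 1.0, 4.0))
value = phi(v)(w_star)   # by definition this is w_star(v)
print(value)
```

Here `phi(v)` takes a functional as its argument and returns a scalar, which is exactly what an element of the double dual does.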
In general, the map $\varphi$ is not surjective. For example, if $X$ is the Banach space $L^{\infty}$ consisting of essentially bounded measurable functions on the real line with the essential supremum norm, then the map $\varphi$ is not surjective. (See $L^{p}$ space.)
If $\varphi$ is surjective, then $X$ is said to be a reflexive Banach space.
If $1 < p < \infty$, then the space $L^{p}$ is a reflexive Banach space.
Mathematical optimization
Let $\|\cdot\|$ be a norm on $\mathbb{R}^{n}$. The associated dual norm, denoted $\|\cdot\|_{*}$, is defined as
$$\|z\|_{*} = \sup\{z^{\intercal}x \mid \|x\| \leq 1\}.$$
(This can be shown to be a norm.) The dual norm can be interpreted as the operator norm of $z^{\intercal}$, interpreted as a $1 \times n$ matrix, with the norm $\|\cdot\|$ on $\mathbb{R}^{n}$, and the absolute value on $\mathbb{R}$:
$$\|z\|_{*} = \sup\{|z^{\intercal}x| \mid \|x\| \leq 1\}.$$
From the definition of dual norm we have the inequality
$$z^{\intercal}x = \|x\|\left(z^{\intercal}\frac{x}{\|x\|}\right) \leq \|x\|\,\|z\|_{*}$$
which holds for all $x$ and $z$ (the middle expression requires $x \neq 0$; for $x = 0$ both sides are zero).[4] The dual of the dual norm is the original norm: we have $\|x\|_{**} = \|x\|$ for all $x$. (This need not hold in infinite-dimensional vector spaces.)
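The inequality and its tightness can be checked numerically for one concrete pair of norms. The following is a minimal sketch, assuming the $\ell_{1}$ norm on $\mathbb{R}^{n}$ with the $\ell_{\infty}$ norm as its dual; the helper names and the random test data are illustrative.

```python
import random

def dot(z, x):
    return sum(a * b for a, b in zip(z, x))

def norm_l1(x):
    return sum(abs(t) for t in x)

def norm_linf(x):
    return max(abs(t) for t in x)

def inequality_holds(z, x, norm, dual_norm, tol=1e-12):
    """Check z.x <= norm(x) * dual_norm(z), up to rounding error."""
    return dot(z, x) <= norm(x) * dual_norm(z) + tol

rng = random.Random(0)  # fixed seed so the check is reproducible
trials = [([rng.uniform(-1, 1) for _ in range(5)],
           [rng.uniform(-1, 1) for _ in range(5)]) for _ in range(1000)]
all_ok = all(inequality_holds(z, x, norm_l1, norm_linf) for z, x in trials)

# Tightness: for a given x, the sign pattern of x gives equality.
x_tight = [0.5, -2.0, 0.0, 1.5]
z_tight = [(t > 0) - (t < 0) for t in x_tight]   # componentwise sign of x
print(all_ok, dot(z_tight, x_tight), norm_l1(x_tight) * norm_linf(z_tight))
```

A random search like this can only fail to find a counterexample; the equality case with `z_tight` illustrates the sense in which the bound cannot be improved.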
The dual of the Euclidean norm is the Euclidean norm, since
$$\sup\{z^{\intercal}x \mid \|x\|_{2} \leq 1\} = \|z\|_{2}.$$
(This follows from the Cauchy–Schwarz inequality; for nonzero $z$, the value of $x$ that maximises $z^{\intercal}x$ over $\|x\|_{2} \leq 1$ is $\frac{z}{\|z\|_{2}}$.)
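The maximiser named in the parenthetical can be verified directly. The following is a minimal sketch with an arbitrarily chosen vector; the function names are illustrative.

```python
import math

def norm_l2(x):
    return math.sqrt(sum(t * t for t in x))

def l2_maximizer(z):
    """Return z / ||z||_2, the point of the Euclidean unit ball maximising z.x."""
    n = norm_l2(z)
    if n == 0.0:
        raise ValueError("z must be nonzero")
    return [t / n for t in z]

z = [3.0, 4.0, 12.0]
x_star = l2_maximizer(z)
attained = sum(a * b for a, b in zip(z, x_star))
print(attained, norm_l2(z), norm_l2(x_star))
```

The value `attained` agrees with `norm_l2(z)` up to rounding, and `x_star` has Euclidean norm 1, so the supremum is reached on the unit sphere.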
The dual of the $\ell_{\infty}$-norm is the $\ell_{1}$-norm:
$$\sup\{z^{\intercal}x \mid \|x\|_{\infty} \leq 1\} = \sum_{i=1}^{n}|z_{i}| = \|z\|_{1},$$
and the dual of the $\ell_{1}$-norm is the $\ell_{\infty}$-norm.
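Both suprema can be computed by brute force, because a linear function on a polytope attains its maximum at a vertex. The following is a minimal sketch under that observation: the $\ell_{\infty}$ unit ball has the $2^{n}$ sign vectors as vertices, and the $\ell_{1}$ unit ball has the $2n$ signed coordinate vectors as vertices. The function names are illustrative.

```python
from itertools import product

def sup_over_linf_ball(z):
    """sup{z.x : ||x||_inf <= 1}, by enumerating the 2^n vertices of the cube."""
    return max(sum(s * t for s, t in zip(signs, z))
               for signs in product((-1.0, 1.0), repeat=len(z)))

def sup_over_l1_ball(z):
    """sup{z.x : ||x||_1 <= 1}; at the vertices +e_i and -e_i the value is +z_i or -z_i."""
    return max(max(t, -t) for t in z)

z = [2.0, -3.0, 0.5]
print(sup_over_linf_ball(z), sum(abs(t) for t in z))   # compare with the l1 norm of z
print(sup_over_l1_ball(z), max(abs(t) for t in z))     # compare with the l-infinity norm of z
```

Maximising over the $\ell_{\infty}$ ball reproduces $\|z\|_{1}$, and maximising over the $\ell_{1}$ ball reproduces $\|z\|_{\infty}$. The enumeration is exponential in $n$, so it is only suitable as a small-scale check.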
More generally, Hölder's inequality shows that the dual of the $\ell_{p}$-norm is the $\ell_{q}$-norm, where $q$ satisfies $\frac{1}{p} + \frac{1}{q} = 1$, i.e.,
$$q = \frac{p}{p-1}.$$
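The equality case of Hölder's inequality gives an explicit maximiser. The following is a minimal sketch, assuming $1 < p < \infty$ and nonzero $z$: the vector with components $x_{i} = \operatorname{sign}(z_{i})\,|z_{i}|^{q-1} / \|z\|_{q}^{q-1}$ has $\|x\|_{p} = 1$ (since $(q-1)p = q$) and satisfies $z^{\intercal}x = \|z\|_{q}$. The function names are illustrative.

```python
import math

def p_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def conjugate_exponent(p):
    """Return q with 1/p + 1/q = 1, for 1 < p < infinity."""
    if p <= 1.0:
        raise ValueError("requires 1 < p < infinity")
    return p / (p - 1.0)

def holder_maximizer(z, p):
    """Point x with ||x||_p = 1 that attains z.x = ||z||_q (z must be nonzero)."""
    q = conjugate_exponent(p)
    scale = p_norm(z, q) ** (q - 1.0)
    return [math.copysign(abs(t) ** (q - 1.0), t) / scale for t in z]

p = 3.0
q = conjugate_exponent(p)            # 1.5
z = [1.0, -2.0, 3.0]
x_star = holder_maximizer(z, p)
attained = sum(a * b for a, b in zip(z, x_star))
print(q, p_norm(x_star, p), attained, p_norm(z, q))
```

Up to rounding, `p_norm(x_star, p)` is 1 and `attained` matches `p_norm(z, q)`.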
As another example, consider the $\ell_{2}$- or spectral norm on $\mathbb{R}^{m \times n}$. The associated dual norm is
$$\|Z\|_{2*} = \sup\{\operatorname{tr}(Z^{\intercal}X) \mid \|X\|_{2} \leq 1\},$$
which turns out to be the sum of the singular values,
$$\|Z\|_{2*} = \sigma_{1}(Z) + \cdots + \sigma_{r}(Z) = \operatorname{tr}\left((Z^{\intercal}Z)^{1/2}\right),$$
where $r = \operatorname{rank} Z$. This norm is sometimes called the nuclear norm.[5]
Examples
Dual norm for matrices
The Frobenius norm defined by
$$\|A\|_{\text{F}} = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n}|a_{ij}|^{2}} = \sqrt{\operatorname{trace}(A^{*}A)} = \sqrt{\sum_{i=1}^{\min\{m,n\}}\sigma_{i}^{2}}$$
is self-dual, i.e., its dual norm is $\|\cdot\|'_{\text{F}} = \|\cdot\|_{\text{F}}$.
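Self-duality of the Frobenius norm can be illustrated in the same way as for the Euclidean norm. The following is a minimal sketch, assuming real matrices stored as lists of rows and the pairing $\operatorname{tr}(Z^{\intercal}X)$, which equals the sum of entrywise products; the names are illustrative.

```python
import math

def frobenius(A):
    return math.sqrt(sum(a * a for row in A for a in row))

def trace_inner(Z, X):
    """tr(Z^T X) for real matrices, computed as the sum of entrywise products."""
    return sum(z * x for z_row, x_row in zip(Z, X) for z, x in zip(z_row, x_row))

Z = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
norm_Z = frobenius(Z)
X_star = [[z / norm_Z for z in row] for row in Z]   # Z scaled to Frobenius norm 1
attained = trace_inner(Z, X_star)
print(attained, norm_Z, frobenius(X_star))
```

The scaled matrix `X_star` has Frobenius norm 1 and the pairing with `Z` returns `frobenius(Z)` up to rounding; by the Cauchy–Schwarz inequality no other matrix in the unit ball does better.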
The spectral norm, a special case of the induced norm when $p = 2$, is defined by the maximum singular value of a matrix, i.e.,
$$\|A\|_{2} = \sigma_{\max}(A),$$
and has the nuclear norm as its dual norm, which is defined by
$$\|B\|'_{2} = \sum_{i}\sigma_{i}(B)$$
for any matrix $B$, where $\sigma_{i}(B)$ denote the singular values.
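For $2 \times 2$ real matrices the singular values have a closed form, which allows a dependency-free check of the spectral and nuclear norms. The following is a minimal sketch, assuming the identities $\sigma_{1}^{2} + \sigma_{2}^{2} = \|A\|_{\text{F}}^{2}$ and $\sigma_{1}\sigma_{2} = |\det A|$; the function names are illustrative and the approach does not extend to larger matrices.

```python
import math
import random

def singular_values_2x2(A):
    """Singular values of a real 2x2 matrix via the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    frob_sq = a * a + b * b + c * c + d * d          # s1^2 + s2^2
    det = a * d - b * c                              # |det| = s1 * s2
    disc = math.sqrt(max(frob_sq * frob_sq - 4.0 * det * det, 0.0))
    s1 = math.sqrt((frob_sq + disc) / 2.0)
    s2 = math.sqrt(max((frob_sq - disc) / 2.0, 0.0))
    return s1, s2

def spectral_norm(A):
    return singular_values_2x2(A)[0]

def nuclear_norm(A):
    return sum(singular_values_2x2(A))

def trace_inner(Z, X):
    """tr(Z^T X), computed as the sum of entrywise products."""
    return sum(z * x for z_row, x_row in zip(Z, X) for z, x in zip(z_row, x_row))

Z = [[3.0, 0.0], [0.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]                  # spectral norm 1
print(spectral_norm(Z), nuclear_norm(Z), trace_inner(Z, identity))

rng = random.Random(1)  # fixed seed for reproducibility
def random_matrix():
    return [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
pairs = [(random_matrix(), random_matrix()) for _ in range(1000)]
bound_ok = all(trace_inner(P, Q) <= spectral_norm(Q) * nuclear_norm(P) + 1e-9
               for P, Q in pairs)
```

For the diagonal matrix `Z`, the identity matrix lies in the spectral-norm unit ball and attains the nuclear norm of `Z` in the pairing, while the random trials probe the bound $\operatorname{tr}(Z^{\intercal}X) \leq \|X\|_{2}\,\|Z\|'_{2}$.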
Some basic results about the operator norm
More generally, let $X$ and $Y$ be topological vector spaces, and let $L(X,Y)$[6] be the collection of all bounded linear mappings (or operators) of $X$ into $Y$. In the case where $X$ and $Y$ are normed vector spaces, $L(X,Y)$ can be normed in a natural way.
When $Y$ is the scalar field (i.e. $Y = \mathbb{C}$ or $Y = \mathbb{R}$), $L(X,Y)$ is the dual space $X^{*}$ of $X$.
Theorem 1: Let $X$ and $Y$ be normed spaces, and associate to each $f \in L(X,Y)$ the number
$$\|f\| = \sup\{\|f(x)\| : x \in X, \|x\| \leq 1\}.$$
With this definition of $\|f\|$, $L(X,Y)$ is a normed space. If $Y$ is a Banach space, so is $L(X,Y)$.[7] The proof first verifies the norm axioms (using the triangle inequality in $Y$) and then establishes completeness (using Cauchy sequences).
Proof:
A subset of a normed space is bounded if and only if it lies in some multiple of the unit ball; thus $\|f\| < \infty$ for every $f \in L(X,Y)$. If $\alpha$ is a scalar, then $(\alpha f)(x) = \alpha \cdot fx$, so that
$$\|\alpha f\| = |\alpha|\,\|f\|.$$
The triangle inequality in $Y$ shows that
$$\begin{aligned}\|(f_{1}+f_{2})x\| &= \|f_{1}x + f_{2}x\| \leq \|f_{1}x\| + \|f_{2}x\| \\ &\leq (\|f_{1}\| + \|f_{2}\|)\|x\| \leq \|f_{1}\| + \|f_{2}\|\end{aligned}$$
for every $x \in X$ with $\|x\| \leq 1$. Thus
$$\|f_{1} + f_{2}\| \leq \|f_{1}\| + \|f_{2}\|.$$
If $f \neq 0$, then $fx \neq 0$ for some $x \in X$; hence $\|f\| > 0$. Thus, $L(X,Y)$ is a normed space.[8]
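The norm axioms verified in this part of the proof can be observed on a small finite-dimensional instance. The following is a minimal sketch, assuming $X = Y = \mathbb{R}^{2}$ with the $\ell_{\infty}$ norm and operators given as $2 \times 2$ matrices; the supremum is computed by enumerating the vertices of the unit cube, which suffices because $x \mapsto \|Ax\|_{\infty}$ is convex. All names are illustrative.

```python
from itertools import product

def apply(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def linf(x):
    return max(abs(t) for t in x)

def operator_norm(A):
    """sup{||Ax||_inf : ||x||_inf <= 1}, taken over the vertices of the unit cube."""
    n = len(A[0])
    return max(linf(apply(A, v)) for v in product((-1.0, 1.0), repeat=n))

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(alpha, A):
    return [[alpha * a for a in row] for row in A]

F1 = [[1.0, -2.0], [0.5, 3.0]]
F2 = [[-1.0, 4.0], [2.0, -3.0]]

homogeneous = operator_norm(scale(-2.0, F1)) == 2.0 * operator_norm(F1)
triangle = operator_norm(add(F1, F2)) <= operator_norm(F1) + operator_norm(F2)
positive = operator_norm(F1) > 0.0
print(homogeneous, triangle, positive)
```

This only exercises the three axioms on particular matrices; the argument in the proof is what establishes them in general.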
Assume now that $Y$ is complete, and that $\{f_{n}\}$ is a Cauchy sequence in $L(X,Y)$.
Since
$$\|f_{n}x - f_{m}x\| \leq \|f_{n} - f_{m}\|\,\|x\|$$
and it is assumed that $\|f_{n} - f_{m}\| \to 0$ as $n$ and $m$ tend to $\infty$, $\{f_{n}x\}$ is a Cauchy sequence in $Y$ for every $x \in X$.
Hence
$$fx = \lim_{n \to \infty} f_{n}x$$
exists. It is clear that $f : X \to Y$ is linear. If $\varepsilon > 0$, then $\|f_{n} - f_{m}\|\,\|x\| \leq \varepsilon\|x\|$ for sufficiently large $n$ and $m$. It follows that
$$\|fx - f_{m}x\| \leq \varepsilon\|x\|$$
for sufficiently large $m$.
Hence $\|fx\| \leq (\|f_{m}\| + \varepsilon)\|x\|$, so that $f \in L(X,Y)$ and $\|f - f_{m}\| \leq \varepsilon$.
Thus $f_{m} \to f$ in the norm of $L(X,Y)$. This establishes the completeness of $L(X,Y)$.[9]
Theorem 2: Now suppose $B$ is the closed unit ball of the normed space $X$. Define
$$\|x^{*}\| = \sup\{|\langle x, x^{*}\rangle| : x \in B\}$$
for every $x^{*} \in X^{*}$.
(a) This norm makes $X^{*}$ into a Banach space.[10]
(b) Let $B^{*}$ be the closed unit ball of $X^{*}$. For every $x \in X$,
$$\|x\| = \sup\{|\langle x, x^{*}\rangle| : x^{*} \in B^{*}\}.$$
Consequently, $x^{*} \mapsto \langle x, x^{*}\rangle$ is a bounded linear functional on $X^{*}$, of norm $\|x\|$.
(c) $B^{*}$ is weak*-compact.
Proof:
Since $L(X,Y) = X^{*}$ when $Y$ is the scalar field, (a) is a corollary of Theorem 1.
Fix $x \in X$. There exists[11] $y^{*} \in B^{*}$ such that
$$\langle x, y^{*}\rangle = \|x\|.$$
But
$$|\langle x, x^{*}\rangle| \leq \|x\|\,\|x^{*}\| \leq \|x\|$$
for every $x^{*} \in B^{*}$. (b) follows from the above.
Since the open unit ball $U$ of $X$ is dense in $B$, the definition of $\|x^{*}\|$ shows that $x^{*} \in B^{*}$ if and only if $|\langle x, x^{*}\rangle| \leq 1$ for every $x \in U$.
The proof for (c)[12] now follows directly.[13]
Notes
^ Rudin 1991, p. 87
^ Rudin 1991, section 4.5, p. 95
^ Rudin 1991, p. 95
^ This inequality is tight, in the following sense: for any x there is a z for which the inequality holds with equality. (Similarly, for any z there is an x that gives equality.)
^ Boyd & Vandenberghe 2004, p. 637
^ Each $L(X,Y)$ is a vector space, with the usual definitions of addition and scalar multiplication of functions; this only depends on the vector space structure of $Y$, not $X$.
^ Rudin 1991, p. 92
^ Rudin 1991, p. 93
^ Rudin 1991, p. 93
^ Aliprantis 2005, p. 230. 6.7 Definition: The norm dual $X^{*}$ of a normed space $(X, \|\cdot\|)$ is the Banach space $L(X, \mathbb{R})$. The operator norm on $X^{*}$ is also called the dual norm, also denoted $\|\cdot\|$. That is,
$$\|x^{*}\| = \sup_{\|x\| \leq 1}|\langle x^{*}, x\rangle| = \sup_{\|x\| = 1}|\langle x^{*}, x\rangle|.$$
The dual space is indeed a Banach space by Theorem 6.6.
^ Rudin 1991, Theorem 3.3 Corollary, p. 59
^ Rudin 1991, Theorem 3.15 (the Banach–Alaoglu theorem), p. 68
^ Rudin 1991, p. 94
References
Aliprantis, Charalambos D.; Border, Kim C. (2007). Infinite Dimensional Analysis: A Hitchhiker's Guide (3rd ed.). Springer. ISBN 9783540326960.
Boyd, Stephen; Vandenberghe, Lieven (2004). Convex Optimization. Cambridge University Press. ISBN 9780521833783.
Kolmogorov, A.N.; Fomin, S.V. (1957). Elements of the Theory of Functions and Functional Analysis, Volume 1: Metric and Normed Spaces. Rochester: Graylock Press.
Rudin, Walter (1991). Functional Analysis. McGraw-Hill Science. ISBN 978-0-07-054236-5.