Let
X
{\displaystyle X}
be a real topological vector space and let
X
∗
{\displaystyle X^{*}}
be the dual space to
X
{\displaystyle X}
. Denote by
⟨
⋅
,
⋅
⟩
:
X
∗
×
X
→
R
{\displaystyle \langle \cdot ,\cdot \rangle :X^{*}\times X\to \mathbb {R} }
the canonical dual pairing , which is defined by
(
x
∗
,
x
)
↦
x
∗
(
x
)
.
{\displaystyle \left(x^{*},x\right)\mapsto x^{*}(x ).}
For a function
f
:
X
→
R
∪
{
−
∞
,
+
∞
}
{\displaystyle f:X\to \mathbb {R} \cup \{-\infty ,+\infty \}}
taking values on the extended real number line , its convex conjugate is the function
f
∗
:
X
∗
→
R
∪
{
−
∞
,
+
∞
}
{\displaystyle f^{*}:X^{*}\to \mathbb {R} \cup \{-\infty ,+\infty \}}
whose value at
x
∗
∈
X
∗
{\displaystyle x^{*}\in X^{*}}
is defined to be the supremum :
f
∗
(
x
∗
)
:=
sup
{
⟨
x
∗
,
x
⟩
−
f
(
x
)
:
x
∈
X
}
,
{\displaystyle f^{*}\left(x^{*}\right):=\sup \left\{\left\langle x^{*},x\right\rangle -f(x )~\colon ~x\in X\right\},}
or, equivalently, in terms of the infimum :
f
∗
(
x
∗
)
:=
−
inf
{
f
(
x
)
−
⟨
x
∗
,
x
⟩
:
x
∈
X
}
.
{\displaystyle f^{*}\left(x^{*}\right):=-\inf \left\{f(x )-\left\langle x^{*},x\right\rangle ~\colon ~x\in X\right\}.}
This definition can be interpreted as an encoding of the convex hull of the function's epigraph in terms of its supporting hyperplanes .[1 ]
For more examples, see § Table of selected convex conjugates .
The convex conjugate of an affine function
f
(
x
)
=
⟨
a
,
x
⟩
−
b
{\displaystyle f(x )=\left\langle a,x\right\rangle -b}
is
f
∗
(
x
∗
)
=
{
b
,
x
∗
=
a
+
∞
,
x
∗
≠
a
.
{\displaystyle f^{*}\left(x^{*}\right)={\begin{cases}b,&x^{*}=a\\+\infty ,&x^{*}\neq a.\end{cases}}}
The convex conjugate of a power function
f
(
x
)
=
1
p
|
x
|
p
,
1
<
p
<
∞
{\displaystyle f(x )={\frac {1}{p}}|x|^{p},1<p<\infty }
is
f
∗
(
x
∗
)
=
1
q
|
x
∗
|
q
,
1
<
q
<
∞
,
where
1
p
+
1
q
=
1.
{\displaystyle f^{*}\left(x^{*}\right)={\frac {1}{q}}|x^{*}|^{q},1<q<\infty ,{\text{where}}{\tfrac {1}{p}}+{\tfrac {1}{q}}=1.}
The convex conjugate of the absolute value function
f
(
x
)
=
|
x
|
{\displaystyle f(x )=\left|x\right|}
is
f
∗
(
x
∗
)
=
{
0
,
|
x
∗
|
≤
1
∞
,
|
x
∗
|
>
1.
{\displaystyle f^{*}\left(x^{*}\right)={\begin{cases}0,&\left|x^{*}\right|\leq 1\\\infty ,&\left|x^{*}\right|>1.\end{cases}}}
The convex conjugate of the exponential function
f
(
x
)
=
e
x
{\displaystyle f(x )=e^{x}}
is
f
∗
(
x
∗
)
=
{
x
∗
ln
x
∗
−
x
∗
,
x
∗
>
0
0
,
x
∗
=
0
∞
,
x
∗
<
0.
{\displaystyle f^{*}\left(x^{*}\right)={\begin{cases}x^{*}\ln x^{*}-x^{*},&x^{*}>0\\0,&x^{*}=0\\\infty ,&x^{*}<0.\end{cases}}}
The convex conjugate and Legendre transform of the exponential function agree except that the domain of the convex conjugate is strictly larger as the Legendre transform is only defined for positive real numbers.
Connection with expected shortfall (average value at risk)
edit
See this article for example.
Let F denote a cumulative distribution function of a random variable X . Then (integrating by parts),
f
(
x
)
:=
∫
−
∞
x
F
(
u
)
d
u
=
E
[
max
(
0
,
x
−
X
)
]
=
x
−
E
[
min
(
x
,
X
)
]
{\displaystyle f(x ):=\int _{-\infty }^{x}F(u )\,du=\operatorname {E} \left[\max(0,x-X)\right]=x-\operatorname {E} \left[\min(x,X)\right]}
has the convex conjugate
f
∗
(
p
)
=
∫
0
p
F
−
1
(
q
)
d
q
=
(
p
−
1
)
F
−
1
(
p
)
+
E
[
min
(
F
−
1
(
p
)
,
X
)
]
=
p
F
−
1
(
p
)
−
E
[
max
(
0
,
F
−
1
(
p
)
−
X
)
]
.
{\displaystyle f^{*}(p )=\int _{0}^{p}F^{-1}(q )\,dq=(p-1)F^{-1}(p )+\operatorname {E} \left[\min(F^{-1}(p ),X)\right]=pF^{-1}(p )-\operatorname {E} \left[\max(0,F^{-1}(p )-X)\right].}
Ordering
edit
A particular interpretation has the transform
f
inc
(
x
)
:=
arg
sup
t
t
⋅
x
−
∫
0
1
max
{
t
−
f
(
u
)
,
0
}
d
u
,
{\displaystyle f^{\text{inc}}(x ):=\arg \sup _{t}t\cdot x-\int _{0}^{1}\max\{t-f(u ),0\}\,du,}
as this is a nondecreasing rearrangement of the initial function f ; in particular,
f
inc
=
f
{\displaystyle f^{\text{inc}}=f}
for f nondecreasing.
The convex conjugate of a closed convex function is again a closed convex function. The convex conjugate of a polyhedral convex function (a convex function with polyhedral epigraph ) is again a polyhedral convex function.
Order reversing
edit
Declare that
f
≤
g
{\displaystyle f\leq g}
if and only if
f
(
x
)
≤
g
(
x
)
{\displaystyle f(x )\leq g(x )}
for all
x
.
{\displaystyle x.}
Then convex-conjugation is order-reversing , which by definition means that if
f
≤
g
{\displaystyle f\leq g}
then
f
∗
≥
g
∗
.
{\displaystyle f^{*}\geq g^{*}.}
For a family of functions
(
f
α
)
α
{\displaystyle \left(f_{\alpha }\right)_{\alpha }}
it follows from the fact that supremums may be interchanged that
(
inf
α
f
α
)
∗
(
x
∗
)
=
sup
α
f
α
∗
(
x
∗
)
,
{\displaystyle \left(\inf _{\alpha }f_{\alpha }\right)^{*}(x^{*})=\sup _{\alpha }f_{\alpha }^{*}(x^{*}),}
and from the max–min inequality that
(
sup
α
f
α
)
∗
(
x
∗
)
≤
inf
α
f
α
∗
(
x
∗
)
.
{\displaystyle \left(\sup _{\alpha }f_{\alpha }\right)^{*}(x^{*})\leq \inf _{\alpha }f_{\alpha }^{*}(x^{*}).}
Biconjugate
edit
The convex conjugate of a function is always lower semi-continuous . The biconjugate
f
∗
∗
{\displaystyle f^{**}}
(the convex conjugate of the convex conjugate) is also the closed convex hull , i.e. the largest lower semi-continuous convex function with
f
∗
∗
≤
f
.
{\displaystyle f^{**}\leq f.}
For proper functions
f
,
{\displaystyle f,}
f
=
f
∗
∗
{\displaystyle f=f^{**}}
if and only if
f
{\displaystyle f}
is convex and lower semi-continuous, by the Fenchel–Moreau theorem .
Fenchel's inequality
edit
For any function f and its convex conjugate f * , Fenchel's inequality (also known as the Fenchel–Young inequality ) holds for every
x
∈
X
{\displaystyle x\in X}
and
p
∈
X
∗
{\displaystyle p\in X^{*}}
:
⟨
p
,
x
⟩
≤
f
(
x
)
+
f
∗
(
p
)
.
{\displaystyle \left\langle p,x\right\rangle \leq f(x )+f^{*}(p ).}
Furthermore, the equality holds only when
p
∈
∂
f
(
x
)
{\displaystyle p\in \partial f(x )}
.
The proof follows from the definition of convex conjugate:
f
∗
(
p
)
=
sup
x
~
{
⟨
p
,
x
~
⟩
−
f
(
x
~
)
}
≥
⟨
p
,
x
⟩
−
f
(
x
)
.
{\displaystyle f^{*}(p )=\sup _{\tilde {x}}\left\{\langle p,{\tilde {x}}\rangle -f({\tilde {x}})\right\}\geq \langle p,x\rangle -f(x ).}
Convexity
edit
For two functions
f
0
{\displaystyle f_{0}}
and
f
1
{\displaystyle f_{1}}
and a number
0
≤
λ
≤
1
{\displaystyle 0\leq \lambda \leq 1}
the convexity relation
(
(
1
−
λ
)
f
0
+
λ
f
1
)
∗
≤
(
1
−
λ
)
f
0
∗
+
λ
f
1
∗
{\displaystyle \left((1-\lambda )f_{0}+\lambda f_{1}\right)^{*}\leq (1-\lambda )f_{0}^{*}+\lambda f_{1}^{*}}
holds. The
∗
{\displaystyle {*}}
operation is a convex mapping itself.
Infimal convolution
edit
The infimal convolution (or epi-sum) of two functions
f
{\displaystyle f}
and
g
{\displaystyle g}
is defined as
(
f
◻
g
)
(
x
)
=
inf
{
f
(
x
−
y
)
+
g
(
y
)
∣
y
∈
R
n
}
.
{\displaystyle \left(f\operatorname {\Box } g\right)(x )=\inf \left\{f(x-y)+g(y )\mid y\in \mathbb {R} ^{n}\right\}.}
Let
f
1
,
…
,
f
m
{\displaystyle f_{1},\ldots ,f_{m}}
be proper , convex and lower semicontinuous functions on
R
n
.
{\displaystyle \mathbb {R} ^{n}.}
Then the infimal convolution is convex and lower semicontinuous (but not necessarily proper),[2 ] and satisfies
(
f
1
◻
⋯
◻
f
m
)
∗
=
f
1
∗
+
⋯
+
f
m
∗
.
{\displaystyle \left(f_{1}\operatorname {\Box } \cdots \operatorname {\Box } f_{m}\right)^{*}=f_{1}^{*}+\cdots +f_{m}^{*}.}
The infimal convolution of two functions has a geometric interpretation: The (strict) epigraph of the infimal convolution of two functions is the Minkowski sum of the (strict) epigraphs of those functions.[3 ]
Maximizing argument
edit
If the function
f
{\displaystyle f}
is differentiable, then its derivative is the maximizing argument in the computation of the convex conjugate:
f
′
(
x
)
=
x
∗
(
x
)
:=
arg
sup
x
∗
⟨
x
,
x
∗
⟩
−
f
∗
(
x
∗
)
{\displaystyle f^{\prime }(x )=x^{*}(x ):=\arg \sup _{x^{*}}{\langle x,x^{*}\rangle }-f^{*}\left(x^{*}\right)}
and
f
∗
′
(
x
∗
)
=
x
(
x
∗
)
:=
arg
sup
x
⟨
x
,
x
∗
⟩
−
f
(
x
)
;
{\displaystyle f^{{*}\prime }\left(x^{*}\right)=x\left(x^{*}\right):=\arg \sup _{x}{\langle x,x^{*}\rangle }-f(x );}
hence
x
=
∇
f
∗
(
∇
f
(
x
)
)
,
{\displaystyle x=\nabla f^{*}\left(\nabla f(x )\right),}
x
∗
=
∇
f
(
∇
f
∗
(
x
∗
)
)
,
{\displaystyle x^{*}=\nabla f\left(\nabla f^{*}\left(x^{*}\right)\right),}
and moreover
f
′
′
(
x
)
⋅
f
∗
′
′
(
x
∗
(
x
)
)
=
1
,
{\displaystyle f^{\prime \prime }(x )\cdot f^{{*}\prime \prime }\left(x^{*}(x )\right)=1,}
f
∗
′
′
(
x
∗
)
⋅
f
′
′
(
x
(
x
∗
)
)
=
1.
{\displaystyle f^{{*}\prime \prime }\left(x^{*}\right)\cdot f^{\prime \prime }\left(x(x^{*})\right)=1.}
Scaling properties
edit
If for some
γ
>
0
,
{\displaystyle \gamma >0,}
g
(
x
)
=
α
+
β
x
+
γ
⋅
f
(
λ
x
+
δ
)
{\displaystyle g(x )=\alpha +\beta x+\gamma \cdot f\left(\lambda x+\delta \right)}
, then
g
∗
(
x
∗
)
=
−
α
−
δ
x
∗
−
β
λ
+
γ
⋅
f
∗
(
x
∗
−
β
λ
γ
)
.
{\displaystyle g^{*}\left(x^{*}\right)=-\alpha -\delta {\frac {x^{*}-\beta }{\lambda }}+\gamma \cdot f^{*}\left({\frac {x^{*}-\beta }{\lambda \gamma }}\right).}
edit
Let
A
:
X
→
Y
{\displaystyle A:X\to Y}
be a bounded linear operator . For any convex function
f
{\displaystyle f}
on
X
,
{\displaystyle X,}
(
A
f
)
∗
=
f
∗
A
∗
{\displaystyle \left(Af\right)^{*}=f^{*}A^{*}}
where
(
A
f
)
(
y
)
=
inf
{
f
(
x
)
:
x
∈
X
,
A
x
=
y
}
{\displaystyle (Af )(y )=\inf\{f(x ):x\in X,Ax=y\}}
is the preimage of
f
{\displaystyle f}
with respect to
A
{\displaystyle A}
and
A
∗
{\displaystyle A^{*}}
is the adjoint operator of
A
.
{\displaystyle A.}
[4 ]
A closed convex function
f
{\displaystyle f}
is symmetric with respect to a given set
G
{\displaystyle G}
of orthogonal linear transformations ,
f
(
A
x
)
=
f
(
x
)
{\displaystyle f(Ax )=f(x )}
for all
x
{\displaystyle x}
and all
A
∈
G
{\displaystyle A\in G}
if and only if its convex conjugate
f
∗
{\displaystyle f^{*}}
is symmetric with respect to
G
.
{\displaystyle G.}