Empirical measure
From Wikipedia, the free encyclopedia
The motivation for studying empirical measures is that it is often impossible to know the true underlying probability measure $P$. We collect observations $X_1, X_2, \dots, X_n$ and compute relative frequencies. We can estimate $P$, or a related distribution function $F$, by means of the empirical measure or the empirical distribution function, respectively. These are uniformly good estimates under certain conditions. Theorems in the area of empirical processes provide rates of this convergence.
Definition
Let $X_1, X_2, \dots$ be a sequence of independent identically distributed random variables with values in the state space $S$ with probability distribution $P$.
Definition

The empirical measure $P_n$ is defined for measurable subsets of $S$ and given by
$$P_n(A) = \frac{1}{n} \sum_{i=1}^{n} I_A(X_i) = \frac{1}{n} \sum_{i=1}^{n} \delta_{X_i}(A)$$
where $I_A$ is the indicator function and $\delta_X$ is the Dirac measure.
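As a concrete sketch, the definition above amounts to counting the fraction of samples that land in $A$. The following Python snippet is illustrative only; NumPy, the uniform distribution, and the choice $A = [0, 0.3]$ are assumptions, not part of the article:

```python
import numpy as np

def empirical_measure(samples, indicator):
    """P_n(A) = (1/n) * sum of I_A(X_i): the fraction of samples lying in A."""
    return float(np.mean([indicator(x) for x in samples]))

# Illustrative example: X_i ~ Uniform(0, 1) and A = [0, 0.3], so P(A) = 0.3.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=10_000)
p_n = empirical_measure(x, lambda t: 0.0 <= t <= 0.3)
```

With $n = 10{,}000$ samples, $P_n(A)$ is close to $P(A) = 0.3$.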
Properties

For a fixed measurable set $A$, $nP_n(A)$ is a binomial random variable with mean $nP(A)$ and variance $nP(A)(1 - P(A))$.

For a fixed partition $A_i$ of $S$, the random variables $Y_i = nP_n(A_i)$ form a multinomial distribution with event probabilities $P(A_i)$. The covariance matrix of this multinomial distribution is
$$\operatorname{Cov}(Y_i, Y_j) = nP(A_i)(\delta_{ij} - P(A_j)).$$
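A quick numerical check of the multinomial property is possible by binning samples over a partition. The partition, sample size, and uniform distribution below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000
x = rng.uniform(0.0, 1.0, size=n)

# Partition of S = [0, 1) into A_1 = [0, 0.2), A_2 = [0.2, 0.7), A_3 = [0.7, 1).
edges = [0.0, 0.2, 0.7, 1.0]
counts = np.histogram(x, bins=edges)[0]  # Y_i = n * P_n(A_i)
p = np.diff(edges)                       # P(A_i) = (0.2, 0.5, 0.3)

# Each Y_i is Binomial(n, P(A_i)), so counts / n should be close to p.
deviations = np.abs(counts / n - p)
```

The counts always sum to $n$ (the partition is exhaustive), and each relative frequency is close to its event probability.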
Definition

$\bigl(P_n(c)\bigr)_{c \in \mathcal{C}}$ is the empirical measure indexed by $\mathcal{C}$, a collection of measurable subsets of $S$.
To generalize this notion further, observe that the empirical measure $P_n$ maps measurable functions $f : S \to \mathbb{R}$ to their empirical mean,
$$f \mapsto P_n f = \int_S f \, dP_n = \frac{1}{n} \sum_{i=1}^{n} f(X_i).$$
In particular, the empirical measure of $A$ is simply the empirical mean of the indicator function, $P_n(A) = P_n I_A$.
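The mapping $f \mapsto P_n f$ is just the sample mean of $f(X_i)$, and taking $f = I_A$ recovers $P_n(A)$. A minimal sketch, where the distribution and the test functions are chosen purely for illustration:

```python
import numpy as np

def empirical_mean(samples, f):
    """P_n f = integral of f dP_n = (1/n) * sum of f(X_i)."""
    return float(np.mean(f(np.asarray(samples))))

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=50_000)

# For X ~ Uniform(0, 1) and f(t) = t^2, the true mean is E f = 1/3.
pn_f = empirical_mean(x, lambda t: t ** 2)

# The empirical measure of A = [0, 0.5] is the empirical mean of I_A.
pn_A = empirical_mean(x, lambda t: (t <= 0.5).astype(float))
```

Both quantities land close to their population values, $1/3$ and $1/2$ respectively.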
For a fixed measurable function $f$, $P_n f$ is a random variable with mean $\mathbb{E}f$ and variance $\frac{1}{n} \mathbb{E}(f - \mathbb{E}f)^2$.
By the strong law of large numbers, $P_n(A)$ converges to $P(A)$ almost surely for fixed $A$. Similarly, $P_n f$ converges to $\mathbb{E}f$ almost surely for a fixed measurable function $f$. The problem of uniform convergence of $P_n$ to $P$ was open until Vapnik and Chervonenkis solved it in 1968.[1]
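The almost-sure convergence for a fixed set can be seen numerically by letting $n$ grow along a single sample path; the setup below (uniform samples, $A = [0, 0.5]$) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0.0, 1.0, size=100_000)

# Running values of P_n(A) for A = [0, 0.5] along one sample path.
inside = (x <= 0.5).astype(float)
running = np.cumsum(inside) / np.arange(1, len(x) + 1)

# |P_n(A) - P(A)| at a few increasing values of n.
errors = {n: abs(running[n - 1] - 0.5) for n in (100, 10_000, 100_000)}
```

For this path the error at $n = 100{,}000$ is tiny, consistent with $P_n(A) \to P(A)$.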
If the class $\mathcal{C}$ (or $\mathcal{F}$) is Glivenko–Cantelli with respect to $P$, then $P_n$ converges to $P$ uniformly over $c \in \mathcal{C}$ (or $f \in \mathcal{F}$). In other words, with probability 1 we have
$$\|P_n - P\|_{\mathcal{C}} = \sup_{c \in \mathcal{C}} |P_n(c) - P(c)| \to 0,$$
$$\|P_n - P\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |P_n f - \mathbb{E}f| \to 0.$$
Empirical distribution function
The empirical distribution function provides an example of empirical measures. For real-valued i.i.d. random variables $X_1, \dots, X_n$ it is given by
$$F_n(x) = P_n((-\infty, x]) = P_n I_{(-\infty, x]}.$$
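The empirical distribution function can be sketched directly from sorted samples: $F_n(x)$ is the fraction of sample points at most $x$. Python/NumPy, the uniform example, and the evaluation grid are assumptions for illustration:

```python
import numpy as np

def edf(samples):
    """Return F_n as a callable: F_n(x) = P_n((-inf, x])."""
    xs = np.sort(np.asarray(samples))
    n = len(xs)
    def F_n(x):
        # Count of sample points <= x (vectorized), divided by n.
        return np.searchsorted(xs, x, side="right") / n
    return F_n

rng = np.random.default_rng(3)
samples = rng.uniform(0.0, 1.0, size=10_000)
F_n = edf(samples)

# For Uniform(0, 1) the true CDF is F(x) = x; approximate the sup distance
# sup_x |F_n(x) - F(x)| on a fine grid (a grid approximation, not the exact sup).
grid = np.linspace(0.0, 1.0, 1001)
ks = float(np.max(np.abs(F_n(grid) - grid)))
```

The grid-approximated Kolmogorov–Smirnov distance is small for $n = 10{,}000$, in line with the uniform convergence stated below.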
In this case, empirical measures are indexed by the class $\mathcal{C} = \{(-\infty, x] : x \in \mathbb{R}\}$. It has been shown that $\mathcal{C}$ is a uniform Glivenko–Cantelli class; in particular,
$$\sup_F \|F_n - F\|_{\infty} \to 0$$
with probability 1.
References
^ Vapnik, V.; Chervonenkis, A. (1968). "Uniform convergence of frequencies of occurrence of events to their probabilities". Dokl. Akad. Nauk SSSR. 181.
Further reading [ edit ]
Retrieved from "https://en.wikipedia.org/w/index.php?title=Empirical_measure&oldid=1204999024"
This page was last edited on 8 February 2024, at 15:56 (UTC).