Pinsker's inequality






 


In information theory, Pinsker's inequality, named after its inventor Mark Semenovich Pinsker, is an inequality that bounds the total variation distance (or statistical distance) in terms of the Kullback–Leibler divergence. The inequality is tight up to constant factors.[1]

Formal statement

Pinsker's inequality states that, if $P$ and $Q$ are two probability distributions on a measurable space $(X, \Sigma)$, then

$$\delta(P, Q) \le \sqrt{\tfrac{1}{2} D_{\mathrm{KL}}(P \parallel Q)},$$

where

$$\delta(P, Q) = \sup\bigl\{\, |P(A) - Q(A)| : A \in \Sigma \,\bigr\}$$

is the total variation distance (or statistical distance) between $P$ and $Q$ and

$$D_{\mathrm{KL}}(P \parallel Q) = \operatorname{E}_P\!\left(\log \frac{\mathrm{d}P}{\mathrm{d}Q}\right) = \int_X \left(\log \frac{\mathrm{d}P}{\mathrm{d}Q}\right) \mathrm{d}P$$

is the Kullback–Leibler divergence in nats. When the sample space $X$ is a finite set, the Kullback–Leibler divergence is given by

$$D_{\mathrm{KL}}(P \parallel Q) = \sum_{i \in X} P(i) \log\!\left(\frac{P(i)}{Q(i)}\right).$$

Note that in terms of the total variation norm $\|P - Q\|$ of the signed measure $P - Q$, Pinsker's inequality differs from the one given above by a factor of two:

$$\|P - Q\| \le \sqrt{2 D_{\mathrm{KL}}(P \parallel Q)}.$$

A proof of Pinsker's inequality uses the partition inequality for f-divergences.
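
As a concrete numerical illustration (not part of the original statement), the following minimal Python sketch, assuming NumPy is available and using two arbitrarily chosen distributions on a three-point space, computes $\delta(P, Q)$ and $D_{\mathrm{KL}}(P \parallel Q)$ in nats and checks the bound:

    import numpy as np

    # Two arbitrary example distributions on a three-point sample space.
    p = np.array([0.5, 0.3, 0.2])
    q = np.array([0.4, 0.4, 0.2])

    # Total variation distance: on a finite space, sup_A |P(A) - Q(A)| = (1/2) sum_i |p_i - q_i|.
    tv = 0.5 * np.abs(p - q).sum()

    # Kullback-Leibler divergence in nats: sum_i p_i * ln(p_i / q_i).
    kl = np.sum(p * np.log(p / q))

    print("delta(P, Q)     =", tv)               # 0.1
    print("sqrt(D_KL / 2)  =", np.sqrt(kl / 2))  # about 0.112
    assert tv <= np.sqrt(kl / 2)                 # Pinsker's inequality

With these particular numbers the left-hand side is 0.1 and the right-hand side is roughly 0.112, so the bound holds with little slack here.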

Alternative version

Note that the expression of Pinsker's inequality depends on which base of logarithm is used in the definition of the KL divergence. $D_{\mathrm{KL}}(P \parallel Q)$ is defined using $\ln$ (logarithm in base $e$), whereas $D(P \parallel Q)$ is typically defined with $\log_2$ (logarithm in base 2). Then

$$D_{\mathrm{KL}}(P \parallel Q) = \ln(2)\, D(P \parallel Q).$$

Given the above comments, there is an alternative statement of Pinsker's inequality in some literature that relates information divergence to variation distance:

$$D(P \parallel Q) \ge \frac{1}{2 \ln 2}\, V^2(p, q),$$

i.e.,

$$\sqrt{\frac{\ln 2}{2}\, D(P \parallel Q)} \ge \frac{V(p, q)}{2},$$

in which

$$V(p, q) = \sum_{x \in \mathcal{X}} |p(x) - q(x)|$$

is the (non-normalized) variation distance between two probability density functions $p$ and $q$ on the same alphabet $\mathcal{X}$.[2]

This form of Pinsker's inequality shows that "convergence in divergence" is a stronger notion than "convergence in variation distance".
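
As a small sanity check of the base-2 form above (again an illustrative sketch with arbitrarily chosen distributions, not an example from the literature), one can verify numerically that $D_{\mathrm{KL}}(P \parallel Q) = \ln(2)\, D(P \parallel Q)$ and that $D(P \parallel Q) \ge V^2(p, q) / (2 \ln 2)$:

    import numpy as np

    # Arbitrary example distributions on a common alphabet.
    p = np.array([0.6, 0.3, 0.1])
    q = np.array([0.3, 0.3, 0.4])

    d_nats = np.sum(p * np.log(p / q))    # D_KL(P || Q), natural logarithm
    d_bits = np.sum(p * np.log2(p / q))   # D(P || Q), base-2 logarithm
    v = np.abs(p - q).sum()               # non-normalized variation distance V(p, q)

    assert np.isclose(d_nats, np.log(2) * d_bits)  # change of logarithm base
    assert d_bits >= v**2 / (2 * np.log(2))        # alternative form of Pinsker's inequality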

A simple proof by John Pollard is shown by letting $r(x) := \frac{p(x)}{q(x)} - 1 \ge -1$:

$$\begin{aligned}
D_{\mathrm{KL}}(P \parallel Q) &= \operatorname{E}_Q\bigl[(1 + r(x)) \ln(1 + r(x)) - r(x)\bigr] \\
&\ge \operatorname{E}_Q\!\left[\frac{r(x)^2}{2 + \tfrac{2}{3} r(x)}\right] \\
&\ge \frac{\operatorname{E}_Q\bigl[|r(x)|\bigr]^2}{\operatorname{E}_Q\!\left[2 + \tfrac{2}{3} r(x)\right]} && \text{(Titu's lemma)} \\
&= \frac{\operatorname{E}_Q\bigl[|r(x)|\bigr]^2}{2} && \bigl(\text{since } \operatorname{E}_Q[r(x)] = 0\bigr) \\
&= \frac{V(p, q)^2}{2}.
\end{aligned}$$

Here Titu's lemma, $\sum_i \frac{a_i^2}{b_i} \ge \frac{\bigl(\sum_i a_i\bigr)^2}{\sum_i b_i}$ for $b_i > 0$, is also known as Sedrakyan's inequality.
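
The first inequality in the chain rests on the pointwise bound $(1 + r)\ln(1 + r) - r \ge \frac{r^2}{2 + \frac{2}{3} r}$ for $r > -1$. The grid check below is only an illustration of that bound (my own sketch, assuming NumPy), not part of the published argument:

    import numpy as np

    # Check the pointwise bound (1 + r) ln(1 + r) - r >= r^2 / (2 + 2r/3) on a grid of r > -1.
    r = np.linspace(-0.999, 10.0, 100_000)
    lhs = (1 + r) * np.log1p(r) - r
    rhs = r**2 / (2 + 2 * r / 3)
    assert np.all(lhs >= rhs - 1e-12)  # holds everywhere on the grid, up to floating-point tolerance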

Note that the lower bound from Pinsker's inequality is vacuous for any distributions where $D_{\mathrm{KL}}(P \parallel Q) > 2$, since the total variation distance is at most $1$. For such distributions, an alternative bound can be used, due to Bretagnolle and Huber[3] (see also Tsybakov[4]):

$$\delta(P, Q) \le \sqrt{1 - e^{-D_{\mathrm{KL}}(P \parallel Q)}}.$$
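
To see where each bound is informative, the short sketch below (illustrative divergence values only) compares Pinsker's bound $\sqrt{D/2}$ with the Bretagnolle–Huber bound $\sqrt{1 - e^{-D}}$ as $D = D_{\mathrm{KL}}(P \parallel Q)$ grows:

    import numpy as np

    # Compare the two upper bounds on total variation as the divergence D = D_KL(P || Q) (in nats) grows.
    for d in [0.1, 0.5, 1.0, 2.0, 5.0]:
        pinsker = np.sqrt(d / 2)                      # exceeds 1 (and is thus vacuous) once d > 2
        bretagnolle_huber = np.sqrt(1 - np.exp(-d))   # always strictly below 1
        print(f"D = {d:3.1f}: Pinsker {pinsker:.3f}, Bretagnolle-Huber {bretagnolle_huber:.3f}")

Pinsker's bound is tighter for small divergences, while the Bretagnolle–Huber bound takes over for large ones and, unlike Pinsker's, never becomes vacuous.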

History

Pinsker first proved the inequality with a greater constant. The inequality in the above form was proved independently by Kullback, Csiszár, and Kemperman.[5]

Inverse problem

A precise inverse of the inequality cannot hold: for every $\varepsilon > 0$, there are distributions $P_\varepsilon, Q$ with $\delta(P_\varepsilon, Q) \le \varepsilon$ but $D_{\mathrm{KL}}(P_\varepsilon \parallel Q) = \infty$. An easy example is given by the two-point space $\{0, 1\}$ with $Q(0) = 0$ and $P_\varepsilon(0) = \varepsilon$.[6]

However, an inverse inequality holds on finite spaces $X$ with a constant depending on $Q$.[7] More specifically, it can be shown that, with the definition $\alpha_Q := \min_{x \in X : Q(x) > 0} Q(x)$, we have for any measure $P$ which is absolutely continuous with respect to $Q$

$$\frac{1}{2} D_{\mathrm{KL}}(P \parallel Q) \le \frac{1}{\alpha_Q}\, \delta(P, Q)^2.$$

As a consequence, if $Q$ has full support (i.e. $Q(x) > 0$ for all $x \in X$), then

$$\delta(P, Q)^2 \le \frac{1}{2} D_{\mathrm{KL}}(P \parallel Q) \le \frac{1}{\alpha_Q}\, \delta(P, Q)^2.$$
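
A minimal sketch checking the two-sided relation above on a finite space where $Q$ has full support (the distributions are arbitrary choices for illustration, not taken from the cited reference):

    import numpy as np

    # Finite space; Q has full support, P is absolutely continuous with respect to Q.
    p = np.array([0.7, 0.2, 0.1])
    q = np.array([0.25, 0.25, 0.5])

    alpha_q = q[q > 0].min()            # alpha_Q = minimum of Q over its support
    tv = 0.5 * np.abs(p - q).sum()      # delta(P, Q)
    kl = np.sum(p * np.log(p / q))      # D_KL(P || Q) in nats

    # delta(P,Q)^2 <= (1/2) D_KL(P || Q) <= (1/alpha_Q) delta(P,Q)^2
    assert tv**2 <= 0.5 * kl <= (1 / alpha_q) * tv**2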

References

  1. Csiszár, Imre; Körner, János (2011). Information Theory: Coding Theorems for Discrete Memoryless Systems. Cambridge University Press. p. 44. ISBN 9781139499989.
  2. Yeung, Raymond W. (2008). Information Theory and Network Coding. Hong Kong: Springer. p. 26. ISBN 978-0-387-79233-0.
  3. Bretagnolle, J.; Huber, C. (1978). "Estimation des densités: risque minimax" [Density estimation: minimax risk]. Séminaire de Probabilités XII (Univ. Strasbourg, 1976/1977). Lecture Notes in Mathematics, vol. 649. Berlin: Springer. pp. 342–363. Lemma 2.1 (in French).
  4. Tsybakov, Alexandre B. (2009). Introduction to Nonparametric Estimation. Revised and extended from the 2004 French original, translated by Vladimir Zaiats. Springer Series in Statistics. New York: Springer. xii+214 pp. ISBN 978-0-387-79051-0. Equation 2.25.
  5. Tsybakov, Alexandre B. (2009). Introduction to Nonparametric Estimation. Springer. p. 132. ISBN 9780387790527.
  6. The divergence becomes infinite whenever one of the two distributions assigns probability zero to an event while the other assigns it a nonzero probability (no matter how small); see e.g. Basu, Mitra; Ho, Tin Kam (2006). Data Complexity in Pattern Recognition. Springer. p. 161. ISBN 9781846281723.
  7. See Lemma 4.1 in Götze, Friedrich; Sambale, Holger; Sinulis, Arthur (2019). "Higher order concentration for functions of weakly dependent random variables". Electronic Journal of Probability. 24. arXiv:1801.06348. doi:10.1214/19-EJP338. S2CID 52200238.

