
Generalized Pareto distribution

From Wikipedia, the free encyclopedia


Generalized Pareto distribution
[Figure: probability density function. GPD distribution functions for $\mu =0$ and different values of $\sigma$ and $\xi$]
[Figure: cumulative distribution function]
Parameters	$\mu \in (-\infty ,\infty )\,$ location (real) $\sigma \in (0,\infty )\,$ scale (real) $\xi \in (-\infty ,\infty )\,$ shape (real)
Support	$x\geqslant \mu \,\;(\xi \geqslant 0)$ $\mu \leqslant x\leqslant \mu -\sigma /\xi \,\;(\xi <0)$
PDF	${\frac {1}{\sigma }}(1+\xi z)^{-(1/\xi +1)}$ where $z={\frac {x-\mu }{\sigma }}$
CDF	$1-(1+\xi z)^{-1/\xi }\,$
Mean	${\displaystyle \mu +{\frac {\sigma }{1-\xi }}\,\;(\xi <1)}$
Median	$\mu +{\frac {\sigma (2^{\xi }-1)}{\xi }}$
Mode	$\mu$
Variance	${\frac {\sigma ^{2}}{(1-\xi )^{2}(1-2\xi )}}\,\;(\xi <1/2)$
Skewness	${\frac {2(1+\xi ){\sqrt {1-2\xi }}}{(1-3\xi )}}\,\;(\xi <1/3)$
Excess kurtosis	${\frac {3(1-2\xi )(2\xi ^{2}+\xi +3)}{(1-3\xi )(1-4\xi )}}-3\,\;(\xi <1/4)$
Entropy	$\log(\sigma )+\xi +1$
MGF	${\displaystyle e^{\theta \mu }\,\sum _{j=0}^{\infty }\left[{\frac {(\theta \sigma )^{j}}{\prod _{k=0}^{j}(1-k\xi )}}\right],\;(k\xi <1)}$
CF	${\displaystyle e^{it\mu }\,\sum _{j=0}^{\infty }\left[{\frac {(it\sigma )^{j}}{\prod _{k=0}^{j}(1-k\xi )}}\right],\;(k\xi <1)}$
Method of moments	${\displaystyle \xi ={\frac {1}{2}}\left(1-{\frac {(E[X]-\mu )^{2}}{V[X]}}\right)}$ ${\displaystyle \sigma =(E[X]-\mu )(1-\xi )}$
Expected shortfall	${\begin{cases}\mu +\sigma \left[{\frac {(1-p)^{-\xi }}{1-\xi }}+{\frac {(1-p)^{-\xi }-1}{\xi }}\right]&,\xi \neq 0\\\mu +\sigma [1-\ln(1-p)]&,\xi =0\end{cases}}$ ^[1]

In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location $\mu$, scale $\sigma$, and shape $\xi$.^[2]^[3] Sometimes it is specified by only scale and shape^[4] and sometimes only by its shape parameter. Some references give the shape parameter as $\kappa =-\xi \,$.^[5]

Definition


The standard cumulative distribution function (cdf) of the GPD is defined by^[6]

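{\displaystyle F_{\xi }(z)={\begin{cases}1-\left(1+\xi z\right)^{-1/\xi }&{\text{for }}\xi \neq 0,\\1-e^{-z}&{\text{for }}\xi =0,\end{cases}}}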

where the support is $z\geq 0$ for $\xi \geq 0$ and $0\leq z\leq -1/\xi$ for $\xi <0$ . The corresponding probability density function (pdf) is

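{\displaystyle f_{\xi }(z)={\begin{cases}(1+\xi z)^{-{\frac {\xi +1}{\xi }}}&{\text{for }}\xi \neq 0,\\e^{-z}&{\text{for }}\xi =0.\end{cases}}}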

Characterization


The related location-scale family of distributions is obtained by replacing the argument $z$ by ${\frac {x-\mu }{\sigma }}$ and adjusting the support accordingly.

The cumulative distribution function of $X\sim GPD(\mu ,\sigma ,\xi )$ ( $\mu \in \mathbb {R}$ , $\sigma >0$ , and $\xi \in \mathbb {R}$ ) is

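{\displaystyle F_{(\mu ,\sigma ,\xi )}(x)={\begin{cases}1-\left(1+{\frac {\xi (x-\mu )}{\sigma }}\right)^{-1/\xi }&{\text{for }}\xi \neq 0,\\1-\exp \left(-{\frac {x-\mu }{\sigma }}\right)&{\text{for }}\xi =0,\end{cases}}}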

where the support of $X$ is $x\geqslant \mu$ when $\xi \geqslant 0\,$ , and $\mu \leqslant x\leqslant \mu -\sigma /\xi$ when $\xi <0$ .

The probability density function (pdf) of $X\sim GPD(\mu ,\sigma ,\xi )$ is

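{\displaystyle f_{(\mu ,\sigma ,\xi )}(x)={\begin{cases}{\frac {1}{\sigma }}\left(1+{\frac {\xi (x-\mu )}{\sigma }}\right)^{-\left({\frac {1}{\xi }}+1\right)}&{\text{for }}\xi \neq 0,\\{\frac {1}{\sigma }}\exp \left(-{\frac {x-\mu }{\sigma }}\right)&{\text{for }}\xi =0,\end{cases}}}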

again, for $x\geqslant \mu$ when $\xi \geqslant 0$ , and $\mu \leqslant x\leqslant \mu -\sigma /\xi$ when $\xi <0$ .
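The cdf and pdf of the location-scale family can be evaluated directly from the definitions. The following is a minimal sketch, assuming the three-parameter form $GPD(\mu ,\sigma ,\xi )$ described in this section; the function names `gpd_cdf` and `gpd_pdf` are illustrative and do not come from any library. The case $\xi =0$ is handled separately as the exponential limit, and points outside the support are mapped to 0 or 1 as appropriate.

```python
import math


def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of GPD(mu, sigma, xi): 0 below the support, 1 above it."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0:
        # Only reachable for xi < 0: x is at or past the endpoint mu - sigma/xi.
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


def gpd_pdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """PDF of GPD(mu, sigma, xi): 0 outside the support."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z < 0:
        return 0.0
    if xi == 0:
        return math.exp(-z) / sigma
    t = 1.0 + xi * z
    if t <= 0:
        # Convention of this sketch: the density is set to 0 at and
        # beyond the upper endpoint when xi < 0.
        return 0.0
    return t ** (-1.0 / xi - 1.0) / sigma


# Usage: the cdf evaluated at the median mu + sigma*(2**xi - 1)/xi is about 0.5.
mu, sigma, xi = 1.0, 2.0, 0.3
median = mu + sigma * (2.0 ** xi - 1.0) / xi
print(gpd_cdf(median, mu, sigma, xi))
```

Treating $\xi =0$ as its own branch avoids dividing by zero; a production implementation would likely also switch to the exponential form for very small $|\xi |$ to limit rounding error.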

The pdf is a solution of the following differential equation: ^{[citation needed]}

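{\displaystyle \left\{{\begin{array}{l}f'(x)(-\mu \xi +\sigma +\xi x)+(\xi +1)f(x)=0,\\f(0)={\frac {\left(1-{\frac {\mu \xi }{\sigma }}\right)^{-{\frac {1}{\xi }}-1}}{\sigma }}\end{array}}\right\}}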

Special cases


If the shape $\xi$ and location $\mu$ are both zero, the GPD is equivalent to the exponential distribution.
With shape $\xi =-1$ and location $\mu =0$, the GPD is equivalent to the continuous uniform distribution $U(0,\sigma )$.^[7]
With shape $\xi >0$ and location $\mu =\sigma /\xi$ , the GPD is equivalent to the Pareto distribution with scale $x_{m}=\sigma /\xi$ and shape $\alpha =1/\xi$ .
If $X\sim GPD(\mu =0,\sigma ,\xi )$, then $Y=\log(X)\sim exGPD(\sigma ,\xi )$ [1]. (exGPD stands for the exponentiated generalized Pareto distribution.)
GPD is similar to the Burr distribution.
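The first three special cases can be checked numerically by comparing cdfs on a grid. The sketch below is illustrative only: `gpd_cdf` and `max_abs_diff` are hypothetical helper names, the parameter values are arbitrary, and the uniform case takes the location to be $\mu =0$.

```python
import math


def gpd_cdf(x, mu, sigma, xi):
    """GPD cdf: 0 below the support, 1 above it."""
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-z)
    t = 1.0 + xi * z
    return 1.0 if t <= 0 else 1.0 - t ** (-1.0 / xi)


def max_abs_diff(f, g, points):
    """Largest absolute difference between two functions on a grid."""
    return max(abs(f(x) - g(x)) for x in points)


sigma, xi = 2.0, 0.5
grid = [0.1 * i for i in range(1, 200)]

# xi = 0 and mu = 0: exponential distribution with mean sigma.
exp_gap = max_abs_diff(
    lambda x: gpd_cdf(x, 0.0, sigma, 0.0),
    lambda x: 1.0 - math.exp(-x / sigma),
    grid,
)

# xi = -1 and mu = 0: continuous uniform distribution on (0, sigma).
unif_gap = max_abs_diff(
    lambda x: gpd_cdf(x, 0.0, sigma, -1.0),
    lambda x: min(x / sigma, 1.0),
    grid,
)

# xi > 0 and mu = sigma/xi: Pareto with x_m = sigma/xi and alpha = 1/xi.
x_m, alpha = sigma / xi, 1.0 / xi
pareto_gap = max_abs_diff(
    lambda x: gpd_cdf(x, x_m, sigma, xi),
    lambda x: 0.0 if x < x_m else 1.0 - (x_m / x) ** alpha,
    grid,
)

print(exp_gap, unif_gap, pareto_gap)  # all three are at rounding-error level
```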

Generating generalized Pareto random variables


Generating GPD random variables


If $U$ is uniformly distributed on (0, 1], then

X=\mu +{\frac {\sigma (U^{-\xi }-1)}{\xi }}\sim GPD(\mu ,\sigma ,\xi \neq 0)

and

{\displaystyle X=\mu -\sigma \ln(U)\sim GPD(\mu ,\sigma ,\xi =0).}

Both formulas are obtained by inversion of the cdf.

In the MATLAB Statistics Toolbox, the "gprnd" command generates generalized Pareto random numbers.
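The inversion method can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: `gpd_sample` is a hypothetical name, and the seed and parameter values are arbitrary. Because `random.random()` returns values in [0, 1), the code uses `1 - random.random()` so that $U$ lies in (0, 1] as required.

```python
import math
import random


def gpd_sample(mu, sigma, xi, rng=random):
    """Draw one GPD(mu, sigma, xi) variate by inverting the cdf."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    u = 1.0 - rng.random()  # rng.random() is in [0, 1), so u is in (0, 1]
    if xi == 0:
        return mu - sigma * math.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi


# Usage: compare the sample median with mu + sigma*(2**xi - 1)/xi.
rng = random.Random(12345)
draws = sorted(gpd_sample(0.0, 1.0, 0.25, rng) for _ in range(100_000))
sample_median = draws[len(draws) // 2]
theoretical_median = (2.0 ** 0.25 - 1.0) / 0.25
print(sample_median, theoretical_median)
```

The median is used for the comparison because it stays well behaved for any shape $\xi$, whereas the sample mean is only meaningful when $\xi <1$.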

GPD as an Exponential-Gamma Mixture


A GPD random variable can also be expressed as an exponential random variable with a gamma-distributed rate parameter. With $\alpha$ denoting the shape and $\beta$ the rate of the gamma distribution, if

X|\Lambda \sim \operatorname {Exp} (\Lambda )

and

\Lambda \sim \operatorname {Gamma} (\alpha ,\beta )

then

X\sim \operatorname {GPD} (\xi =1/\alpha ,\ \sigma =\beta /\alpha )

Note, however, that because the parameters of the gamma distribution must be greater than zero, this representation imposes the additional restriction that $\xi$ must be positive.

In addition to this mixture (or compound) expression, the generalized Pareto distribution can also be expressed as a simple ratio. Concretely, for $Y\sim {\text{Exponential}}(1)$ and $Z\sim {\text{Gamma}}(1/\xi ,1)$, we have $\mu +\sigma {\frac {Y}{\xi Z}}\sim {\text{GPD}}(\mu ,\sigma ,\xi )$. This is a consequence of the mixture after setting $\beta =\alpha$ and taking into account that the rate parameters of the exponential and gamma distributions act simply as inverse multiplicative constants.
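Both representations can be checked by simulation. The sketch below assumes the gamma distribution is parameterized by shape $\alpha$ and rate $\beta$; the function names are hypothetical and the parameter values are arbitrary. Note that Python's `random.gammavariate` takes a shape and a scale, so the rate is passed as its reciprocal, while `random.expovariate` takes a rate.

```python
import random


def gpd_via_mixture(alpha, beta, rng):
    """X | Lambda ~ Exp(rate Lambda), Lambda ~ Gamma(shape alpha, rate beta)."""
    lam = rng.gammavariate(alpha, 1.0 / beta)  # second argument is a scale
    return rng.expovariate(lam)                # argument is a rate


def gpd_via_ratio(mu, sigma, xi, rng):
    """mu + sigma * Y / (xi * Z), Y ~ Exponential(1), Z ~ Gamma(1/xi, 1)."""
    y = rng.expovariate(1.0)
    z = rng.gammavariate(1.0 / xi, 1.0)
    return mu + sigma * y / (xi * z)


def gpd_survival(x, sigma, xi):
    """P(X > x) for GPD(mu = 0, sigma, xi) with xi > 0 and x >= 0."""
    return (1.0 + xi * x / sigma) ** (-1.0 / xi)


# alpha = 4 and beta = 2 correspond to xi = 1/alpha = 0.25, sigma = beta/alpha = 0.5.
alpha, beta = 4.0, 2.0
xi, sigma = 1.0 / alpha, beta / alpha
x0, n = 0.5, 100_000

rng = random.Random(2024)
mix_tail = sum(gpd_via_mixture(alpha, beta, rng) > x0 for _ in range(n)) / n
ratio_tail = sum(gpd_via_ratio(0.0, sigma, xi, rng) > x0 for _ in range(n)) / n
target = gpd_survival(x0, sigma, xi)

print(mix_tail, ratio_tail, target)  # the three values should be close
```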

Exponentiated generalized Pareto distribution


The exponentiated generalized Pareto distribution (exGPD)


[Figure: The pdf of the $exGPD(\sigma ,\xi )$ (exponentiated generalized Pareto distribution) for different values of $\sigma$ and $\xi$]

If $X\sim GPD(\mu =0,\sigma ,\xi )$, then $Y=\log(X)$ is distributed according to the exponentiated generalized Pareto distribution, denoted by $Y\sim exGPD(\sigma ,\xi )$.

The probability density function (pdf) of $Y\sim exGPD(\sigma ,\xi )\,\,(\sigma >0)$ is

{\displaystyle g_{(\sigma ,\xi )}(y)={\begin{cases}{\frac {e^{y}}{\sigma }}{\bigg (}1+{\frac {\xi e^{y}}{\sigma }}{\bigg )}^{-1/\xi -1}&{\text{for }}\xi \neq 0,\\{\frac {1}{\sigma }}e^{y-e^{y}/\sigma }&{\text{for }}\xi =0,\end{cases}}}

where the support is $-\infty <y<\infty$ for $\xi \geq 0$ , and $-\infty <y\leq \log(-\sigma /\xi )$ for $\xi <0$ .

For all $\xi$, $\log \sigma$ acts as the location parameter. See the accompanying figure for the pdf when the shape $\xi$ is positive.

The exGPD has finite moments of all orders for all $\sigma >0$ and $-\infty <\xi <\infty$ .

[Figure: The variance of the $exGPD(\sigma ,\xi )$ as a function of $\xi$. Note that the variance depends only on $\xi$. The red dotted line represents the variance evaluated at $\xi =0$, that is, $\psi '(1)=\pi ^{2}/6$.]

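The moment-generating function of $Y\sim exGPD(\sigma ,\xi )$ is

{\displaystyle M_{Y}(s)=E[e^{sY}]={\begin{cases}-{\frac {1}{\xi }}{\bigg (}-{\frac {\sigma }{\xi }}{\bigg )}^{s}B(s+1,-1/\xi )&{\text{for }}s\in (-1,\infty ),\ \xi <0,\\{\frac {1}{\xi }}{\bigg (}{\frac {\sigma }{\xi }}{\bigg )}^{s}B(s+1,1/\xi -s)&{\text{for }}s\in (-1,1/\xi ),\ \xi >0,\\\sigma ^{s}\Gamma (1+s)&{\text{for }}s\in (-1,\infty ),\ \xi =0,\end{cases}}}

where $B(a,b)$ and $\Gamma (a)$ denote the beta function and gamma function, respectively.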

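The expected value of $Y\sim exGPD(\sigma ,\xi )$ depends on both the scale $\sigma$ and the shape $\xi$; the shape $\xi$ enters through the digamma function:

{\displaystyle E[Y]={\begin{cases}\log \left(-{\frac {\sigma }{\xi }}\right)+\psi (1)-\psi (-1/\xi +1)&{\text{for }}\xi <0,\\\log \left({\frac {\sigma }{\xi }}\right)+\psi (1)-\psi (1/\xi )&{\text{for }}\xi >0,\\\log \sigma +\psi (1)&{\text{for }}\xi =0.\end{cases}}}

Note that for a fixed value of $\xi \in (-\infty ,\infty )$, $\log \sigma$ acts as the location parameter of the exponentiated generalized Pareto distribution.

The variance of $Y\sim exGPD(\sigma ,\xi )$ depends only on the shape parameter $\xi$, through the polygamma function of order 1 (also called the trigamma function):

{\displaystyle Var[Y]={\begin{cases}\psi '(1)-\psi '(-1/\xi +1)&{\text{for }}\xi <0,\\\psi '(1)+\psi '(1/\xi )&{\text{for }}\xi >0,\\\psi '(1)&{\text{for }}\xi =0.\end{cases}}}

See the accompanying figure for the variance as a function of $\xi$. Note that $\psi '(1)=\pi ^{2}/6\approx 1.644934$.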
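The mean and variance of the exGPD can be compared with a simulation of $Y=\log(X)$. The sketch below is an illustration under stated assumptions: it uses $E[Y]=\log(\sigma /\xi )+\psi (1)-\psi (1/\xi )$ and $Var[Y]=\psi '(1)+\psi '(1/\xi )$ for $\xi >0$, with the analogous expressions for $\xi <0$ and $\xi =0$. All function names are hypothetical, and the digamma and trigamma functions are approximated with a recurrence plus a truncated asymptotic series rather than taken from a library.

```python
import math
import random


def digamma(x):
    """psi(x) for x > 0: shift x upward, then use an asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series


def trigamma(x):
    """psi'(x) for x > 0: shift x upward, then use an asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc += 1.0 / (x * x)  # psi'(x) = psi'(x + 1) + 1/x**2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42))
    return acc + inv + 0.5 * inv2 + series


def exgpd_mean(sigma, xi):
    if xi < 0:
        return math.log(-sigma / xi) + digamma(1.0) - digamma(1.0 - 1.0 / xi)
    if xi > 0:
        return math.log(sigma / xi) + digamma(1.0) - digamma(1.0 / xi)
    return math.log(sigma) + digamma(1.0)


def exgpd_var(xi):
    # The scale sigma does not appear: it only shifts Y by log(sigma).
    if xi < 0:
        return trigamma(1.0) - trigamma(1.0 - 1.0 / xi)
    if xi > 0:
        return trigamma(1.0) + trigamma(1.0 / xi)
    return trigamma(1.0)


def exgpd_sample(sigma, xi, rng):
    """Y = log(X) with X ~ GPD(mu = 0, sigma, xi) drawn by inversion."""
    u = rng.random()
    while u == 0.0:  # keep u strictly inside (0, 1) so that X > 0
        u = rng.random()
    x = -sigma * math.log(u) if xi == 0 else sigma * (u ** (-xi) - 1.0) / xi
    return math.log(x)


sigma, xi, n = 2.0, 0.5, 100_000
rng = random.Random(7)
ys = [exgpd_sample(sigma, xi, rng) for _ in range(n)]
sample_mean = sum(ys) / n
sample_var = sum((y - sample_mean) ** 2 for y in ys) / n

print(sample_mean, exgpd_mean(sigma, xi))
print(sample_var, exgpd_var(xi))
```

Rerunning the simulation with a different `sigma` changes the sample mean by the difference in $\log \sigma$ but leaves the sample variance essentially unchanged, which is the separation of roles described in this section.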

Note that the roles of the scale parameter $\sigma$ and the shape parameter $\xi$ under $Y\sim exGPD(\sigma ,\xi )$ are separately interpretable, which may lead to more robust and efficient estimation of $\xi$ than working with $X\sim GPD(\sigma ,\xi )$ [2]. Under $X\sim GPD(\mu =0,\sigma ,\xi )$ the roles of the two parameters are intertwined (at least up to the second central moment); see the formula for the variance $Var(X)$, in which both parameters appear.

The Hill's estimator


Assume that $X_{1:n}=(X_{1},\cdots ,X_{n})$ are $n$ observations (not necessarily i.i.d.) from an unknown heavy-tailed distribution $F$ whose tail distribution is regularly varying with tail index $1/\xi$ (hence, the corresponding shape parameter is $\xi$). Specifically, the tail distribution is described as

{\displaystyle {\bar {F}}(x)=1-F(x)=L(x)\cdot x^{-1/\xi },\,\,\,\,\,{\text{for some }}\xi >0,\,\,{\text{where }}L{\text{ is a slowly varying function.}}}

It is of particular interest in extreme value theory to estimate the shape parameter $\xi$, especially when $\xi$ is positive (the so-called heavy-tailed case).

Let $F_{u}$ be their conditional excess distribution function over a threshold $u$. The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions $F$ and large $u$, $F_{u}$ is well approximated by the generalized Pareto distribution (GPD). This motivates Peak Over Threshold (POT) methods for estimating $\xi$, in which the GPD plays the key role.

A renowned estimator using the POT methodology is Hill's estimator, formulated as follows. For $1\leq i\leq n$, write $X_{(i)}$ for the $i$-th largest value of $X_{1},\cdots ,X_{n}$. With this notation, Hill's estimator (see page 190 of the reference by Embrechts et al. [3]) based on the $k$ upper order statistics is defined as

{\displaystyle {\widehat {\xi }}_{k}^{\text{Hill}}={\widehat {\xi }}_{k}^{\text{Hill}}(X_{1:n})={\frac {1}{k-1}}\sum _{j=1}^{k-1}\log {\bigg (}{\frac {X_{(j)}}{X_{(k)}}}{\bigg )},\,\,\,\,\,\,\,\,{\text{for }}2\leq k\leq n.}

In practice, the Hill estimator is used as follows. First, calculate the estimator ${\widehat {\xi }}_{k}^{\text{Hill}}$ at each integer $k\in \{2,\cdots ,n\}$, and plot the ordered pairs $\{(k,{\widehat {\xi }}_{k}^{\text{Hill}})\}_{k=2}^{n}$. Then, select from the set of Hill estimators $\{{\widehat {\xi }}_{k}^{\text{Hill}}\}_{k=2}^{n}$ those that are roughly constant with respect to $k$: these stable values are regarded as reasonable estimates of the shape parameter $\xi$. If $X_{1},\cdots ,X_{n}$ are i.i.d., then Hill's estimator is a consistent estimator of the shape parameter $\xi$ [4].

Note that the Hill estimator ${\widehat {\xi }}_{k}^{\text{Hill}}$ makes use of the log-transformation of the observations $X_{1:n}=(X_{1},\cdots ,X_{n})$. (The Pickands estimator ${\widehat {\xi }}_{k}^{\text{Pickands}}$ also employs the log-transformation, but in a slightly different way [5].)
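The Hill estimator, defined as the average of $\log(X_{(j)}/X_{(k)})$ over $j=1,\ldots ,k-1$ where $X_{(i)}$ is the $i$-th largest observation, can be sketched as follows. The name `hill_estimator` is illustrative, and the test data are exact Pareto draws with a known shape, chosen only so that the output can be compared with a known value of $\xi$.

```python
import math
import random


def hill_estimator(data, k):
    """Hill estimate of xi from the k largest observations (2 <= k <= n)."""
    n = len(data)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    top = sorted(data, reverse=True)[:k]  # X_(1) >= X_(2) >= ... >= X_(k)
    if top[-1] <= 0:
        raise ValueError("the k largest observations must be positive")
    log_xk = math.log(top[-1])
    # Average of log(X_(j) / X_(k)) over j = 1, ..., k - 1.
    return sum(math.log(x) - log_xk for x in top[:-1]) / (k - 1)


# Usage: Pareto data X = U**(-xi) have tail P(X > x) = x**(-1/xi) for x >= 1.
true_xi = 0.5
rng = random.Random(99)
sample = [(1.0 - rng.random()) ** (-true_xi) for _ in range(20_000)]

# A few points of the Hill plot (k, estimate); look for a stable region.
hill_path = [(k, hill_estimator(sample, k)) for k in (100, 500, 2000, 10_000)]
for k, est in hill_path:
    print(k, est)
```

For exact Pareto data the estimates are stable across $k$; for data that are only approximately Pareto in the tail, large $k$ tends to introduce bias and small $k$ increases the variance, which is why the plot is inspected for a roughly constant region.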

References


1. Norton, Matthew; Khokhlov, Valentyn; Uryasev, Stan (2019). "Calculating CVaR and bPOE for common probability distributions with application to portfolio optimization and density estimation" (PDF). Annals of Operations Research. 299 (1–2). Springer: 1281–1315. arXiv:1811.11301. doi:10.1007/s10479-019-03373-1. S2CID 254231768. Archived from the original (PDF) on 2023-03-31. Retrieved 2023-02-27.

2. Coles, Stuart (2001-12-12). An Introduction to Statistical Modeling of Extreme Values. Springer. p. 75. ISBN 9781852334598.

3. Dargahi-Noubary, G. R. (1989). "On tail estimation: An improved method". Mathematical Geology. 21 (8): 829–842. Bibcode:1989MatGe..21..829D. doi:10.1007/BF00894450. S2CID 122710961.

4. Hosking, J. R. M.; Wallis, J. R. (1987). "Parameter and Quantile Estimation for the Generalized Pareto Distribution". Technometrics. 29 (3): 339–349. doi:10.2307/1269343. JSTOR 1269343.

5. Davison, A. C. (1984-09-30). "Modelling Excesses over High Thresholds, with an Application". In de Oliveira, J. Tiago (ed.). Statistical Extremes and Applications. Kluwer. p. 462. ISBN 9789027718044.

6. Embrechts, Paul; Klüppelberg, Claudia; Mikosch, Thomas (1997-01-01). Modelling extremal events for insurance and finance. Springer. p. 162. ISBN 9783540609315.

7. Castillo, Enrique; Hadi, Ali S. (1997). "Fitting the generalized Pareto distribution to data". Journal of the American Statistical Association. 92 (440): 1609–1620.

External links


Mathworks: Generalized Pareto distribution
