Jump to content

Main menu Navigation ●Main page ●Contents ●Current events ●Random article ●About Wikipedia ●Contact us ●Donate Contribute ●Help ●Learn to edit ●Community portal ●Recent changes ●Upload file

●Create account ●Log in ●Create account ● Log in Pages for logged out editors learn more ●Contributions ●Talk

(Top) 1 Definition 2 See also 3 References

Parabolic fractal distribution

●Català ●Español ●Français Edit links ●Article ●Talk ●Read ●Edit ●View history Tools Actions ●Read ●Edit ●View history General ●What links here ●Related changes ●Upload file ●Special pages ●Permanent link ●Page information ●Cite this page ●Get shortened URL ●Download QR code ●Wikidata item Print/export ●Download as PDF ●Printable version Appearance From Wikipedia, the free encyclopedia

This article includes a list of references, related reading, or external links, but its sources remain unclear because it lacks inline citations. Please help improve this article by introducing more precise citations. (January 2023) (Learn how and when to remove this message)

Inprobability and statistics, the parabolic fractal distribution is a type of discrete probability distribution in which the logarithm of the frequency or size of entities in a population is a quadratic polynomial of the logarithm of the rank (with the largest example having rank 1). This can markedly improve the fit over a simple power-law relationship (see references below).

In the Laherrère/Deheuvels paper below, examples include galaxy sizes (ordered by luminosity), towns (in the USA, France, and world), spoken languages (by number of speakers) in the world, and oil fields in the world (by size). They also mention utility for this distribution in fitting seismic events (no example). The authors assert the advantage of this distribution is that it can be fitted using the largest known examples of the population being modeled, which are often readily available and complete, then the fitted parameters found can be used to compute the size of the entire population. So, for example, the populations of the hundred largest cities on the planet can be sorted and fitted, and the parameters found used to extrapolate to the smallest villages, to estimate the population of the planet. Another example is estimating total world oil reserves using the largest fields.

In a number of applications, there is a so-called King effect where the top-ranked item(s) have a significantly greater frequency or size than the model predicts on the basis of the other items. The Laherrère/Deheuvels paper shows the example of Paris, when sorting the sizes of towns in France. When the paper was written Paris was the largest city with about ten million inhabitants, but the next largest town had only about 1.5 million. Towns in France excluding Paris closely follow a parabolic distribution, well enough that the 56 largest gave a very good estimate of the population of the country. But that distribution would predict the largest city to have about two million inhabitants, not 10 million. The King Effect is named after the notion that a King must defeat all rivals for the throne and takes their wealth, estates and power, thereby creating a buffer between himself and the next-richest of his subjects. That specific effect (intentionally created) may apply to corporate sizes, where the largest businesses use their wealth to buy up smaller rivals. Absent intent, the King Effect may occur as a result of some persistent growth advantage due to scale, or to some unique advantage. Larger cities are more efficient connectors of people, talent and other resources. Unique advantages might include being a port city, or a Capital city where law is made, or a center of activity where physical proximity increases opportunity and creates a feedback loop. An example is the motion picture industry; where actors, writers and other workers move to where the most studios are, and new studios are founded in the same place because that is where the most talent resides.

To test for the King Effect, the distribution must be fitted excluding the 'k' top-ranked items, but without assigning new rank numbers to the remaining members of the population. For example, in France the ranks are (as of 2010):

Paris, 12.09M
Lyon, 2.12M
Marseille, 1.72M
Toulouse, 1.20M
Lille, 1.15M

A fitting algorithm would process pairs {(1,12.09), (2,2.12), (3,1.72), (4,1.20), (5,1.15)} and find the parameters for the best parabolic fit through those points. To test for the King Effect we just exclude the first pair (or first 'k' pairs), and find parabolic parameters that fit the remainder of the points. So for France we would fit the four points {(2,2.12), (3,1.72), (4,1.20), (5,1.15)}. Then we can use those parameters to estimate the size of cities ranked [1,k] and determine if they are King Effect members or normal members.

By comparison, Zipf's law fits a line through the points (also using the log of the rank and log of the value). A parabola (with one more parameter) will fit better, but far from the vertex the parabola is also nearly linear. Thus, although it is a judgment call for the statistician, if the fitted parameters put the vertex far from the points fitted, or if the parabolic curve is not a significantly better fit than a line, those may be symptomatic of overfitting (aka over-parameterization). The line (with two parameters instead of three) is probably the better generalization. More parameters always fit better, but at the cost of adding unexplained parameters or unwarranted assumptions (such as the assumption that a slight parabolic curve is a more appropriate model than a line).

Alternatively, it is possible to force the fitted parabola to have its vertex at the rank 1 position. In that case, it is not certain the parabola will fit better (have less error) than a straight line; and the choice might be made between the two based on which has the least error.

Definition[edit]

The probability mass function is given, as a function of the rank n, by

f(n;b,c)\propto n^{-b}\exp(-c(\log n)^{2}),

where b and c are parameters of the distribution.

References[edit]

Laherrère J, Deheuvels P (1996) "Distributions de type 'fractal parabolique' dans la nature" (French, with English summary: "Parabolic fractal" distributions in nature), (http://www.hubbertpeak.com/laherrere/fractal.htm) Comptes rendus de l'Académie des sciences, Série II a: Sciences de la Terre et des Planétes, 322, (7), 535–541
Xie, S.; Yang, Y. G.; Bao, Z. Y.; Ke, X. Z.; Liu, X. L. (2009). "Mineral resource analysis by parabolic fractals". Mining Science and Technology (China). 19: 91–96. doi:10.1016/S1674-5264(09)60017-X.

Probability distributions (list)

Discrete
univariate

with finite
support

with infinite
support

Continuous
univariate

supported on a bounded interval	arcsine ARGUS Balding–Nichols Bates beta beta rectangular continuous Bernoulli Irwin–Hall Kumaraswamy logit-normal noncentral beta PERT raised cosine reciprocal triangular U-quadratic uniform Wigner semicircle
supported on a semi-infinite interval	Benini Benktander 1st kind Benktander 2nd kind beta prime Burr chi chi-squared noncentral inverse scaled Dagum Davis Erlang hyper exponential hyperexponential hypoexponential logarithmic F noncentral folded normal Fréchet gamma generalized inverse gamma/Gompertz Gompertz shifted half-logistic half-normal Hotelling's T-squared inverse Gaussian generalized Kolmogorov Lévy log-Cauchy log-Laplace log-logistic log-normal log-t Lomax matrix-exponential Maxwell–Boltzmann Maxwell–Jüttner Mittag-Leffler Nakagami Pareto phase-type Poly-Weibull Rayleigh relativistic Breit–Wigner Rice truncated normal type-2 Gumbel Weibull discrete Wilks's lambda
supported on the whole real line	Cauchy exponential power Fisher's z Kaniadakis κ-Gaussian Gaussian q generalized normal generalized hyperbolic geometric stable Gumbel Holtsmark hyperbolic secant Johnson's S_U Landau Laplace asymmetric logistic noncentral t normal (Gaussian) normal-inverse Gaussian skew normal slash stable Student's t Tracy–Widom variance-gamma Voigt
with support whose type varies	generalized chi-squared generalized extreme value generalized Pareto Marchenko–Pastur Kaniadakis κ-exponential Kaniadakis κ-Gamma Kaniadakis κ-Weibull Kaniadakis κ-Logistic Kaniadakis κ-Erlang q-exponential q-Gaussian q-Weibull shifted log-logistic Tukey lambda

Mixed
univariate

continuous-
discrete

Rectified Gaussian

Multivariate
(joint)

Discrete:
Ewens
multinomial
- Dirichlet
- negative
Continuous:
Dirichlet
- generalized
multivariate Laplace
multivariate normal
multivariate stable
multivariate t
normal-gamma
- inverse
Matrix-valued:
LKJ
matrix normal
matrix t
matrix gamma
- inverse
Wishart
- normal
- inverse
- normal-inverse
- complex

Directional

Univariate (circular) directional: Circular uniform; univariate von Mises; wrapped normal; wrapped Cauchy; wrapped exponential; wrapped asymmetric Laplace; wrapped Lévy
Bivariate (spherical): Kent
Bivariate (toroidal): bivariate von Mises
Multivariate: von Mises–Fisher; Bingham

Degenerate
and singular

Degenerate: Dirac delta function
Singular: Cantor

Families

Retrieved from "https://en.wikipedia.org/w/index.php?title=Parabolic_fractal_distribution&oldid=1131726769" Category: ●Discrete distributions Hidden categories: ●Articles lacking in-text citations from January 2023 ●All articles lacking in-text citations ●This page was last edited on 5 January 2023, at 13:11 (UTC). ●Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization. ●Privacy policy ●About Wikipedia ●Disclaimers ●Contact Wikipedia ●Code of Conduct ●Developers ●Statistics ●Cookie statement ●Mobile view