Jump to content
 







Main menu
   


Navigation  



Main page
Contents
Current events
Random article
About Wikipedia
Contact us
Donate
 




Contribute  



Help
Learn to edit
Community portal
Recent changes
Upload file
 








Search  

































Create account

Log in
 









Create account
 Log in
 




Pages for logged out editors learn more  



Contributions
Talk
 



















Contents

   



(Top)
 


1 Fit of distributions  





2 Regression analysis  





3 Categorical data  



3.1  Pearson's chi-square test  



3.1.1  Binomial case  







3.2  G-test  







4 See also  





5 References  





6 Further reading  














Goodness of fit






العربية
Čeština
Deutsch
Español
Euskara
فارسی
Français
Português
Русский

Türkçe
Українська


 

Edit links
 









Article
Talk
 

















Read
Edit
View history
 








Tools
   


Actions  



Read
Edit
View history
 




General  



What links here
Related changes
Upload file
Special pages
Permanent link
Page information
Cite this page
Get shortened URL
Download QR code
Wikidata item
 




Print/export  



Download as PDF
Printable version
 
















Appearance
   

 






From Wikipedia, the free encyclopedia
 


The goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, e.g. to test for normalityofresiduals, to test whether two samples are drawn from identical distributions (see Kolmogorov–Smirnov test), or whether outcome frequencies follow a specified distribution (see Pearson's chi-square test). In the analysis of variance, one of the components into which the variance is partitioned may be a lack-of-fit sum of squares.

Fit of distributions[edit]

In assessing whether a given distribution is suited to a data-set, the following tests and their underlying measures of fit can be used:

Regression analysis[edit]

Inregression analysis, more specifically regression validation, the following topics relate to goodness of fit:

Categorical data[edit]

The following are examples that arise in the context of categorical data.

Pearson's chi-square test[edit]

Pearson's chi-square test uses a measure of goodness of fit which is the sum of differences between observed and expected outcome frequencies (that is, counts of observations), each squared and divided by the expectation:

where:

The expected frequency is calculated by:

where:

The resulting value can be compared with a chi-square distribution to determine the goodness of fit. The chi-square distribution has (kc) degrees of freedom, where k is the number of non-empty cells and c is the number of estimated parameters (including location and scale parameters and shape parameters) for the distribution plus one. For example, for a 3-parameter Weibull distribution, c = 4.

Binomial case[edit]

A binomial experiment is a sequence of independent trials in which the trials can result in one of two outcomes, success or failure. There are n trials each with probability of success, denoted by p. Provided that npi ≫ 1 for every i (where i = 1, 2, ..., k), then

This has approximately a chi-square distribution with k − 1 degrees of freedom. The fact that there are k − 1 degrees of freedom is a consequence of the restriction . We know there are k observed cell counts, however, once any k − 1 are known, the remaining one is uniquely determined. Basically, one can say, there are only k − 1 freely determined cell counts, thus k − 1 degrees of freedom.

G-test[edit]

G-tests are likelihood-ratio tests of statistical significance that are increasingly being used in situations where Pearson's chi-square tests were previously recommended.[7]

The general formula for Gis

where and are the same as for the chi-square test, denotes the natural logarithm, and the sum is taken over all non-empty cells. Furthermore, the total observed count should be equal to the total expected count:

where is the total number of observations.

G-tests have been recommended at least since the 1981 edition of the popular statistics textbook by Robert R. Sokal and F. James Rohlf.[8]

See also[edit]

References[edit]

  1. ^ Berk, Robert H.; Jones, Douglas H. (1979). "Goodness-of-fit test statistics that dominate the Kolmogorov statistics". Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete. 47 (1): 47–59. doi:10.1007/BF00533250.
  • ^ Moscovich, Amit; Nadler, Boaz; Spiegelman, Clifford (2016). "On the exact Berk-Jones statistics and their p-value calculation". Electronic Journal of Statistics. 10 (2). arXiv:1311.3190. doi:10.1214/16-EJS1172.
  • ^ Liu, Qiang; Lee, Jason; Jordan, Michael (20 June 2016). "A Kernelized Stein Discrepancy for Goodness-of-fit Tests". Proceedings of the 33rd International Conference on Machine Learning. The 33rd International Conference on Machine Learning. New York, New York, USA: Proceedings of Machine Learning Research. pp. 276–284.
  • ^ Chwialkowski, Kacper; Strathmann, Heiko; Gretton, Arthur (20 June 2016). "A Kernel Test of Goodness of Fit". Proceedings of the 33rd International Conference on Machine Learning. The 33rd International Conference on Machine Learning. New York, New York, USA: Proceedings of Machine Learning Research. pp. 2606–2615.
  • ^ Zhang, Jin (2002). "Powerful goodness-of-fit tests based on the likelihood ratio" (PDF). J. R. Stat. Soc. B. 64 (2): 281–294. doi:10.1111/1467-9868.00337. Retrieved 5 November 2018.
  • ^ Vexler, Albert; Gurevich, Gregory (2010). "Empirical Likelihood Ratios Applied to Goodness-of-Fit Tests Based on Sample Entropy". Computational Statistics and Data Analysis. 54 (2): 531–545. doi:10.1016/j.csda.2009.09.025.
  • ^ McDonald, J.H. (2014). "G–test of goodness-of-fit". Handbook of Biological Statistics (Third ed.). Baltimore, Maryland: Sparky House Publishing. pp. 53–58.
  • ^ Sokal, R. R.; Rohlf, F. J. (1981). Biometry: The Principles and Practice of Statistics in Biological Research (Second ed.). W. H. Freeman. ISBN 0-7167-2411-1.
  • Further reading[edit]


    Retrieved from "https://en.wikipedia.org/w/index.php?title=Goodness_of_fit&oldid=1193603685"

    Category: 
    Statistical theory
    Hidden categories: 
    Articles with short description
    Short description is different from Wikidata
    Articles needing additional references from January 2018
    All articles needing additional references
     



    This page was last edited on 4 January 2024, at 17:35 (UTC).

    Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.



    Privacy policy

    About Wikipedia

    Disclaimers

    Contact Wikipedia

    Code of Conduct

    Developers

    Statistics

    Cookie statement

    Mobile view



    Wikimedia Foundation
    Powered by MediaWiki