Nonparametric statistics






From Wikipedia, the free encyclopedia
 


Nonparametric statistics is a type of statistical analysis that makes minimal assumptions about the underlying distribution of the data being studied. Often these models are infinite-dimensional, rather than finite-dimensional as in parametric statistics.[1] Nonparametric statistics can be used for descriptive statistics or statistical inference. Nonparametric tests are often used when the assumptions of parametric tests are evidently violated.[2]

Definitions

The term "nonparametric statistics" has been defined imprecisely in the following two ways, among others:

  1. The first meaning of nonparametric involves techniques that do not rely on data belonging to any particular parametric family of probability distributions.

    These include, among others:

    • Methods which are distribution-free, which do not rely on assumptions that the data are drawn from a given parametric family of probability distributions.
    • Statistics defined to be a function on a sample, without dependency on a parameter.

    An example is order statistics, which are based on the ordinal ranking of observations.

    The discussion following is taken from Kendall's Advanced Theory of Statistics.[3]

    Statistical hypotheses concern the behavior of observable random variables.... For example, the hypothesis (a) that a normal distribution has a specified mean and variance is statistical; so is the hypothesis (b) that it has a given mean but unspecified variance; so is the hypothesis (c) that a distribution is of normal form with both mean and variance unspecified; finally, so is the hypothesis (d) that two unspecified continuous distributions are identical.

    It will have been noticed that in the examples (a) and (b) the distribution underlying the observations was taken to be of a certain form (the normal) and the hypothesis was concerned entirely with the value of one or both of its parameters. Such a hypothesis, for obvious reasons, is called parametric.

    Hypothesis (c) was of a different nature, as no parameter values are specified in the statement of the hypothesis; we might reasonably call such a hypothesis non-parametric. Hypothesis (d) is also non-parametric but, in addition, it does not even specify the underlying form of the distribution and may now be reasonably termed distribution-free. Notwithstanding these distinctions, the statistical literature now commonly applies the label "non-parametric" to test procedures that we have just termed "distribution-free", thereby losing a useful classification.

  2. The second meaning of non-parametric involves techniques that do not assume that the structure of a model is fixed. Typically, the model grows in size to accommodate the complexity of the data. In these techniques, individual variables are typically assumed to belong to parametric distributions, and assumptions about the types of associations among variables are also made. These techniques include, among others:
    • non-parametric regression, which is modeling whereby the structure of the relationship between variables is treated non-parametrically, but where nevertheless there may be parametric assumptions about the distribution of model residuals.
    • non-parametric hierarchical Bayesian models, such as models based on the Dirichlet process, which allow the number of latent variables to grow as necessary to fit the data, but where individual variables still follow parametric distributions and even the process controlling the rate of growth of latent variables follows a parametric distribution.
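As one illustration of non-parametric regression in the second sense, the sketch below uses a Nadaraya–Watson kernel smoother. This technique is not named in the text above; it is chosen here only as a common example. The function name `kernel_regression`, the Gaussian kernel, the `bandwidth` values, and the data are all assumptions made for illustration.

```python
import math

def kernel_regression(x_train, y_train, x_query, bandwidth=1.0):
    """Nadaraya-Watson estimate: a kernel-weighted average of the observed y values."""
    # Gaussian weights: training points near x_query count more.
    weights = [math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in x_train]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, y_train)) / total

# Hypothetical data: no functional form (line, polynomial, ...) is assumed.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 4.2, 8.8, 16.1]
print(kernel_regression(xs, ys, 2.5, bandwidth=0.5))
```

The fitted curve is determined by the training points themselves rather than by a fixed set of coefficients, so the effective model grows with the data. A query point very far from all training points would make every weight underflow to zero; a production implementation would need to guard against that.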

Applications and purpose

Non-parametric methods are widely used for studying populations that have a ranked order (such as movie reviews receiving one to four "stars"). The use of non-parametric methods may be necessary when data have a ranking but no clear numerical interpretation, such as when assessing preferences. In terms of levels of measurement, many non-parametric methods treat the data as ordinal.
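Many rank-based procedures start by replacing observations with their ranks. The following is a minimal sketch of that step, assuming the common "midrank" convention in which tied values share the average of the ranks they occupy. The function name `average_ranks` and the star ratings are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        midrank = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = midrank
        start = end + 1
    return ranks

stars = [3, 1, 4, 3, 2]  # hypothetical star ratings
print(average_ranks(stars))  # [3.5, 1.0, 5.0, 3.5, 2.0]
```

Only the ordering of the ratings is used; the numerical distance between "3 stars" and "4 stars" plays no role.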

As non-parametric methods make fewer assumptions, their applicability is much more general than the corresponding parametric methods. In particular, they may be applied in situations where less is known about the application in question. Also, due to the reliance on fewer assumptions, non-parametric methods are more robust.
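A small sketch of what robustness means in practice, using made-up measurements: the sample mean is pulled far away by a single gross error, while the sample median (a classic nonparametric summary) is unaffected. The data values are assumptions chosen for illustration.

```python
import statistics

clean = [9.8, 10.1, 10.0, 9.9, 10.2]   # hypothetical measurements
contaminated = clean[:-1] + [102.0]    # same data with one gross recording error

print("mean:  ", statistics.mean(clean), "->", statistics.mean(contaminated))
print("median:", statistics.median(clean), "->", statistics.median(contaminated))
```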

Non-parametric methods are sometimes considered simpler to use and more robust than parametric methods, even when the assumptions of parametric methods are justified. This is due to their more general nature, which may make them less susceptible to misuse and misunderstanding. Non-parametric methods can be considered a conservative choice, as they remain valid even when the assumptions of parametric methods are not met, whereas parametric methods can produce misleading results when their assumptions are violated.

The wider applicability and increased robustness of non-parametric tests come at a cost: in cases where a parametric test's assumptions are met, non-parametric tests have less statistical power. In other words, a larger sample size can be required to draw conclusions with the same degree of confidence.
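The power trade-off can be illustrated with a small Monte Carlo sketch. It assumes normally distributed data with known standard deviation 1, so that a z-test (standing in here for "the parametric test") has its assumptions met, and compares it with the exact sign test on the same simulated samples. The function names, the shift of 0.5, the sample size of 30, and the number of repetitions are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

def sign_test_p(sample, median0=0.0):
    """Exact two-sided sign test p-value for H0: median == median0."""
    diffs = [x - median0 for x in sample if x != median0]  # drop exact ties
    n = len(diffs)
    if n == 0:
        return 1.0
    k = sum(d > 0 for d in diffs)
    # Probability of a count at least as extreme in the smaller tail of Binomial(n, 1/2).
    tail = sum(math.comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def z_test_p(sample, mean0=0.0, sigma=1.0):
    """Two-sided z-test p-value; assumes normal data with known sigma."""
    z = (sum(sample) / len(sample) - mean0) / (sigma / math.sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def estimate_power(test, shift, n=30, reps=2000, alpha=0.05, seed=0):
    """Fraction of simulated normal samples (true mean = shift) on which the test rejects H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(shift, 1.0) for _ in range(n)]
        rejections += test(sample) < alpha
    return rejections / reps

print("z-test power:   ", estimate_power(z_test_p, shift=0.5))
print("sign test power:", estimate_power(sign_test_p, shift=0.5))
```

Under these assumptions the z-test should reject the false null hypothesis noticeably more often than the sign test, which is the loss of power described above. The comparison would look different for heavy-tailed data, where the z-test's assumptions fail.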

Non-parametric models

Non-parametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from data. The term non-parametric is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance.
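To make "the number of parameters is not fixed in advance" concrete, here is a minimal sketch of a Gaussian kernel density estimate. This estimator is not mentioned in the text above; it is used only as a familiar example. The function name `gaussian_kde`, the hand-picked bandwidth, and the sample are assumptions.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a density estimate built from one Gaussian bump per observation."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density

sample = [1.2, 1.9, 2.1, 2.4, 5.0]       # hypothetical observations
f = gaussian_kde(sample, bandwidth=0.5)  # bandwidth chosen by hand for illustration
print(f(2.0), f(4.0))
```

The fitted density keeps every observation, so its description grows with the sample, whereas a fitted normal distribution would be summarized by just a mean and a variance regardless of sample size.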

Methods

Non-parametric (or distribution-free) inferential statistical methods are mathematical procedures for statistical hypothesis testing which, unlike parametric statistics, make no assumptions about the probability distributions of the variables being assessed. The most frequently used tests include:

  • Anderson–Darling test: tests whether a sample is drawn from a given distribution
  • Statistical bootstrap methods: estimate the accuracy/sampling distribution of a statistic
  • Cochran's Q: tests whether k treatments in randomized block designs with 0/1 outcomes have identical effects
  • Cohen's kappa: measures inter-rater agreement for categorical items
  • Friedman two-way analysis of variance by ranks: tests whether k treatments in randomized block designs have identical effects
  • Empirical likelihood
  • Kaplan–Meier: estimates the survival function from lifetime data, modeling censoring
  • Kendall's tau: measures statistical dependence between two variables
  • Kendall's W: a measure between 0 and 1 of inter-rater agreement.
  • Kolmogorov–Smirnov test: tests whether a sample is drawn from a given distribution, or whether two samples are drawn from the same distribution.
  • Kruskal–Wallis one-way analysis of variance by ranks: tests whether > 2 independent samples are drawn from the same distribution.
  • Kuiper's test: tests whether a sample is drawn from a given distribution, sensitive to cyclic variations such as day of the week.
  • Logrank test: compares survival distributions of two right-skewed, censored samples.
  • Mann–Whitney U or Wilcoxon rank sum test: tests whether two samples are drawn from the same distribution, as compared to a given alternative hypothesis.
  • McNemar's test: tests whether, in 2 × 2 contingency tables with a dichotomous trait and matched pairs of subjects, row and column marginal frequencies are equal.
  • Median test: tests whether two samples are drawn from distributions with equal medians.
  • Pitman's permutation test: a statistical significance test that yields exact p values by examining all possible rearrangements of labels.
  • Rank products: detects differentially expressed genes in replicated microarray experiments.
  • Siegel–Tukey test: tests for differences in scale between two groups.
  • Sign test: tests whether matched pair samples are drawn from distributions with equal medians.
  • Spearman's rank correlation coefficient: measures statistical dependence between two variables using a monotonic function.
  • Squared ranks test: tests equality of variances in two or more samples.
  • Tukey–Duckworth test: tests equality of two distributions by using ranks.
  • Wald–Wolfowitz runs test: tests whether the elements of a sequence are mutually independent/random.
  • Wilcoxon signed-rank test: tests whether matched pair samples are drawn from populations with different mean ranks.
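Two of the procedures listed above can be sketched in a few lines of standard-library Python. The code below computes the Mann–Whitney U statistic by direct pair counting, and an exact permutation p-value by enumerating every relabeling of the pooled observations. The function names, the choice of the difference in sample means as the permutation test statistic, and the data are assumptions for illustration; full enumeration is feasible only for small samples.

```python
from itertools import combinations

def mann_whitney_u(x, y):
    """U statistic for x: each pair with x_i > y_j counts 1, each tie counts 1/2."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def permutation_p_value(x, y):
    """Exact two-sided permutation p-value for the difference in sample means."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    total = sum(pooled)

    def abs_mean_diff(sum_first):
        # |mean of the group labeled "x" - mean of the group labeled "y"|
        return abs(sum_first / n - (total - sum_first) / m)

    observed = abs_mean_diff(sum(x))
    count = 0
    extreme = 0
    # Every way of choosing which n pooled observations carry the "x" label.
    for idx in combinations(range(n + m), n):
        count += 1
        relabeled_sum = sum(pooled[i] for i in idx)
        if abs_mean_diff(relabeled_sum) >= observed - 1e-12:  # tolerance for round-off
            extreme += 1
    return extreme / count

group_a = [1, 2, 3]  # hypothetical measurements
group_b = [4, 5, 6]
print(mann_whitney_u(group_a, group_b))       # 0.0: no value in group_a exceeds any in group_b
print(permutation_p_value(group_a, group_b))  # 0.1: 2 of the 20 relabelings are as extreme
```

Neither function refers to a distributional family: the U statistic depends only on the ordering of the observations, and the permutation p-value comes from the relabelings of the observed data.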
History

Early nonparametric statistics include the median (13th century or earlier, use in estimation by Edward Wright, 1599; see Median § History) and the sign test by John Arbuthnot (1710) in analyzing the human sex ratio at birth (see Sign test § History).[4][5]


Notes

1. "All of Nonparametric Statistics". Springer Texts in Statistics. 2006. doi:10.1007/0-387-30623-4.
2. Pearce, J.; Derrick, B. (2019). "Preliminary testing: The devil of statistics?". Reinvention: An International Journal of Undergraduate Research. 12 (2). doi:10.31273/reinvention.v12i2.339.
3. Stuart, A.; Ord, J.K.; Arnold, S. (1999), Kendall's Advanced Theory of Statistics, Volume 2A: Classical Inference and the Linear Model, sixth edition, §20.2–20.3 (Arnold).
4. Conover, W.J. (1999), "Chapter 3.4: The Sign Test", Practical Nonparametric Statistics (Third ed.), Wiley, pp. 157–176, ISBN 0-471-16068-7
5. Sprent, P. (1989), Applied Nonparametric Statistical Methods (Second ed.), Chapman & Hall, ISBN 0-412-44980-3


    Retrieved from "https://en.wikipedia.org/w/index.php?title=Nonparametric_statistics&oldid=1216518981"

     



    This page was last edited on 31 March 2024, at 13:32 (UTC).

    Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.


