Solomonoff's theory of inductive inference

Solomonoff's theory of inductive inference is a mathematical theory of induction introduced by Ray Solomonoff, based on probability theory and theoretical computer science.[1][2][3] In essence, Solomonoff's induction derives the posterior probability of any computable theory, given a sequence of observed data. This posterior probability is derived from Bayes' rule and some universal prior, that is, a prior that assigns a positive probability to any computable theory.

Solomonoff proved that this induction is incomputable, but noted that "this incomputability is of a very benign kind", and that it "in no way inhibits its use for practical prediction".[2]

Solomonoff's induction naturally formalizes Occam's razor[4][5][6][7][8] by assigning larger prior credences to theories that require a shorter algorithmic description.

Origin

Philosophical

The theory rests on philosophical foundations and was founded by Ray Solomonoff around 1960.[9] It is a mathematically formalized combination of Occam's razor[4][5][6][7][8] and the Principle of Multiple Explanations.[10] All computable theories which perfectly describe previous observations are used to calculate the probability of the next observation, with more weight put on the shorter computable theories. Marcus Hutter's universal artificial intelligence builds upon this to calculate the expected value of an action.

Principle

Solomonoff's induction has been argued to be the computational formalization of pure Bayesianism.[3] To understand this, recall that Bayesianism derives the posterior probability $P(T \mid D)$ of a theory $T$ given data $D$ by applying Bayes' rule, which yields

$$P(T \mid D) = \frac{P(D \mid T)\, P(T)}{\sum_{A} P(D \mid A)\, P(A)},$$

where the theories $A$ are the alternatives to the theory $T$ (together with $T$ itself). For this equation to make sense, the quantities $P(D \mid T)$ and $P(D \mid A)$ must be well-defined for all theories $T$ and $A$. In other words, any theory $T$ must define a probability distribution over observable data $D$. Solomonoff's induction essentially boils down to demanding that all such probability distributions be computable.

Interestingly, the set of computable probability distributions corresponds to a subset of the set of all programs, and is therefore countable. Similarly, the sets of observable data considered by Solomonoff were finite. Without loss of generality, we can thus consider that any observable datum is a finite bit string. As a result, Solomonoff's induction can be defined by invoking only discrete probability distributions.

Solomonoff's induction then allows one to make probabilistic predictions of future data $F$ by simply obeying the laws of probability. Namely, we have

$$P(F \mid D) = \sum_{T} P(F \mid T, D)\, P(T \mid D).$$

This quantity can be interpreted as the average of the predictions $P(F \mid T, D)$ of all theories $T$ given past data $D$, weighted by their posterior credences $P(T \mid D)$.
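
The following is a minimal sketch of this mixture prediction, assuming a hypothetical finite class of deterministic theories in place of the countable class of computable distributions (the theory names and helpers are illustrative only):

```python
from fractions import Fraction

# A toy, finite stand-in for the countable class of computable theories.
# Each "theory" deterministically predicts the next bit from the past.
THEORIES = {
    "all_zeros":   lambda past: 0,
    "all_ones":    lambda past: 1,
    "alternate":   lambda past: 1 - past[-1] if past else 0,
    "repeat_last": lambda past: past[-1] if past else 0,
}

def likelihood(theory, data):
    """P(D | T): 1 if the deterministic theory reproduces D, else 0."""
    return Fraction(int(all(theory(data[:i]) == bit
                            for i, bit in enumerate(data))))

def posterior(prior, data):
    """Bayes' rule over the finite class (data must fit some theory)."""
    joint = {n: prior[n] * likelihood(t, data) for n, t in THEORIES.items()}
    z = sum(joint.values())
    return {n: p / z for n, p in joint.items()}

def predict_one(prior, data):
    """P(next bit = 1 | D) = sum over T of P(next = 1 | T, D) P(T | D)."""
    post = posterior(prior, data)
    return sum(p for n, p in post.items() if THEORIES[n](data) == 1)

prior = {n: Fraction(1, len(THEORIES)) for n in THEORIES}
print(predict_one(prior, [0, 1, 0, 1]))  # 0: only "alternate" survives
```

A uniform prior is used here for simplicity; Solomonoff's induction instead weights each theory by roughly $2^{-\ell}$, where $\ell$ is the length of its shortest program.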

Mathematical

The proof of the "razor" is based on the known mathematical properties of a probability distribution over a countable set. These properties are relevant because the infinite set of all programs is a denumerable set. The sum $S$ of the probabilities of all programs must be exactly equal to one (as per the definition of probability), so the probabilities must roughly decrease as we enumerate the infinite set of all programs; otherwise, infinitely many programs would have probability bounded away from zero and $S$ would diverge. To be more precise, for every $\epsilon > 0$, there is some length $l$ such that the total probability of all programs longer than $l$ is at most $\epsilon$. This does not, however, preclude very long programs from having very high probability.

Fundamental ingredients of the theory are the concepts of algorithmic probability and Kolmogorov complexity. The universal prior probability of any prefix $p$ of a computable sequence $x$ is the sum of the probabilities of all programs (for a universal computer) that compute something starting with $p$. Given some $p$ and any computable but unknown probability distribution from which $x$ is sampled, the universal prior and Bayes' theorem can be used to predict the yet unseen parts of $x$ in optimal fashion.
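
In symbols, a common formulation of this universal prior (one of several equivalent presentations in the literature; a prefix-free universal machine $U$ is assumed here) is

$$M(p) = \sum_{q \,:\, U(q) \text{ starts with } p} 2^{-|q|},$$

where the sum ranges over all programs $q$ whose output on $U$ begins with $p$, and $|q|$ is the length of $q$ in bits. Shorter programs thus receive exponentially larger weight, which is the formal content of the razor.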

Mathematical guarantees

Solomonoff's completeness

The remarkable property of Solomonoff's induction is its completeness. In essence, the completeness theorem guarantees that the expected cumulative errors made by the predictions based on Solomonoff's induction are upper-bounded by the Kolmogorov complexity of the (stochastic) data generating process. The errors can be measured using the Kullback–Leibler divergence or the square of the difference between the induction's prediction and the probability assigned by the (stochastic) data generating process.
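
In one common statement of the bound (the constant varies slightly across presentations), if $\mu$ denotes the computable measure generating the data and $M$ the universal prior, then

$$\sum_{t=1}^{\infty} \mathbb{E}_\mu\!\left[\left(M(x_t \mid x_{<t}) - \mu(x_t \mid x_{<t})\right)^2\right] \le \frac{\ln 2}{2}\, K(\mu),$$

where $K(\mu)$ is the Kolmogorov complexity of $\mu$. Since this infinite sum of expected squared errors is finite while $K(\mu)$ is a constant of the environment, the per-step errors must vanish, so $M$'s predictions converge to those of $\mu$.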

Solomonoff's uncomputability

Unfortunately, Solomonoff also proved that Solomonoff's induction is uncomputable. In fact, he showed that computability and completeness are mutually exclusive: any complete theory must be uncomputable. The proof of this is derived from a game between the induction procedure and the environment: any computable induction can be tricked by choosing a computable environment that negates the induction's prediction at every step. This fact can be regarded as an instance of the no free lunch theorem.
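
The construction is simple enough to sketch in code. Assuming a hypothetical computable predictor predict_next_bit (any total computable function from bit strings to {0, 1} could be substituted), the environment below is itself computable and forces an error on every single bit:

```python
def predict_next_bit(past_bits):
    """Stand-in for ANY computable predictor (hypothetical example rule:
    predict the majority bit of the history seen so far)."""
    return 1 if sum(past_bits) * 2 > len(past_bits) else 0

def adversarial_environment(n_steps):
    """A computable environment defined to negate the predictor's output.
    It is computable whenever the predictor is, since it merely calls it --
    and by construction the predictor errs on every bit it emits."""
    history = []
    for _ in range(n_steps):
        history.append(1 - predict_next_bit(history))
    return history

bits = adversarial_environment(10)
errors = sum(predict_next_bit(bits[:i]) != b for i, b in enumerate(bits))
print(f"{errors}/10 predictions wrong")  # 10/10, by construction
```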

Modern applications

Artificial intelligence

Though Solomonoff's inductive inference is not computable, several AIXI-derived algorithms approximate it in order to make it run on a modern computer. The more computing power they are given, the closer their predictions are to the predictions of inductive inference (their mathematical limit is Solomonoff's inductive inference).[11][12][13]
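
As an illustration of the general idea only, and not of any specific published algorithm, the sketch below enumerates every program of a made-up four-instruction language up to a length cutoff, discards programs that exceed a step budget or contradict the observations, and weights the survivors by $2^{-\text{length}}$. Raising the cutoff and the budget moves the mixture closer to the uncomputable ideal:

```python
from itertools import product

def run_program(prog, n_bits, max_steps=1000):
    """Interpret prog (a tuple over the toy alphabet '0', '1', 'f', 's',
    executed cyclically) as a bit-stream generator; return its first
    n_bits, or None if it exceeds the step budget. The language itself
    is an illustrative assumption, not part of Solomonoff's definition."""
    out, state, pc, steps = [], 0, 0, 0
    while len(out) < n_bits:
        steps += 1
        if steps > max_steps:
            return None
        op = prog[pc % len(prog)]
        if op == '0':
            out.append(0)
        elif op == '1':
            out.append(1)
        elif op == 'f':
            state ^= 1            # flip an internal bit
        else:                     # 's': print the internal bit
            out.append(state)
        pc += 1
    return out

def approx_next_bit_probability(observed, max_len=8):
    """P(next bit = 1 | observed): average the next-bit output of every
    program consistent with the observations, weighted by 2^-length."""
    num = den = 0.0
    for length in range(1, max_len + 1):
        for prog in product('01fs', repeat=length):
            bits = run_program(prog, len(observed) + 1)
            if bits is None or bits[:len(observed)] != list(observed):
                continue
            weight = 2.0 ** (-length)   # shorter programs weigh more
            den += weight
            num += weight * bits[-1]
    return num / den if den else 0.5

# Short consistent programs continue the alternation, so this is near 0.
print(approx_next_bit_probability([0, 1, 0, 1]))
```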

Another direction of inductive inference is based on E. Mark Gold's model of learning in the limit from 1967 and has since developed more and more models of learning.[14] The general scenario is the following: given a class $S$ of computable functions, is there a learner (that is, a recursive functional) which, for any input of the form $(f(0), f(1), \ldots, f(n))$, outputs a hypothesis (an index $e$ with respect to a previously agreed-on acceptable numbering of all computable functions; the indexed function may be required to be consistent with the given values of $f$)? A learner $M$ learns a function $f$ if almost all its hypotheses are the same index $e$, which generates the function $f$; $M$ learns $S$ if $M$ learns every $f$ in $S$. Basic results are that all recursively enumerable classes of functions are learnable while the class REC of all computable functions is not learnable.[citation needed] Many related models have been considered, and the learning of classes of recursively enumerable sets from positive data has been a topic studied from Gold's pioneering paper in 1967 onwards. A far-reaching extension of Gold's approach is developed by Schmidhuber's theory of generalized Kolmogorov complexities,[15] which are kinds of super-recursive algorithms.
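
The positive result for recursively enumerable classes rests on learning by enumeration, sketched below for a hypothetical uniformly computable class (affine functions with small integer coefficients standing in for $S$):

```python
# Identification in the limit by enumeration: output the index of the
# first function in the class consistent with all data seen so far.
# On any f in the class, the hypothesis eventually stabilizes forever
# on a correct index, which is exactly Gold's criterion of learning.

# A toy stand-in for S: functions f(x) = a*x + b with 0 <= a, b < 5.
CLASS = [(a, b) for a in range(5) for b in range(5)]

def apply_fn(index, x):
    a, b = CLASS[index]
    return a * x + b

def learner(samples):
    """samples = [f(0), f(1), ..., f(n)]; return first consistent index."""
    for e in range(len(CLASS)):
        if all(apply_fn(e, x) == y for x, y in enumerate(samples)):
            return e
    return None  # the target lies outside the class

target = lambda x: 3 * x + 2
for n in range(1, 5):
    e = learner([target(x) for x in range(n)])
    print(n, e, CLASS[e])
# After two samples the hypothesis locks onto (3, 2) and never changes.
```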

References

1. Zenil, Hector (2011). Randomness Through Computation: Some Answers, More Questions. World Scientific. ISBN 978-981-4462-63-1.
2. Solomonoff, Ray J. (2009). "Algorithmic Probability: Theory and Applications". In Emmert-Streib, Frank; Dehmer, Matthias (eds.), Information Theory and Statistical Learning. Boston, MA: Springer US. pp. 1–23. doi:10.1007/978-0-387-84816-7_1. ISBN 978-0-387-84816-7.
3. Hoang, Lê Nguyên (2020). The Equation of Knowledge: From Bayes' Rule to a Unified Philosophy of Science (1st ed.). Boca Raton, FL. ISBN 978-0-367-85530-7. OCLC 1162366056.
4. McCall, J. J. (2004). "Induction: From Kolmogorov and Solomonoff to De Finetti and Back to Kolmogorov". Metroeconomica. Wiley Online Library.
5. Stork, D. (2001). "Foundations of Occam's Razor and Parsimony in Learning". NIPS 2001 Workshop.
6. Soklakov, A. N. (2002). "Occam's Razor as a Formal Basis for a Physical Theory". Foundations of Physics Letters. Springer.
7. Hernandez-Orallo, Jose (1999). "Beyond the Turing Test" (PDF). Journal of Logic, Language and Information. 9.
8. Hutter, M. (2003). "On the Existence and Convergence of Computable Universal Priors". Algorithmic Learning Theory. Springer.
9. Rathmanner, Samuel; Hutter, Marcus (2011). "A Philosophical Treatise of Universal Induction". Entropy. 13 (6): 1076–1136.
10. Li, Ming; Vitányi, Paul (2008). An Introduction to Kolmogorov Complexity and Its Applications. New York: Springer-Verlag. pp. 339 ff.
11. Veness, J.; Ng, K. S.; Hutter, M.; Uther, W.; Silver, D. (2009). "A Monte Carlo AIXI Approximation". arXiv preprint.
12. Veness, J.; Ng, K. S.; Hutter, M.; Silver, D. (2010). "Reinforcement Learning via AIXI Approximation". arXiv preprint.
13. Pankov, S. (2008). "A Computational Approximation to the AIXI Model". Artificial General Intelligence 2008: Proceedings.
14. Gold, E. Mark (1967). "Language Identification in the Limit" (PDF). Information and Control. 10 (5): 447–474. doi:10.1016/S0019-9958(67)91165-5.
15. Schmidhuber, J. (2002). "Hierarchies of Generalized Kolmogorov Complexities and Nonenumerable Universal Measures Computable in the Limit" (PDF). International Journal of Foundations of Computer Science. 13 (4): 587–612. doi:10.1142/S0129054102001291.