{{short description|Statistics concept}}
{{primary sources|date=August 2016}}
{{Bayesian statistics}}{{Probability fundamentals}}
'''Bayesian programming''' is a formalism and a methodology for specifying [[Probability distribution|probabilistic models]] and solving problems when less than the necessary information is available.
[[Edwin Thompson Jaynes|Edwin T. Jaynes]] proposed that probability could be considered as an alternative and an extension of logic for rational reasoning with incomplete and uncertain information. In his founding book ''Probability Theory: The Logic of Science''<ref name="Jaynes2003">{{cite book|first=E. T. |last=Jaynes|title=Probability Theory: The Logic of Science|url={{google books |plainurl=y |id=UjsgAwAAQBAJ}}|date=10 April 2003|publisher=Cambridge University Press|isbn=978-1-139-43516-1}}</ref> he developed this theory and proposed what he called "the robot," which was not a physical device, but an [[inference engine]] to automate probabilistic reasoning—a kind of [[Prolog]] for probability instead of logic. Bayesian programming<ref name="BessiereMazer2013">{{cite book|first1=Pierre |last1=Bessiere|first2=Emmanuel |last2=Mazer|first3=Juan |last3=Manuel Ahuactzin|first4=Kamel |last4=Mekhnacha|title=Bayesian Programming|url={{google books |plainurl=y |id=4XtcAgAAQBAJ}}|date=20 December 2013|publisher=CRC Press|isbn=978-1-4398-8032-6}}</ref> is a formal and concrete implementation of this "robot".
Bayesian programming may also be seen as an algebraic formalism to specify [[graphical model]]s such as, for instance, [[Bayesian network]]s, [[dynamic Bayesian network]]s, [[Kalman filter]]s or [[hidden Markov model]]s. Indeed, Bayesian programming is more general than [[Bayesian network]]s and has a power of expression equivalent to probabilistic [[factor graph]]s.<ref>{{Cite web|url=http://bcf.usc.edu/~rosenblo/Pubs/agi15_demski.pdf|title=Expression Graphs: Unifying Factor Graphs and Sum-Product Networks|website=bcf.usc.edu}}</ref>
== Formalism ==
A Bayesian program is a means of specifying a family of probability distributions.
The constituent elements of a Bayesian program are presented below:<ref>{{Cite web|url=https://ocw.mit.edu/courses/sloan-school-of-management/15-097-prediction-machine-learning-and-statistics-spring-2012/lecture-notes/MIT15_097S12_lec15.pdf|title=Probabilistic Modeling and Bayesian Analysis|website=ocw.mit.edu}}</ref>
: <math>
\text{Program}
\begin{cases}
\text{Description}
\begin{cases}
\text{Specification}(\pi)
\begin{cases}
\text{Variables}\\
\text{Decomposition}\\
\text{Forms}
\end{cases}\\
\text{Identification (based on }\delta)
\end{cases}\\
\text{Question}
\end{cases}
</math>
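This structure can be mirrored directly in code. The following Python sketch is an illustration only (the type names and fields are assumptions made for this example, not a standard implementation):

<syntaxhighlight lang="python">
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Specification:
    variables: List[str]            # the relevant variables
    decomposition: List[str]        # the factors P(L_k | R_k)
    forms: Dict[str, Callable]      # a parametric form per factor

@dataclass
class Description:
    specification: Specification                                    # preliminary knowledge (pi)
    identification: Dict[str, float] = field(default_factory=dict)  # parameters learned from data (delta)

@dataclass
class BayesianProgram:
    description: Description
    question: str                   # e.g. "P(Spam | w_0 ... w_N-1)"

# Hypothetical usage for the spam filter developed later in this article.
spam_filter = BayesianProgram(
    description=Description(
        specification=Specification(
            variables=["Spam", "W_0", "...", "W_N-1"],
            decomposition=["P(Spam)", "P(W_n | Spam)"],
            forms={},
        )
    ),
    question="P(Spam | w_0 ... w_N-1)",
)
</syntaxhighlight>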
The purpose of a description is to specify an effective method of computing a [[joint probability distribution]]
on a set of [[Random variable|variables]] <math>\left\{ X_{1},X_{2},\cdots,X_{N}\right\}</math> given a set of experimental data <math>\delta</math> and some
specification <math>\pi</math>. This [[Joint probability distribution|joint distribution]] is denoted as: <math>P\left(X_{1}\wedge X_{2}\wedge\cdots\wedge X_{N}\mid\delta\wedge\pi\right)</math>.<ref>{{Cite web|url=http://www.cs.brandeis.edu/~cs134/K_F_Ch3.pdf|title=Bayesian Networks|website=cs.brandeis.edu}}</ref>
To specify preliminary knowledge <math>\pi</math>, the programmer must undertake the following:

Given a partition of <math>\left\{ X_{1},X_{2},\cdots,X_{N}\right\}</math> containing <math>K</math> subsets, <math>K</math> variables are defined <math>L_{1},\cdots,L_{K}</math>, each corresponding to one of these subsets. Each variable <math>L_{k}</math> is obtained as the conjunction of the variables belonging to the <math>k^{th}</math> subset. Recursive application of [[Bayes' theorem]] leads to:

: <math>
\begin{align}
 & P\left(X_{1}\wedge X_{2}\wedge\cdots\wedge X_{N}\mid\delta\wedge\pi\right)\\
={} & P\left(L_{1}\wedge\cdots\wedge L_{K}\mid\delta\wedge\pi\right)\\
={} & P\left(L_{1}\mid\delta\wedge\pi\right)\times P\left(L_{2}\mid L_{1}\wedge\delta\wedge\pi\right)\times\cdots\times P\left(L_{K}\mid L_{K-1}\wedge\cdots\wedge L_{1}\wedge\delta\wedge\pi\right)
\end{align}
</math>
[[Conditional independence]] hypotheses then allow further simplifications. A conditional independence hypothesis for variable <math>L_{k}</math> is defined by choosing some variable <math>X_{n}</math> among the variables appearing in the conjunction <math>L_{k-1}\wedge\cdots\wedge L_{2}\wedge L_{1}</math>, labelling <math>R_{k}</math> as the conjunction of these chosen variables and setting:

: <math>
P\left(L_{k}\mid L_{k-1}\wedge\cdots\wedge L_{1}\wedge\delta\wedge\pi\right)=P\left(L_{k}\mid R_{k}\wedge\delta\wedge\pi\right)
</math>

We then obtain:

: <math>
P\left(X_{1}\wedge X_{2}\wedge\cdots\wedge X_{N}\mid\delta\wedge\pi\right)=P\left(L_{1}\mid\delta\wedge\pi\right)\times\prod_{k=2}^{K}\left[P\left(L_{k}\mid R_{k}\wedge\delta\wedge\pi\right)\right]
</math>

Such a simplification of the joint distribution as a product of simpler distributions is called a decomposition, derived using the [[Chain rule (probability)|chain rule]]. This ensures that each variable appears at most once on the left of a conditioning bar, which is the necessary and sufficient condition to write mathematically valid decompositions.{{citation needed}}

Each distribution <math>P\left(L_{k}\mid R_{k}\wedge\delta\wedge\pi\right)</math> appearing in the product is then associated with either a parametric form (i.e., a function <math>f_{\mu}\left(L_{k}\right)</math>) or a question to another Bayesian program.

When it is a form <math>f_{\mu}\left(L_{k}\right)</math>, in general, <math>\mu</math> is a vector of parameters that may depend on <math>R_{k}</math> or <math>\delta</math> or both. Learning takes place when some of these parameters are computed using the data set <math>\delta</math>.

An important feature of Bayesian programming is this capacity to use questions to other Bayesian programs as components of the definition of a new Bayesian program. <math>P\left(L_{k}\mid R_{k}\wedge\delta\wedge\pi\right)</math> is obtained by some inferences done by another Bayesian program defined by the specifications <math>\pi_{1}</math> and the data <math>\delta_{1}</math>. This is similar to calling a subroutine in classical programming and provides an easy way to build hierarchical models.

Given a description (i.e., <math>P\left(X_{1}\wedge X_{2}\wedge\cdots\wedge X_{N}\mid\delta\wedge\pi\right)</math>), a question is obtained by partitioning <math>\left\{ X_{1},X_{2},\cdots,X_{N}\right\}</math> into three sets: the searched variables, the known variables and the free variables. The 3 variables <math>Searched</math>, <math>Known</math> and <math>Free</math> are defined as the conjunction of the variables belonging to these sets.

A question is defined as the set of distributions:

: <math>
P\left(Searched\mid Known\wedge\delta\wedge\pi\right)
</math>

made of as many "instantiated questions" as the cardinal of <math>Known</math>, each instantiated question being the distribution:

: <math>
P\left(Searched\mid known\wedge\delta\wedge\pi\right)
</math>

Given the joint distribution <math>P\left(X_{1}\wedge X_{2}\wedge\cdots\wedge X_{N}\mid\delta\wedge\pi\right)</math>, it is always possible to compute any possible question using the following general inference:

: <math>
\begin{align}
 & P\left(Searched\mid known\wedge\delta\wedge\pi\right)\\
={} & \sum_{Free}\left[P\left(Searched\wedge Free\mid known\wedge\delta\wedge\pi\right)\right]\\
={} & \frac{\sum_{Free}\left[P\left(Searched\wedge Free\wedge known\mid\delta\wedge\pi\right)\right]}{P\left(known\mid\delta\wedge\pi\right)}\\
={} & \frac{\sum_{Free}\left[P\left(Searched\wedge Free\wedge known\mid\delta\wedge\pi\right)\right]}{\sum_{Searched\wedge Free}\left[P\left(Searched\wedge Free\wedge known\mid\delta\wedge\pi\right)\right]}\\
={} & \frac{1}{Z}\times\sum_{Free}\left[P\left(Searched\wedge Free\wedge known\mid\delta\wedge\pi\right)\right]
\end{align}
</math>

where the first equality results from the marginalization rule, the second results from [[Bayes' theorem]] and the third corresponds to a second application of marginalization. The denominator appears to be a normalization term and can be replaced by a constant <math>Z</math>.

Theoretically, this allows any Bayesian inference problem to be solved. In practice, however, the cost of computing <math>P\left(Searched\mid known\wedge\delta\wedge\pi\right)</math> exhaustively and exactly is too great in almost all cases.

Replacing the joint distribution by its decomposition we get:

: <math>
P\left(Searched\mid known\wedge\delta\wedge\pi\right)=\frac{1}{Z}\times\sum_{Free}\left[P\left(L_{1}\mid\delta\wedge\pi\right)\times\prod_{k=2}^{K}\left[P\left(L_{k}\mid R_{k}\wedge\delta\wedge\pi\right)\right]\right]
</math>

which is usually a much simpler expression to compute, as the dimensionality of the problem is considerably reduced by the decomposition into a product of lower dimension distributions.
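The general inference rule can be made concrete with a short sketch. The following Python fragment is an illustration only (the three-variable decomposition and its probability tables are made-up assumptions, not part of the formalism); it computes <math>P\left(Searched\mid known\right)</math> by marginalizing the free variable and normalizing:

<syntaxhighlight lang="python">
# Toy decomposition P(A and B and C) = P(A) x P(B | A) x P(C | B) over
# three binary variables; the probability tables are made-up numbers.
P_A = {True: 0.3, False: 0.7}
P_B_given_A = {(True, True): 0.9, (False, True): 0.1,    # key: (b, a)
               (True, False): 0.2, (False, False): 0.8}
P_C_given_B = {(True, True): 0.5, (False, True): 0.5,    # key: (c, b)
               (True, False): 0.05, (False, False): 0.95}

def joint(a, b, c):
    """Joint probability obtained from the decomposition."""
    return P_A[a] * P_B_given_A[(b, a)] * P_C_given_B[(c, b)]

def infer(known_c):
    """P(A | C = known_c): A is Searched, C is Known, B is Free."""
    unnormalized = {a: sum(joint(a, b, known_c) for b in (True, False))
                    for a in (True, False)}          # marginalize the free variable B
    z = sum(unnormalized.values())                   # normalization constant Z
    return {a: p / z for a, p in unnormalized.items()}

print(infer(known_c=True))
</syntaxhighlight>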
=== Bayesian spam detection ===

The purpose of Bayesian spam filtering is to eliminate junk e-mails.

The problem is very easy to formulate. E-mails should be classified into one of two categories: non-spam or spam. The only available information to classify the e-mails is their content: a set of words. Using these words without taking the order into account is commonly called a [[Bag-of-words model|bag of words model]].

The classifier should furthermore be able to adapt to its user and to learn from experience. Starting from an initial standard setting, the classifier should modify its internal parameters when the user disagrees with its own decision. It will hence adapt to the user's criteria to differentiate between non-spam and spam. It will improve its results as it encounters increasingly classified e-mails.
==== Variables ====

The variables necessary to write this program are as follows:
# <math>Spam</math>: a binary variable, false if the e-mail is not spam and true otherwise.
# <math>W_0,W_1, \ldots, W_{N-1}</math>: <math>N</math> [[binary data|binary variables]]. <math>W_n</math> is true if the <math>n^{th}</math> word of the dictionary is present in the text.
These <math>N + 1</math> binary variables sum up all the information about an e-mail.

==== Decomposition ====

Starting from the joint distribution and applying recursively [[Bayes' theorem]] we obtain:

: <math>
\begin{align}
 & P\left(Spam\wedge W_{0}\wedge\cdots\wedge W_{N-1}\right)\\
={} & P\left(Spam\right)\times P\left(W_{0}\mid Spam\right)\times P\left(W_{1}\mid Spam\wedge W_{0}\right)\times\cdots\times P\left(W_{N-1}\mid Spam\wedge W_{0}\wedge\cdots\wedge W_{N-2}\right)
\end{align}
</math>

This is an exact mathematical expression.

It can be drastically simplified by assuming that the probability of appearance of a word knowing the nature of the text (spam or not) is independent of the appearance of the other words. This is the naive Bayes assumption and this makes this spam filter a naive Bayes model.

For instance, the programmer can assume that:

: <math>
P\left(W_{n}\mid Spam\wedge W_{0}\wedge\cdots\wedge W_{n-1}\right)=P\left(W_{n}\mid Spam\right)
</math>

to finally obtain:

: <math>
P\left(Spam\wedge W_{0}\wedge\cdots\wedge W_{N-1}\right)=P\left(Spam\right)\times\prod_{n=0}^{N-1}\left[P\left(W_{n}\mid Spam\right)\right]
</math>
This kind of assumption is known as the [[Naive Bayes classifier|naive Bayes' assumption]]. It is "naive" in the sense that the independence between words is clearly not completely true. For instance, it completely neglects that the appearance of pairs of words may be more significant than isolated appearances. However, the programmer may assume this hypothesis and may develop the model and the associated inferences to test how reliable and efficient it is.
==== Parametric forms ====

To be able to compute the joint distribution, the programmer must now specify the <math>N+1</math> distributions appearing in the decomposition. The <math>N</math> forms <math>P\left(W_{n}\mid Spam\right)</math> may, for instance, be specified using [[Rule of succession|Laplace's rule of succession]]:

: <math>
\begin{cases}
P\left(\left[W_{n}=\text{true}\right]\mid\left[Spam=\text{false}\right]\right)=\frac{1+a_{f}^{n}}{2+b_{f}}\\
P\left(\left[W_{n}=\text{true}\right]\mid\left[Spam=\text{true}\right]\right)=\frac{1+a_{t}^{n}}{2+b_{t}}
\end{cases}
</math>

where <math>a_{f}^{n}</math> stands for the number of appearances of the <math>n^{th}</math> word in non-spam e-mails and <math>b_{f}</math> stands for the total number of non-spam e-mails. Similarly, <math>a_{t}^{n}</math> stands for the number of appearances of the <math>n^{th}</math> word in spam e-mails and <math>b_{t}</math> stands for the total number of spam e-mails.

The <math>N</math> forms <math>P\left(W_{n}\mid Spam\right)</math> are not yet completely specified because the parameters <math>a_{f}^{n}</math>, <math>a_{t}^{n}</math>, <math>b_{f}</math> and <math>b_{t}</math> have no values yet.

==== Identification ====
The identification of these parameters could be done either by batch processing a series of classified e-mails or by an incremental updating of the parameters using the user's classifications of the e-mails as they arrive.
Both methods could be combined: the system could start with initial standard values of these parameters issued from a generic database, then some [[incremental learning]] customizes the classifier to each individual user.
==== Question ====

The question asked of the program is: "what is the probability for a given text to be spam knowing which words appear and don't appear in this text?" It can be formalized by:

: <math>
P\left(Spam\mid w_{0}\wedge\cdots\wedge w_{N-1}\right)
</math>

which can be computed as follows:

: <math>
P\left(Spam\mid w_{0}\wedge\cdots\wedge w_{N-1}\right)=\frac{P\left(Spam\right)\times\prod_{n=0}^{N-1}\left[P\left(w_{n}\mid Spam\right)\right]}{\sum_{Spam}\left[P\left(Spam\right)\times\prod_{n=0}^{N-1}\left[P\left(w_{n}\mid Spam\right)\right]\right]}
</math>

The denominator appears to be a normalization constant. It is not necessary to compute it to decide if we are dealing with spam. For instance, an easy trick is to compute the ratio:

: <math>
\frac{P\left(\left[Spam=\text{true}\right]\mid w_{0}\wedge\cdots\wedge w_{N-1}\right)}{P\left(\left[Spam=\text{false}\right]\mid w_{0}\wedge\cdots\wedge w_{N-1}\right)}=\frac{P\left(\left[Spam=\text{true}\right]\right)}{P\left(\left[Spam=\text{false}\right]\right)}\times\prod_{n=0}^{N-1}\left[\frac{P\left(w_{n}\mid\left[Spam=\text{true}\right]\right)}{P\left(w_{n}\mid\left[Spam=\text{false}\right]\right)}\right]
</math>

This computation is faster and easier because it requires only <math>2N</math> products.

The Bayesian spam filter program is completely defined by:

: <math>
\text{Program}
\begin{cases}
\text{Description}
\begin{cases}
\text{Specification}(\pi)
\begin{cases}
\text{Variables: } Spam,W_{0},\cdots,W_{N-1}\\
\text{Decomposition: } P\left(Spam\wedge W_{0}\wedge\cdots\wedge W_{N-1}\right)=P\left(Spam\right)\times\prod_{n=0}^{N-1}\left[P\left(W_{n}\mid Spam\right)\right]\\
\text{Forms: Laplace succession laws}
\end{cases}\\
\text{Identification (based on }\delta)
\end{cases}\\
\text{Question: } P\left(Spam\mid w_{0}\wedge\cdots\wedge w_{N-1}\right)
\end{cases}
</math>

=== Bayesian filters ===

Bayesian filters (often called [[recursive Bayesian estimation]]) are generic probabilistic models for time evolving processes. Numerous models are particular instances of this generic approach, for instance: the [[Kalman filter]] or the [[hidden Markov model]] (HMM).

The decomposition is based:

* on a series of states <math>S^{0},\cdots,S^{T}</math> and a series of observations <math>O^{0},\cdots,O^{T}</math>:

: <math>
P\left(S^{0}\wedge\cdots\wedge S^{T}\wedge O^{0}\wedge\cdots\wedge O^{T}\right)=P\left(S^{0}\right)\times P\left(O^{0}\mid S^{0}\right)\times\prod_{t=1}^{T}\left[P\left(S^{t}\mid S^{t-1}\right)\times P\left(O^{t}\mid S^{t}\right)\right]
</math>

* on a transition model <math>P\left(S^{t}\mid S^{t-1}\right)</math>, formalizing the transition from the state at time <math>t-1</math> to the state at time <math>t</math>;
* on an observation model <math>P\left(O^{t}\mid S^{t}\right)</math>, expressing what can be observed at time <math>t</math> when the system is in state <math>S^{t}</math>.

The parametrical forms are not constrained and different choices lead to different well-known models: see Kalman filters and hidden Markov models just below.

The typical question for such models is <math>P\left(S^{t+k}\mid O^{0}\wedge\cdots\wedge O^{t}\right)</math>: what is the probability distribution for the state at time <math>t+k</math> knowing the observations from instant <math>0</math> to <math>t</math>?
The most common case is Bayesian filtering where <math>k=0</math>, which searches for the present state, knowing past observations.
However, it is also possible <math>(k>0)</math> to extrapolate a future state from past observations, or to do smoothing <math>(k<0)</math> to recover a past state from observations made either before or after that instant.
More complicated questions may also be asked as shown below in the HMM section.
Bayesian filters <math>(k=0)</math> have a very interesting recursive property, which contributes greatly to their attractiveness. <math>P\left(S^{t}|O^{0}\wedge\cdots\wedge O^{t}\right)</math> may be computed simply from <math>P\left(S^{t-1}\mid O^0 \wedge \cdots \wedge O^{t-1}\right)</math> with the following formula:

: <math>
P\left(S^{t}\mid O^{0}\wedge\cdots\wedge O^{t}\right)=\frac{1}{Z}\times P\left(O^{t}\mid S^{t}\right)\times\sum_{S^{t-1}}\left[P\left(S^{t}\mid S^{t-1}\right)\times P\left(S^{t-1}\mid O^{0}\wedge\cdots\wedge O^{t-1}\right)\right]
</math>

Another interesting point of view for this equation is to consider that there are two phases: a prediction phase and an estimation phase.

During the prediction phase, the state is predicted using the dynamic model and the estimation of the state at the previous moment:

: <math>
P\left(S^{t}\mid O^{0}\wedge\cdots\wedge O^{t-1}\right)=\sum_{S^{t-1}}\left[P\left(S^{t}\mid S^{t-1}\right)\times P\left(S^{t-1}\mid O^{0}\wedge\cdots\wedge O^{t-1}\right)\right]
</math>

During the estimation phase, the prediction is either confirmed or invalidated using the last observation:

: <math>
P\left(S^{t}\mid O^{0}\wedge\cdots\wedge O^{t}\right)=\frac{1}{Z}\times P\left(O^{t}\mid S^{t}\right)\times P\left(S^{t}\mid O^{0}\wedge\cdots\wedge O^{t-1}\right)
</math>
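The recursion can be sketched for a discrete state space. The following Python fragment is an illustration only; the two-state transition and observation tables are assumptions made for the example:

<syntaxhighlight lang="python">
# A minimal discrete Bayesian filter for an assumed two-state system.
states = (0, 1)
P_trans = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}                 # P(S^t | S^t-1)
P_obs = {0: {"low": 0.7, "high": 0.3}, 1: {"low": 0.1, "high": 0.9}}  # P(O^t | S^t)

def filter_step(belief, observation):
    """One prediction + estimation step of P(S^t | O^0 ... O^t)."""
    # Prediction phase: propagate the previous belief through the dynamics.
    predicted = {s: sum(P_trans[prev][s] * belief[prev] for prev in states)
                 for s in states}
    # Estimation phase: weight the prediction by the likelihood of O^t.
    unnormalized = {s: P_obs[s][observation] * predicted[s] for s in states}
    z = sum(unnormalized.values())
    return {s: p / z for s, p in unnormalized.items()}

belief = {0: 0.5, 1: 0.5}
for obs in ["high", "high", "low"]:
    belief = filter_step(belief, obs)
print(belief)
</syntaxhighlight>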
==== Kalman filter ====
The very well-known [[Kalman filter]]s<ref>{{cite journal|last=Kalman|first=R. E.|s2cid=1242324|title=A New Approach to Linear Filtering and Prediction Problems|journal=Journal of Basic Engineering|year=1960|volume=82|pages=33–45|doi=10.1115/1.3662552}}</ref> are a special case of Bayesian filters.

They are defined by a Bayesian program in which the states and the observations are continuous, and the transition and observation models are specified using [[Normal distribution|Gaussian laws]] whose means are linear functions of the conditioning variables:

: <math>
\begin{cases}
P\left(S^{t}\mid S^{t-1}\right)\equiv G\left(S^{t},A\bullet S^{t-1},Q\right)\\
P\left(O^{t}\mid S^{t}\right)\equiv G\left(O^{t},H\bullet S^{t},R\right)
\end{cases}
</math>

where <math>A</math>, <math>Q</math>, <math>H</math> and <math>R</math> are the parameters of the models.

With these hypotheses and by using the recursive formula, it is possible to solve the inference problem analytically to answer the usual <math>P\left(S^{t}\mid O^{0}\wedge\cdots\wedge O^{t}\right)</math> question. This leads to an extremely efficient algorithm, which explains the popularity of Kalman filters and the number of their everyday applications.

When there are no obvious linear transition and observation models, it is still often possible, using a first-order [[Taylor series|Taylor's expansion]], to treat these models as locally linear. This generalization is commonly called the [[extended Kalman filter]].
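A scalar sketch of the resulting algorithm makes the prediction and estimation phases explicit. This is an illustration only: the state is one-dimensional and the noise parameters <code>q</code> and <code>r</code> are assumed values, not prescribed by the formalism.

<syntaxhighlight lang="python">
# A one-dimensional Kalman filter sketch (scalar state, linear-Gaussian models).
def kalman_step(mean, var, obs, a=1.0, q=0.01, h=1.0, r=0.1):
    """One step for S^t = a*S^t-1 + noise(q) and O^t = h*S^t + noise(r)."""
    # Prediction phase.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Estimation phase: the Kalman gain weighs prediction against observation.
    gain = pred_var * h / (h * h * pred_var + r)
    new_mean = pred_mean + gain * (obs - h * pred_mean)
    new_var = (1.0 - gain * h) * pred_var
    return new_mean, new_var

mean, var = 0.0, 1.0
for obs in [0.9, 1.1, 1.0]:
    mean, var = kalman_step(mean, var, obs)
print(mean, var)
</syntaxhighlight>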
==== Hidden Markov models ====

[[Hidden Markov model]]s (HMMs) are another very popular specialization of Bayesian filters.

They are defined by a Bayesian program in which the states and the observations are discrete, and the transition model <math>P\left(S^{t}\mid S^{t-1}\right)</math> and the observation model <math>P\left(O^{t}\mid S^{t}\right)</math> are both specified using probability matrices.

The question asked of such models is: what is the most probable series of states that leads to the present state, knowing the past observations? This particular question may be answered with a specific and very efficient algorithm called the [[Viterbi algorithm]].

The [[Baum–Welch algorithm]] has been developed for HMMs.
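The Viterbi question can be sketched as follows. This is an illustration only: the two-state weather/umbrella tables are assumptions of the kind commonly used in textbooks, not part of any model described above.

<syntaxhighlight lang="python">
# A minimal Viterbi sketch: the most probable state series given observations.
def viterbi(observations, states, p_init, p_trans, p_obs):
    """Return the most probable sequence of hidden states."""
    # best[s] = (probability, path) of the best series ending in state s.
    best = {s: (p_init[s] * p_obs[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        best = {
            s: max(
                ((prob * p_trans[prev][s] * p_obs[s][obs], path + [s])
                 for prev, (prob, path) in best.items()),
                key=lambda t: t[0],
            )
            for s in states
        }
    return max(best.values(), key=lambda t: t[0])[1]

states = ("rain", "dry")
p_init = {"rain": 0.5, "dry": 0.5}
p_trans = {"rain": {"rain": 0.7, "dry": 0.3}, "dry": {"rain": 0.3, "dry": 0.7}}
p_obs = {"rain": {"umbrella": 0.9, "none": 0.1},
         "dry": {"umbrella": 0.2, "none": 0.8}}
print(viterbi(["umbrella", "umbrella", "none"], states, p_init, p_trans, p_obs))
</syntaxhighlight>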
=== Applications ===

Since 2000, Bayesian programming has been used to develop both robotics applications and life sciences models.

==== Robotics ====
In robotics, Bayesian programming was applied to [[autonomous robotics]],<ref>{{cite journal|last=Lebeltel|first=O. |author2=Bessière, P. |author3=Diard, J. |author4=Mazer, E.|title=Bayesian Robot Programming|journal=Advanced Robotics|year=2004|volume=16|issue=1|pages=49–79|doi=10.1023/b:auro.0000008671.38949.43|s2cid=18768468 |url=http://cogprints.org/1670/5/Lebeltel2000.pdf }}</ref><ref>{{cite journal|last=Diard|first=J. |author2=Gilet, E. |author3=Simonin, E. |author4=Bessière, P.|title=Incremental learning of Bayesian sensorimotor models: from low-level behaviours to large-scale structure of the environment|journal=Connection Science|year=2010|volume=22|issue=4|pages=291–312|doi=10.1080/09540091003682561|bibcode=2010ConSc..22..291D |s2cid=216035458 |url=https://hal.archives-ouvertes.fr/hal-00537809/file/diard10_author.pdf }}</ref><ref>{{cite journal|last=Pradalier|first=C. |author2=Hermosillo, J. |author3=Koike, C. |author4=Braillon, C. |author5=Bessière, P. |author6=Laugier, C.|title=The CyCab: a car-like robot navigating autonomously and safely among pedestrians|journal=Robotics and Autonomous Systems|year=2005|volume=50|issue=1|pages=51–68|doi=10.1016/j.robot.2004.10.002|citeseerx=10.1.1.219.69 }}</ref><ref>{{cite journal|last=Ferreira|first=J. |author2=Lobo, J. |author3=Bessière, P. |author4=Castelo-Branco, M. |author5=Dias, J.|title=A Bayesian Framework for Active Artificial Perception|journal=IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics |year=2012|volume=99|issue=2 |pages=1–13|doi=10.1109/TSMCB.2012.2214477 |pmid=23014760 |s2cid=1808051 |url=https://hal.archives-ouvertes.fr/hal-00747148/file/A_Bayesian_Framework_for_Active_Artificial_Perception.pdf }}</ref><ref name=Ferreira2014>{{cite book|last=Ferreira|first=J. F.|title=Probabilistic Approaches to Robotic Perception|year=2014|publisher=Springer|author2=Dias, J. M.|isbn= 978-3-319-02005-1}}</ref> robotic [[Computer-aided design|CAD]] systems,<ref>{{cite journal|last=Mekhnacha|first=K. |author2=Mazer, E. |author3=Bessière, P.|title=The design and implementation of a Bayesian CAD modeler for robotic applications|journal=Advanced Robotics|year=2001|volume=15|issue=1|pages=45–69|doi=10.1163/156855301750095578|citeseerx=10.1.1.552.3126 |s2cid=7920387 }}</ref> [[advanced driver-assistance systems]],<ref>{{cite journal|last=Coué|first=C. |author2=Pradalier, C. |author3=Laugier, C. |author4=Fraichard, T. |author5=Bessière, P.|title=Bayesian Occupancy Filtering for Multitarget Tracking: an Automotive Application|journal=International Journal of Robotics Research|year=2006|volume=25|issue=1|pages=19–30|doi=10.1177/0278364906061158|s2cid=13874685 |url=https://hal.inria.fr/inria-00182004/file/coue-etal-ijrr-06.pdf }}</ref> [[robotic arm]] control, [[mobile robot]]ics,<ref>{{cite journal|last=Vasudevan|first=S.|author2=Siegwart, R.|title=Bayesian space conceptualization and place classification for semantic maps in mobile robotics|journal=Robotics and Autonomous Systems|year=2008|volume=56|pages=522–537|doi=10.1016/j.robot.2008.03.005|issue=6|citeseerx=10.1.1.149.4189}}</ref><ref>{{cite journal|last=Perrin|first=X. |author2=Chavarriaga, R. |author3=Colas, F. |author4=Seigwart, R. |author5=Millan, J.|title=Brain-coupled interaction for semi-autonomous navigation of an assistive robot|journal=Robotics and Autonomous Systems|year=2010|volume=58|pages=1246–1255|doi=10.1016/j.robot.2010.05.010|issue=12|url=http://infoscience.epfl.ch/record/149091 }}</ref> human-robot interaction,<ref>{{cite journal|last=Rett|first=J. |author2=Dias, J. |author3=Ahuactzin, J-M. |title=Bayesian reasoning for Laban Movement Analysis used in human-machine interaction|journal=International Journal of Reasoning-Based Intelligent Systems|year=2010|volume=2|issue=1|pages=13–35|doi=10.1504/IJRIS.2010.029812|citeseerx=10.1.1.379.6216 }}</ref> human-vehicle interaction (Bayesian autonomous driver models)<ref>
{{cite conference
| last1 = Möbus | first1 = C.
| last2 = Eilers | first2 = M.
| editor-last = Duffy
| editor-first = Vincent G.
| title = Probabilistic and Empirical Grounded Modeling of Agents in (Partial) Cooperative Traffic Scenarios
| book-title = Digital Human Modeling
| conference = Second International Conference, ICDHM 2009, San Diego, CA, USA
| volume = 5620
| year = 2009
| pages = 423–432
| publisher = Springer
| isbn = 978-3-642-02808-3
| doi = 10.1007/978-3-642-02809-0_45
| doi-access = free
| contribution-url = https://link.springer.com/chapter/10.1007%2F978-3-642-02809-0_45
| url = http://oops.uni-oldenburg.de/1844/1/PartialCooperative20090223_PCM.pdf
}}
</ref><ref>
{{cite conference
| last1 = Möbus | first1 = C.
| last2 = Eilers | first2 = M.
| editor-last = Duffy
| editor-first = Vincent G.
| title = Further Steps Towards Driver Modeling according to the Bayesian Programming Approach
| book-title = Digital Human Modeling
| conference = Second International Conference, ICDHM 2009, San Diego, CA, USA
| volume = 5620
| year = 2009
| pages = 413–422
| publisher = Springer
| isbn = 978-3-642-02808-3
| contribution-url = https://link.springer.com/chapter/10.1007%2F978-3-642-02809-0_44
}}
</ref><ref>
{{cite conference
| last = Eilers
| first = M.
| author2 = Möbus, C.
| title = Lernen eines modularen Bayesian Autonomous Driver Mixture-of-Behaviors (BAD MoB) Modells
| book-title = Fahrermodellierung - Zwischen kinematischen Menschmodellen und dynamisch-kognitiven Verhaltensmodellen
| editor1-last = Kolrep
| editor1-first = H.
| editor2-last = Jürgensohn
| editor2-first = Th.
| pages = 61–74
| publisher = VDI-Verlag
| series = Fortschrittsbericht des VDI in der Reihe 22 (Mensch-Maschine-Systeme)
| url = http://www.lks.uni-oldenburg.de/download/Publikationen/2010/Eilers&PCM2010_BFFM_BAD_MoB_Modells2010.pdf
| isbn = 978-3-18-303222-8
}}
</ref><ref>
{{cite conference
| last = Eilers
| first = M.
| author2 = Möbus, C.
| title = Learning the Relevant Percepts of Modular Hierarchical Bayesian Driver Models Using a Bayesian Information Criterion
| book-title = Digital Human Modeling
| editor-last = Duffy
| pages = 463–472
| isbn = 978-3-642-21798-2
| doi = 10.1007/978-3-642-21799-9_52
| doi-access = free
}}
</ref><ref>
{{cite conference
| last = Eilers
| first = M.
| author2 = Möbus, C.
| title = Learning of a Bayesian Autonomous Driver Mixture-of-Behaviors (BAD-MoB) Model
| book-title = Advances in Applied Digital Human Modeling
| editor-last = Duffy
| pages = 436–445
| contribution-url = http://www.lks.uni-oldenburg.de/46350.html
| isbn = 978-1-4398-3511-1
}}
</ref> [[video game]] avatar programming and training<ref>{{cite journal|last=Le Hy|first=R. |author2=Arrigoni, A. |author3=Bessière, P. |author4=Lebetel, O.|title=Teaching Bayesian Behaviours to Video Game Characters|journal=Robotics and Autonomous Systems|year=2004|volume=47|pages=177–185|doi=10.1016/j.robot.2004.03.012|issue=2–3|s2cid=16415524 |url=http://cogprints.org/3744/1/lehy04.pdf }}</ref> and real-time strategy games (AI).<ref>{{cite book|title=Bayesian Programming and Learning for Multiplayer Video Games|last=Synnaeve|first=G.|year=2012|url=http://tel.archives-ouvertes.fr/docs/00/78/06/35/PDF/29588_SYNNAEVE_2012_archivage.pdf}}</ref>
==== Life sciences ====
In life sciences, Bayesian programming was used in vision to reconstruct shape from motion,<ref>{{cite journal|last=Colas|first=F. |author2=Droulez, J. |author3=Wexler, M. |author4=Bessière, P.|title=A unified probabilistic model of the perception of three-dimensional structure from optic flow|journal=Biological Cybernetics|volume=97 |issue=5–6 |year=2008|pages=461–77|doi=10.1007/s00422-007-0183-z|pmid=17987312 |citeseerx=10.1.1.215.1491 |s2cid=215821150 }}</ref> to model visuo-vestibular interaction<ref>{{cite journal|last=Laurens|first=J.|author2=Droulez, J.|title=Bayesian processing of vestibular information|journal=Biological Cybernetics|year=2007|volume=96|pages=389–404|doi=10.1007/s00422-006-0133-1|pmid=17146661|issue=4|s2cid=18138027}}</ref> and to study [[Saccade|saccadic]] eye movements;<ref>{{cite journal|last=Colas|first=F. |author2=Flacher, F. |author3=Tanner, T. |author4=Bessière, P. |author5=Girard, B.|title=Bayesian models of eye movement selection with retinotopic maps|journal=Biological Cybernetics|year=2009|volume=100|issue=3|pages=203–214|doi=10.1007/s00422-009-0292-y|pmid=19212780 |s2cid=5906668 |url=https://hal.archives-ouvertes.fr/hal-00384515/file/main.pdf|doi-access=free}}</ref> in speech perception and control to study early [[speech acquisition]]<ref>{{cite journal|last=Serkhane|first=J. |author2=Schwartz, J-L. |author3=Bessière, P. |title=Building a talking baby robot A contribution to the study of speech acquisition and evolution|journal=Interaction Studies|year=2005|volume=6|issue=2|pages=253–286|doi=10.1075/is.6.2.06ser|url=https://hal.archives-ouvertes.fr/hal-00186575/file/Serkhane_Interaction_Studies_2005.pdf }}</ref> and the emergence of articulatory-acoustic systems;<ref>{{cite journal|last=Moulin-Frier|first=C. |author2=Laurent, R. |author3=Bessière, P. |author4=Schwartz, J-L. |author5=Diard, J. |title=Adverse conditions improve distinguishability of auditory, motor and percep-tuo-motor theories of speech perception: an exploratory Bayesian modeling study|journal=Language and Cognitive Processes|year=2012|volume=27|issue=7–8|pages=1240–1263|doi=10.1080/01690965.2011.645313|s2cid=55504109 |url=https://hal.archives-ouvertes.fr/hal-01059179/file/moulin-frier12.pdf }}</ref> and to model handwriting perception and control.<ref>{{cite journal|last=Gilet|first=E. |author2=Diard, J. |author3=Bessière, P.|title=Bayesian Action–Perception Computational Model: Interaction of Production and Recognition of Cursive Letters|journal=PLOS ONE|year=2011|volume=6|issue=6|page=e20387|bibcode=2011PLoSO...620387G|doi=10.1371/journal.pone.0020387|editor1-last=Sporns|editor1-first=Olaf|pmid=21674043|pmc=3106017|doi-access=free }}</ref>
=== Pattern recognition ===
Bayesian program learning has potential applications in [[Speech recognition|voice recognition]] and synthesis, [[image recognition]] and natural language processing. It employs the principles of ''compositionality'' (building abstract representations from parts), ''causality'' (building complexity from parts) and ''learning to learn'' (using previously recognized concepts to ease the creation of new concepts).<ref>{{Cite web|title = New algorithm helps machines learn as quickly as humans|url = http://www.gizmag.com/artificial-intelligence-algorithm-learning/41448|website = www.gizmag.com|access-date = 2016-01-23|date = 2016-01-22}}</ref>
== Possibility theories ==
The comparison between probabilistic approaches (not only Bayesian programming) and possibility theories continues to be debated.
Possibility theories like, for instance, [[fuzzy set]]s,<ref>{{cite q| Q25938993 |last1=Zadeh |first1=L.A. | author-link1 = Lotfi A. Zadeh | journal = [[Information and Computation|Information and Control]] | doi-access = free }}</ref> [[fuzzy logic]]<ref>{{cite q| Q57275767 |last1=Zadeh |first1=L.A. | author-link1 = Lotfi A. Zadeh | publisher = [[Springer Science+Business Media|Springer]] }}</ref> and [[possibility theory]]<ref>{{cite journal|last=Dubois|first=D.|author2=Prade, H.|title=Possibility Theory, Probability Theory and Multiple-Valued Logics: A Clarification|journal=Ann. Math. Artif. Intell.|year=2001|volume=32|issue=1–4|pages=35–66|doi=10.1023/A:1016740830286|s2cid=10271476|url=ftp://ftp.irit.fr/IRIT/ADRIA/AMAI-Dub.Pra.revised.pdf}}</ref> are alternatives to probability to model uncertainty. They argue that probability is insufficient or inconvenient to model certain aspects of incomplete/uncertain knowledge.
The defense of probability is mainly based on [[Cox's theorem]], which starts from four postulates concerning rational reasoning in the presence of uncertainty. It demonstrates that the only mathematical framework that satisfies these postulates is probability theory. The argument is that any approach other than probability necessarily infringes one of these postulates, the question then being what is gained by that infringement.
== Probabilistic programming ==

The purpose of [[Probabilistic relational programming language|probabilistic programming]] is to unify the scope of classical programming languages with probabilistic modeling (especially [[bayesian network]]s) to deal with uncertainty while profiting from the programming languages' expressiveness to encode complexity.
Extended classical programming languages include logical languages as proposed in [[Abductive logic programming|Probabilistic Horn Abduction]],<ref>{{cite journal|last=Poole|first=D.|title=Probabilistic Horn abduction and Bayesian networks|journal=Artificial Intelligence|year=1993|volume=64|pages=81–129|doi=10.1016/0004-3702(93)90061-F}}</ref> Independent Choice Logic,<ref>{{cite journal|last=Poole|first=D.|title=The Independent Choice Logic for modelling multiple agents under uncertainty|journal=Artificial Intelligence|year=1997|volume=94|issue=1–2|pages=7–56|doi=10.1016/S0004-3702(97)00027-1|doi-access=free}}</ref> PRISM,<ref>{{cite journal|last=Sato|first=T.|author2=Kameya, Y.|title=Parameter learning of logic programs for symbolic-statistical modeling|journal=Journal of Artificial Intelligence Research|year=2001|volume=15|issue=2001|pages=391–454|url=http://www.jair.org/media/912/live-912-2013-jair.pdf|bibcode=2011arXiv1106.1797S|arxiv=1106.1797|doi=10.1613/jair.912|s2cid=7857569|access-date=2015-10-18|archive-url=https://web.archive.org/web/20140712033447/http://www.jair.org/media/912/live-912-2013-jair.pdf|archive-date=2014-07-12|url-status=dead}}</ref> and ProbLog which proposes an extension of Prolog.
It can also be extensions of [[Functional programming|functional programming languages]] (essentially [[Lisp (programming language)|Lisp]] and [[Scheme (programming language)|Scheme]]) such as IBAL or CHURCH. The underlying programming languages can be object-oriented as in BLOG and FACTORIE or more standard ones as in CES and FIGARO.<ref>{{github|p2t2/figaro}}</ref>
The purpose of Bayesian programming is different. Jaynes' precept of "probability as logic" argues that probability is an extension of and an alternative to logic above which a complete theory of rationality, computation and programming can be rebuilt.<ref name="Jaynes2003"/> Bayesian programming attempts to replace classical languages with a programming approach based on probability that considers incompleteness and uncertainty.

The precise comparison between the semantics and power of expression of Bayesian and probabilistic programming is an open question.
== See also ==
{{Portal|Mathematics}}
{{columns-list|colwidth=20em|
* [[Bayes' rule]]
* [[Bayesian inference]]
}}

== References ==
{{Reflist}}
== External links ==
* [https://archive.today/20131123162733/http://www.probayes.com/Bayesian-Programming-Book A companion site to the ''Bayesian Programming'' book, where ProBT, an inference engine dedicated to Bayesian programming, can be downloaded.]
* The [http://Bayesian-programming.org Bayesian-programming.org site] {{Webarchive|url=https://archive.today/20131123162815/http://bayesian-programming.org/ |date=2013-11-23 }} for the promotion of Bayesian programming with detailed information and numerous publications.
[[Category:Bayesian statistics]] |
[[Category:Bayesian statistics]] |