Matrix normal distribution
F r o m W i k i p e d i a , t h e f r e e e n c y c l o p e d i a
In statistics, the matrix normal distribution or matrix Gaussian distribution is a probability distribution that generalizes the multivariate normal distribution to matrix-valued random variables.
Definition
The probability density function for the random matrix X (n × p) that follows the matrix normal distribution
{\displaystyle {\mathcal {MN}}_{n,p}(\mathbf {M} ,\mathbf {U} ,\mathbf {V} )}
has the form:
{\displaystyle p(\mathbf {X} \mid \mathbf {M} ,\mathbf {U} ,\mathbf {V} )={\frac {\exp \left(-{\frac {1}{2}}\,\mathrm {tr} \left[\mathbf {V} ^{-1}(\mathbf {X} -\mathbf {M} )^{T}\mathbf {U} ^{-1}(\mathbf {X} -\mathbf {M} )\right]\right)}{(2\pi )^{np/2}|\mathbf {V} |^{n/2}|\mathbf {U} |^{p/2}}}}
where {\displaystyle \mathrm {tr} } denotes the trace, M is n × p, U is n × n, and V is p × p, and the density is understood as the probability density function with respect to the standard Lebesgue measure in {\displaystyle \mathbb {R} ^{n\times p}}, i.e. the measure corresponding to integration with respect to {\displaystyle dx_{11}dx_{21}\dots dx_{n1}dx_{12}\dots dx_{n2}\dots dx_{np}}.
The matrix normal is related to the multivariate normal distribution in the following way:
{\displaystyle \mathbf {X} \sim {\mathcal {MN}}_{n\times p}(\mathbf {M} ,\mathbf {U} ,\mathbf {V} ),}
if and only if
{\displaystyle \mathrm {vec} (\mathbf {X} )\sim {\mathcal {N}}_{np}(\mathrm {vec} (\mathbf {M} ),\mathbf {V} \otimes \mathbf {U} )}
where {\displaystyle \otimes } denotes the Kronecker product and {\displaystyle \mathrm {vec} (\mathbf {M} )} denotes the vectorization of {\displaystyle \mathbf {M} }.
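This vec/Kronecker correspondence can be checked numerically by evaluating both densities at the same point. A minimal sketch in Python, assuming NumPy and SciPy are available (all matrices here are arbitrary test values, not from the source):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n, p = 3, 2

# Arbitrary mean and positive-definite row/column covariances.
M = rng.standard_normal((n, p))
A = rng.standard_normal((n, n))
U = A @ A.T + n * np.eye(n)
B = rng.standard_normal((p, p))
V = B @ B.T + p * np.eye(p)

X = rng.standard_normal((n, p))

# Matrix normal density computed directly from the definition.
E = X - M
quad = -0.5 * np.trace(np.linalg.inv(V) @ E.T @ np.linalg.inv(U) @ E)
norm = (2 * np.pi) ** (n * p / 2) \
    * np.linalg.det(V) ** (n / 2) * np.linalg.det(U) ** (p / 2)
pdf_matrix = np.exp(quad) / norm

# Same density via vec(X) ~ N(vec(M), V ⊗ U).  NumPy ravels row-major by
# default, while vec stacks columns, so vectorize with order='F'.
vecX = X.ravel(order="F")
vecM = M.ravel(order="F")
pdf_vec = multivariate_normal(mean=vecM, cov=np.kron(V, U)).pdf(vecX)

print(np.isclose(pdf_matrix, pdf_vec))  # → True
```

Note the argument order in `np.kron(V, U)`: with column-stacking vectorization, the covariance of vec(X) is V ⊗ U, not U ⊗ V.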
The equivalence between the above matrix normal and multivariate normal density functions can be shown using several properties of the trace and Kronecker product, as follows. We start with the argument of the exponent of the matrix normal PDF:
{\displaystyle {\begin{aligned}&\;\;\;\;-{\frac {1}{2}}{\text{tr}}\left[\mathbf {V} ^{-1}(\mathbf {X} -\mathbf {M} )^{T}\mathbf {U} ^{-1}(\mathbf {X} -\mathbf {M} )\right]\\&=-{\frac {1}{2}}{\text{vec}}\left(\mathbf {X} -\mathbf {M} \right)^{T}{\text{vec}}\left(\mathbf {U} ^{-1}(\mathbf {X} -\mathbf {M} )\mathbf {V} ^{-1}\right)\\&=-{\frac {1}{2}}{\text{vec}}\left(\mathbf {X} -\mathbf {M} \right)^{T}\left(\mathbf {V} ^{-1}\otimes \mathbf {U} ^{-1}\right){\text{vec}}\left(\mathbf {X} -\mathbf {M} \right)\\&=-{\frac {1}{2}}\left[{\text{vec}}(\mathbf {X} )-{\text{vec}}(\mathbf {M} )\right]^{T}\left(\mathbf {V} \otimes \mathbf {U} \right)^{-1}\left[{\text{vec}}(\mathbf {X} )-{\text{vec}}(\mathbf {M} )\right]\end{aligned}}}
which is the argument of the exponent of the multivariate normal PDF with respect to Lebesgue measure in {\displaystyle \mathbb {R} ^{np}}. The proof is completed by using the determinant property:
{\displaystyle |\mathbf {V} \otimes \mathbf {U} |=|\mathbf {V} |^{n}|\mathbf {U} |^{p}.}
Properties
If {\displaystyle \mathbf {X} \sim {\mathcal {MN}}_{n\times p}(\mathbf {M} ,\mathbf {U} ,\mathbf {V} )}, then we have the following properties:[1][2]
Expected values
The mean, or expected value, is:
{\displaystyle E[\mathbf {X} ]=\mathbf {M} }
and we have the following second-order expectations:
{\displaystyle E[(\mathbf {X} -\mathbf {M} )(\mathbf {X} -\mathbf {M} )^{T}]=\mathbf {U} \operatorname {tr} (\mathbf {V} )}
{\displaystyle E[(\mathbf {X} -\mathbf {M} )^{T}(\mathbf {X} -\mathbf {M} )]=\mathbf {V} \operatorname {tr} (\mathbf {U} )}
where {\displaystyle \operatorname {tr} } denotes the trace.
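The two identities above can be spot-checked by Monte Carlo simulation. A minimal Python sketch, assuming NumPy, sampling X through the vec representation (all parameter values are arbitrary test choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, draws = 2, 3, 300_000

M = rng.standard_normal((n, p))
U = np.array([[2.0, 0.5],
              [0.5, 1.0]])                 # n x n row covariance
V = np.array([[1.0, 0.3, 0.0],
              [0.3, 2.0, 0.4],
              [0.0, 0.4, 1.5]])            # p x p column covariance

# Sample via vec(X) ~ N(vec(M), V ⊗ U); each length-np vector is then
# unstacked column-wise (Fortran order) back into an n x p matrix.
L = np.linalg.cholesky(np.kron(V, U))
vecs = M.ravel(order="F")[:, None] + L @ rng.standard_normal((n * p, draws))
Xs = vecs.T.reshape(draws, p, n).transpose(0, 2, 1)   # (draws, n, p)

E = Xs - M
row_moment = np.einsum("kij,klj->il", E, E) / draws   # E[(X-M)(X-M)^T]
col_moment = np.einsum("kij,kil->jl", E, E) / draws   # E[(X-M)^T(X-M)]

# Both should hold up to Monte Carlo error.
print(np.allclose(row_moment, U * np.trace(V), atol=0.15))
print(np.allclose(col_moment, V * np.trace(U), atol=0.15))
```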
More generally, for appropriately dimensioned matrices A, B, C:
{\displaystyle {\begin{aligned}E[\mathbf {X} \mathbf {A} \mathbf {X} ^{T}]&=\mathbf {U} \operatorname {tr} (\mathbf {A} ^{T}\mathbf {V} )+\mathbf {MAM} ^{T}\\E[\mathbf {X} ^{T}\mathbf {B} \mathbf {X} ]&=\mathbf {V} \operatorname {tr} (\mathbf {U} \mathbf {B} ^{T})+\mathbf {M} ^{T}\mathbf {BM} \\E[\mathbf {X} \mathbf {C} \mathbf {X} ]&=\mathbf {V} \mathbf {C} ^{T}\mathbf {U} +\mathbf {MCM} \end{aligned}}}
Transformation
Transpose transform:
{\displaystyle \mathbf {X} ^{T}\sim {\mathcal {MN}}_{p\times n}(\mathbf {M} ^{T},\mathbf {V} ,\mathbf {U} )}
Linear transform: let D (r × n) be of full rank r ≤ n and C (p × s) be of full rank s ≤ p; then:
{\displaystyle \mathbf {DXC} \sim {\mathcal {MN}}_{r\times s}(\mathbf {DMC} ,\mathbf {DUD} ^{T},\mathbf {C} ^{T}\mathbf {VC} )}
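The linear-transform rule can be motivated through the identity vec(DXC) = (C^T ⊗ D) vec(X): the induced covariance then factors by the Kronecker mixed-product property. A quick numerical check of that factorization (Python/NumPy sketch; the specific matrices are arbitrary test values):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, r, s = 4, 3, 2, 2

A = rng.standard_normal((n, n))
U = A @ A.T + np.eye(n)          # n x n row covariance
B = rng.standard_normal((p, p))
V = B @ B.T + np.eye(p)          # p x p column covariance
D = rng.standard_normal((r, n))  # full rank r <= n (a.s. for Gaussian draws)
C = rng.standard_normal((p, s))  # full rank s <= p

# vec(DXC) = (C^T ⊗ D) vec(X), so the covariance of vec(DXC) is
# (C^T ⊗ D)(V ⊗ U)(C ⊗ D^T) = (C^T V C) ⊗ (D U D^T)
# by the mixed-product property of the Kronecker product.
T = np.kron(C.T, D)
lhs = T @ np.kron(V, U) @ T.T
rhs = np.kron(C.T @ V @ C, D @ U @ D.T)
print(np.allclose(lhs, rhs))  # → True
```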
Example
Consider a sample of n independent p-dimensional random vectors identically distributed according to a multivariate normal distribution:
{\displaystyle \mathbf {Y} _{i}\sim {\mathcal {N}}_{p}({\boldsymbol {\mu }},{\boldsymbol {\Sigma }}){\text{ with }}i\in \{1,\ldots ,n\}}.
When defining the n × p matrix {\displaystyle \mathbf {X} } for which the i-th row is {\displaystyle \mathbf {Y} _{i}}, we obtain:
{\displaystyle \mathbf {X} \sim {\mathcal {MN}}_{n\times p}(\mathbf {M} ,\mathbf {U} ,\mathbf {V} )}
where each row of {\displaystyle \mathbf {M} } is equal to {\displaystyle {\boldsymbol {\mu }}}, that is {\displaystyle \mathbf {M} =\mathbf {1} _{n}\times {\boldsymbol {\mu }}^{T}}; {\displaystyle \mathbf {U} } is the n × n identity matrix, that is, the rows are independent; and {\displaystyle \mathbf {V} ={\boldsymbol {\Sigma }}}.
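This correspondence can be verified numerically: with U = I and V = Σ, the matrix normal density should factor into the product of the n independent row densities. A sketch in Python, assuming NumPy and SciPy (parameter values are arbitrary test choices):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)
n, p = 5, 3

mu = rng.standard_normal(p)
A = rng.standard_normal((p, p))
Sigma = A @ A.T + np.eye(p)      # positive-definite column covariance

# Stack n i.i.d. rows Y_i ~ N(mu, Sigma) into an n x p matrix X.
X = rng.multivariate_normal(mu, Sigma, size=n)

# Matrix normal density MN(M = 1 mu^T, U = I_n, V = Sigma) at X ...
M = np.ones((n, 1)) @ mu[None, :]
E = X - M
quad = -0.5 * np.trace(np.linalg.inv(Sigma) @ E.T @ E)   # U^{-1} = I
norm = (2 * np.pi) ** (n * p / 2) * np.linalg.det(Sigma) ** (n / 2)
pdf_matrix = np.exp(quad) / norm

# ... equals the product of the n independent row densities.
pdf_rows = np.prod([multivariate_normal(mu, Sigma).pdf(x) for x in X])
print(np.isclose(pdf_matrix, pdf_rows))  # → True
```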
Maximum likelihood parameter estimation
Given k matrices, each of size n × p, denoted {\displaystyle \mathbf {X} _{1},\mathbf {X} _{2},\ldots ,\mathbf {X} _{k}}, which we assume have been sampled i.i.d. from a matrix normal distribution, the maximum likelihood estimate of the parameters can be obtained by maximizing:
{\displaystyle \prod _{i=1}^{k}{\mathcal {MN}}_{n\times p}(\mathbf {X} _{i}\mid \mathbf {M} ,\mathbf {U} ,\mathbf {V} ).}
The solution for the mean has a closed form, namely
{\displaystyle \mathbf {M} ={\frac {1}{k}}\sum _{i=1}^{k}\mathbf {X} _{i}}
but the covariance parameters do not. However, these parameters can be iteratively maximized by zeroing their gradients at:
{\displaystyle \mathbf {U} ={\frac {1}{kp}}\sum _{i=1}^{k}(\mathbf {X} _{i}-\mathbf {M} )\mathbf {V} ^{-1}(\mathbf {X} _{i}-\mathbf {M} )^{T}}
and
{\displaystyle \mathbf {V} ={\frac {1}{kn}}\sum _{i=1}^{k}(\mathbf {X} _{i}-\mathbf {M} )^{T}\mathbf {U} ^{-1}(\mathbf {X} _{i}-\mathbf {M} ),}
See, for example, [3] and references therein. The covariance parameters are non-identifiable in the sense that for any scale factor s > 0, we have:
{\displaystyle {\mathcal {MN}}_{n\times p}(\mathbf {X} \mid \mathbf {M} ,\mathbf {U} ,\mathbf {V} )={\mathcal {MN}}_{n\times p}(\mathbf {X} \mid \mathbf {M} ,s\mathbf {U} ,{\tfrac {1}{s}}\mathbf {V} ).}
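The two fixed-point updates for U and V are commonly iterated in alternation (a "flip-flop" scheme). A sketch in Python, assuming NumPy; the function name and the tr(U) = n normalization used to resolve the scale ambiguity are illustrative choices, not from the source:

```python
import numpy as np

def matnorm_mle(Xs, iters=50):
    """Alternating (flip-flop) MLE for the matrix normal distribution.

    Xs: array of shape (k, n, p) of i.i.d. samples.  The scale ambiguity
    (U, V) -> (sU, V/s) is resolved by normalizing tr(U) = n; this is one
    conventional choice among several.
    """
    k, n, p = Xs.shape
    M = Xs.mean(axis=0)                      # closed-form mean estimate
    E = Xs - M
    U, V = np.eye(n), np.eye(p)
    for _ in range(iters):
        Vinv = np.linalg.inv(V)
        U = np.einsum("kij,jl,kml->im", E, Vinv, E) / (k * p)
        Uinv = np.linalg.inv(U)
        V = np.einsum("kij,il,klm->jm", E, Uinv, E) / (k * n)
        s = n / np.trace(U)                  # fix the scale: tr(U) = n
        U, V = s * U, V / s
    return M, U, V

# Demo on synthetic data drawn via vec(X) ~ N(0, V ⊗ U).
rng = np.random.default_rng(4)
n, p, k = 3, 2, 5000
U0 = np.array([[1.5, 0.4, 0.0],
               [0.4, 1.0, 0.2],
               [0.0, 0.2, 0.8]])
V0 = np.array([[1.0, 0.3],
               [0.3, 2.0]])
L = np.linalg.cholesky(np.kron(V0, U0))
vecs = L @ rng.standard_normal((n * p, k))
Xs = vecs.T.reshape(k, p, n).transpose(0, 2, 1)

M_hat, U_hat, V_hat = matnorm_mle(Xs)
# Compare against the truth under the same tr(U) = n normalization;
# both comparisons should hold up to sampling error.
s = n / np.trace(U0)
print(np.allclose(U_hat, s * U0, atol=0.15),
      np.allclose(V_hat, V0 / s, atol=0.15))
```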
Drawing values from the distribution
Sampling from the matrix normal distribution is a special case of the sampling procedure for the multivariate normal distribution. Let {\displaystyle \mathbf {X} } be an n × p matrix of np independent samples from the standard normal distribution, so that
{\displaystyle \mathbf {X} \sim {\mathcal {MN}}_{n\times p}(\mathbf {0} ,\mathbf {I} ,\mathbf {I} ).}
Then let
{\displaystyle \mathbf {Y} =\mathbf {M} +\mathbf {A} \mathbf {X} \mathbf {B} ,}
so that
{\displaystyle \mathbf {Y} \sim {\mathcal {MN}}_{n\times p}(\mathbf {M} ,\mathbf {AA} ^{T},\mathbf {B} ^{T}\mathbf {B} ),}
where A and B can be chosen by Cholesky decomposition or a similar matrix square root operation.
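The procedure above can be sketched as follows in Python (assuming NumPy), with a Monte Carlo check that vec(Y) has mean vec(M) and covariance V ⊗ U; all parameter values are arbitrary test choices:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, draws = 2, 3, 200_000

M = rng.standard_normal((n, p))
Ru = rng.standard_normal((n, n))
U = Ru @ Ru.T + np.eye(n)        # target n x n row covariance
Rv = rng.standard_normal((p, p))
V = Rv @ Rv.T + np.eye(p)        # target p x p column covariance

A = np.linalg.cholesky(U)        # A A^T = U
B = np.linalg.cholesky(V).T      # B^T B = V

# Y = M + A X B with X an n x p matrix of standard normals (batched).
Xs = rng.standard_normal((draws, n, p))
Ys = M + A @ Xs @ B

# Column-stacking vectorization of each draw.
vecs = Ys.transpose(0, 2, 1).reshape(draws, n * p)

# Both should hold up to Monte Carlo error.
print(np.allclose(vecs.mean(axis=0), M.ravel(order="F"), atol=0.05))
print(np.allclose(np.cov(vecs.T), np.kron(V, U), rtol=0.05, atol=0.1))
```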
Relation to other distributions
Dawid (1981) provides a discussion of the relation of the matrix-valued normal distribution to other distributions, including the Wishart distribution , inverse-Wishart distribution and matrix t-distribution , but uses different notation from that employed here.
References
^ Ding, Shanshan; R. Dennis Cook (2014). "Dimension Folding PCA and PFC for Matrix-Valued Predictors". Statistica Sinica. 24 (1): 463–492.
^ Glanz, Hunter; Carvalho, Luis (2013). "An Expectation-Maximization Algorithm for the Matrix Normal Distribution". arXiv:1309.6609 [stat.ME].
Retrieved from "https://en.wikipedia.org/w/index.php?title=Matrix_normal_distribution&oldid=1119144125"
This page was last edited on 30 October 2022, at 23:34 (UTC).