A Kernel Test for Three-Variable Interactions
Dino Sejdinovic (1), Arthur Gretton (1), Wicher Bergsma (2)
(1) Gatsby Unit, CSML, University College London
(2) Department of Statistics, London School of Economics
NIPS, 07 Dec 2013
D. Sejdinovic (CSML, UCL), Three-variable tests, NIPS, 07 Dec 2013
Detecting a higher order interaction
How to detect V-structures with pairwise weak (or nonexistent) dependence?
[Figure: V-structure graph over X, Y, Z, with Z depending on both X and Y]
$X \perp\!\!\!\perp Y$, $Y \perp\!\!\!\perp Z$, $X \perp\!\!\!\perp Z$
[Figure: scatter plots, panels "X vs Y", "Y vs Z", "X vs Z", "XY vs Z"]
$X, Y \overset{i.i.d.}{\sim} \mathcal{N}(0, 1)$,
$Z \mid X, Y \sim \mathrm{sign}(XY)\,\mathrm{Exp}(\tfrac{1}{\sqrt{2}})$
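
The sampling scheme on this slide can be simulated directly. The sketch below is illustrative only: the function name `sample_v_structure` is hypothetical, and it assumes that Exp(1/√2) denotes an exponential with scale 1/√2 (so that Z has unit variance), which the slide does not state. It shows that all three pairwise correlations are near zero while Z is clearly correlated with the product XY.

```python
import numpy as np

def sample_v_structure(n, scale=1.0 / np.sqrt(2.0), seed=0):
    """Draw X, Y ~ N(0, 1) i.i.d. and Z | X, Y ~ sign(XY) * Exp.

    Assumption: the exponential is parameterised by its scale (mean),
    not its rate; the slide leaves this open.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    z = np.sign(x * y) * rng.exponential(scale=scale, size=n)
    return x, y, z

x, y, z = sample_v_structure(20000)

# 3 x 3 correlation matrix of (X, Y, Z): off-diagonal entries should be close to 0.
pairwise = np.corrcoef(np.vstack([x, y, z]))

# Correlation between the product XY and Z: clearly non-zero.
joint = np.corrcoef(x * y, z)[0, 1]

print(np.round(pairwise, 3))
print(round(joint, 3))
```

Linear correlation is used here only as a quick diagnostic; the tests in the talk use kernel statistics instead.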
Detecting pairwise dependence
How to detect dependence in a non-Euclidean / structured domain?
X1: Honourable senators, I have a question for the Leader of the Government in the Senate with regard to the support funding to farmers that has been announced. Most farmers have not received any money yet.
X2: No doubt there is great pressure on provincial and municipal governments in relation to the issue of child care, but the reality is that there have been no cuts to child care funding from the federal government to the provinces. In fact, we have increased federal investments for early childhood development.
···
Y1: Honorables sénateurs, ma question s'adresse au leader du gouvernement au Sénat et concerne l'aide financière qu'on a annoncée pour les agriculteurs. La plupart des agriculteurs n'ont encore rien reçu de cet argent.
Y2: Il est évident que les ordres de gouvernements provinciaux et municipaux subissent de fortes pressions en ce qui concerne les services de garde, mais le gouvernement n'a pas réduit le financement qu'il verse aux provinces pour les services de garde. Au contraire, nous avons augmenté le financement fédéral pour le développement des jeunes enfants.
···
Are the French text extracts translations of the English ones?
Detecting pairwise dependence
[Figure: samples of X → kernel matrix K; samples of Y → kernel matrix L]
$\langle HKH, HLH \rangle = (HKH \circ HLH)_{++}$
$H = I - \frac{1}{n} \mathbf{1}\mathbf{1}^\top$ (centering matrix)
$A_{++} = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij}$
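
A minimal numerical sketch of the three definitions above. The Gaussian kernel on real-valued data and the helper names `gaussian_gram`, `center` and `plus_plus` are assumptions made for illustration; the slide itself uses kernels on text. It checks that the centred matrices have zero row sums and that the Frobenius inner product equals the sum of all entries of the Hadamard product.

```python
import numpy as np

def gaussian_gram(a, bandwidth=1.0):
    """Gaussian kernel matrix of a 1-D sample (kernel choice is an assumption)."""
    a = np.asarray(a, dtype=float)
    d = a[:, None] - a[None, :]
    return np.exp(-d**2 / (2.0 * bandwidth**2))

def center(K):
    """Return H K H with the centering matrix H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def plus_plus(A):
    """A_{++}: the sum of all entries of A."""
    return float(np.sum(A))

rng = np.random.default_rng(0)
x = rng.standard_normal(50)
y = x + 0.1 * rng.standard_normal(50)

Kc, Lc = center(gaussian_gram(x)), center(gaussian_gram(y))

# Frobenius inner product <HKH, HLH> computed two ways.
via_hadamard = plus_plus(Kc * Lc)      # (HKH o HLH)_{++}
via_trace = float(np.trace(Kc @ Lc))   # tr(HKH HLH), valid since both are symmetric

print(via_hadamard, via_trace)
```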
Kernel Embedding
feature map: $z \mapsto k(\cdot, z) \in \mathcal{H}_k$
instead of $z \mapsto (\varphi_1(z), \ldots, \varphi_s(z)) \in \mathbb{R}^s$
$\langle k(\cdot, z), k(\cdot, w) \rangle_{\mathcal{H}_k} = k(z, w)$
inner products easily computed
embedding: $P \mapsto \mu_k(P) = \mathbb{E}_{Z \sim P}\, k(\cdot, Z) \in \mathcal{H}_k$
instead of $P \mapsto (\mathbb{E}\varphi_1(Z), \ldots, \mathbb{E}\varphi_s(Z)) \in \mathbb{R}^s$
$\langle \mu_k(P), \mu_k(Q) \rangle_{\mathcal{H}_k} = \mathbb{E}_{Z \sim P, W \sim Q}\, k(Z, W)$
inner products easily estimated
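
To illustrate "inner products easily estimated": the inner product of two embeddings can be approximated by averaging the kernel over all sample pairs. The sketch assumes a Gaussian kernel with unit bandwidth and P = Q = N(0, 1), for which the expectation has the closed form 1/√3; the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Cross-kernel matrix with entries k(a_i, b_j) for 1-D samples a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2.0 * bandwidth**2))

def embedding_inner_product(z, w, bandwidth=1.0):
    """Estimate <mu_k(P), mu_k(Q)> = E k(Z, W) by the mean over all pairs."""
    return float(np.mean(gaussian_kernel(z, w, bandwidth)))

rng = np.random.default_rng(0)
z = rng.standard_normal(2000)   # sample from P = N(0, 1)
w = rng.standard_normal(2000)   # independent sample from Q = N(0, 1)

estimate = embedding_inner_product(z, w)

# For Z - W ~ N(0, 2) and unit bandwidth, E exp(-(Z - W)^2 / 2) = 1 / sqrt(3).
exact = 1.0 / np.sqrt(3.0)

print(estimate, exact)
```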
Independence test via embeddings
Maximum Mean Discrepancy (MMD) (Borgwardt et al, 2006; Gretton et al, 2007):
$\mathrm{MMD}_k(P, Q) = \| \mu_k(P) - \mu_k(Q) \|_{\mathcal{H}_k}$
ISPD (integrally strictly positive definite) kernels: $\mu_k$ injective on all signed measures and $\mathrm{MMD}_k$ a metric (Sriperumbudur, 2010)
Gaussian, Laplacian, inverse multiquadratics, Matérn etc.
Hilbert-Schmidt Independence Criterion (HSIC), Gretton et al (2005, 2008); Smola et al (2007):
$\| \mu_\kappa(P_{XY}) - \mu_\kappa(P_X P_Y) \|^2_{\mathcal{H}_\kappa}$
$\kappa((x, y), (x', y')) = k(x, x') \times l(y, y')$
Empirical HSIC $= \frac{1}{n^2} (HKH \circ HLH)_{++}$
Powerful independence tests that generalize dCov of Szekely et al (2007); DS et al (2013)
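
The empirical HSIC formula can be checked against its definition as a squared distance between embeddings. The sketch below, with hypothetical function names and an assumed Gaussian kernel, computes the centred form and compares it with the expanded squared norm between the empirical joint and the product of empirical marginals under the product kernel.

```python
import numpy as np

def gaussian_gram(a, bandwidth=1.0):
    """Gaussian kernel matrix of a 1-D sample (kernel choice is an assumption)."""
    a = np.asarray(a, dtype=float)
    d = a[:, None] - a[None, :]
    return np.exp(-d**2 / (2.0 * bandwidth**2))

def hsic_centered(K, L):
    """Empirical HSIC as (1/n^2) (HKH o HLH)_{++}."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return float(np.sum((H @ K @ H) * (H @ L @ H))) / n**2

def hsic_expanded(K, L):
    """Squared embedding distance between empirical joint and product of marginals."""
    n = K.shape[0]
    joint_joint = np.sum(K * L) / n**2                          # <joint, joint>
    joint_prod = np.sum(K.sum(axis=1) * L.sum(axis=1)) / n**3   # <joint, product>
    prod_prod = K.sum() * L.sum() / n**4                        # <product, product>
    return float(joint_joint - 2.0 * joint_prod + prod_prod)

rng = np.random.default_rng(0)
x = rng.standard_normal(100)
y_dep = x**2 + 0.1 * rng.standard_normal(100)   # nonlinear dependence on x
y_ind = rng.standard_normal(100)                # independent of x

K = gaussian_gram(x)
hsic_dep = hsic_centered(K, gaussian_gram(y_dep))
hsic_ind = hsic_centered(K, gaussian_gram(y_ind))

print(hsic_dep, hsic_ind)
```

A threshold for the statistic would still be needed to turn this into a test, for instance from permutations of one of the samples.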
V-structure Discovery
[Figure: V-structure graph over X, Y, Z]
Assume $X \perp\!\!\!\perp Y$ has been established. V-structure can then be detected by:
CI test: $H_0: X \perp\!\!\!\perp Y \mid Z$ (Zhang et al 2011) or
Factorisation test: $H_0: (X, Y) \perp\!\!\!\perp Z \,\vee\, (X, Z) \perp\!\!\!\perp Y \,\vee\, (Y, Z) \perp\!\!\!\perp X$ (multiple standard two-variable tests)
compute p-values for each of the marginal tests for $(Y, Z) \perp\!\!\!\perp X$, $(X, Z) \perp\!\!\!\perp Y$, or $(X, Y) \perp\!\!\!\perp Z$
apply Holm-Bonferroni (HB) sequentially rejective correction (Holm 1979)
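
The Holm-Bonferroni step can be sketched as follows. The function names are hypothetical. The combination rule in `reject_factorisation_null` is an assumption that follows from the form of the null: since H0 is a disjunction of three hypotheses, it is rejected here only when all three components are rejected.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm's sequentially rejective procedure.

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # stop at the first non-rejection
    return rejected

def reject_factorisation_null(p_values, alpha=0.05):
    """Assumed rule: reject the composite null only if every component is rejected."""
    return all(holm_bonferroni(p_values, alpha))

# p-values for (Y,Z) vs X, (X,Z) vs Y and (X,Y) vs Z (made-up numbers for illustration)
example_a = holm_bonferroni([0.01, 0.04, 0.03])
example_b = reject_factorisation_null([0.01, 0.02, 0.04])
print(example_a, example_b)
```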
V-structure Discovery (2)
How to detect V-structures with pairwise weak (or nonexistent) dependence?
$X \perp\!\!\!\perp Y$, $Y \perp\!\!\!\perp Z$, $X \perp\!\!\!\perp Z$
[Figure: scatter plots, panels "X1 vs Y1", "Y1 vs Z1", "X1 vs Z1", "X1*Y1 vs Z1"]
$X_1, Y_1 \overset{i.i.d.}{\sim} \mathcal{N}(0, 1)$,
$Z_1 \mid X_1, Y_1 \sim \mathrm{sign}(X_1 Y_1)\,\mathrm{Exp}(\tfrac{1}{\sqrt{2}})$
$X_{2:p}, Y_{2:p}, Z_{2:p} \overset{i.i.d.}{\sim} \mathcal{N}(0, I_{p-1})$
V-structure Discovery (3)
[Figure: "V-structure discovery: Dataset A". Null acceptance rate (Type II error) against dimension (1 to 19); curves: "2var: Factor" and "CI: $X \perp\!\!\!\perp Y \mid Z$"]
Figure: CI test for $X \perp\!\!\!\perp Y \mid Z$ from Zhang et al (2011), and a factorisation test with a HB correction, n = 500
Lancaster Interaction Measure
Definition (Bahadur (1961); Lancaster (1969))
Interaction measure of $(X_1, \ldots, X_D) \sim P$ is a signed measure $\Delta P$ that vanishes whenever $P$ can be factorised in a non-trivial way as a product of its (possibly multivariate) marginal distributions.
D = 2: $\Delta_L P = P_{XY} - P_X P_Y$
D = 3: $\Delta_L P = P_{XYZ} - P_X P_{YZ} - P_Y P_{XZ} - P_Z P_{XY} + 2 P_X P_Y P_Z$
[Figure: the five terms of $\Delta_L P$ for D = 3, each drawn as a graph over X, Y, Z; the final build shows a case with $\Delta_L P = 0$]
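
The D = 3 formula can be checked on small discrete distributions. The sketch below uses a hypothetical helper `lancaster_interaction` acting on a probability table `p[x, y, z]`. It confirms that the measure vanishes when P = P_X P_YZ, and that it is non-zero for Z = X XOR Y with fair coins X and Y, a case where all pairs are independent.

```python
import numpy as np

def lancaster_interaction(p):
    """Lancaster interaction of a three-way probability table p[x, y, z]."""
    px, py, pz = p.sum(axis=(1, 2)), p.sum(axis=(0, 2)), p.sum(axis=(0, 1))
    pxy, pxz, pyz = p.sum(axis=2), p.sum(axis=1), p.sum(axis=0)
    return (p
            - np.einsum("i,jk->ijk", px, pyz)       # P_X P_YZ
            - np.einsum("j,ik->ijk", py, pxz)       # P_Y P_XZ
            - np.einsum("k,ij->ijk", pz, pxy)       # P_Z P_XY
            + 2.0 * np.einsum("i,j,k->ijk", px, py, pz))

rng = np.random.default_rng(0)

# Case 1: X independent of (Y, Z), with an arbitrary joint for (Y, Z).
px = rng.dirichlet(np.ones(3))
pyz = rng.dirichlet(np.ones(6)).reshape(2, 3)
p_factorised = np.einsum("i,jk->ijk", px, pyz)
delta_factorised = lancaster_interaction(p_factorised)

# Case 2: X, Y fair coins, Z = X XOR Y (pairwise independent, jointly dependent).
p_xor = np.zeros((2, 2, 2))
for a in range(2):
    for b in range(2):
        p_xor[a, b, a ^ b] = 0.25
delta_xor = lancaster_interaction(p_xor)

print(np.abs(delta_factorised).max(), np.abs(delta_xor).max())
```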
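A Test using Lancaster Measure
Construct a test by estimating $\| \mu_\kappa(\Delta_L P) \|^2_{\mathcal{H}_\kappa}$, where $\kappa = k \otimes l \otimes m$:
$\| \mu_\kappa(P_{XYZ} - P_{XY} P_Z - \cdots) \|^2_{\mathcal{H}_\kappa} = \langle \mu_\kappa P_{XYZ}, \mu_\kappa P_{XYZ} \rangle_{\mathcal{H}_\kappa} - 2 \langle \mu_\kappa P_{XYZ}, \mu_\kappa P_{XY} P_Z \rangle_{\mathcal{H}_\kappa} \cdots$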
Inner Product Estimators
$\nu \backslash \nu'$ | $P_{XYZ}$ | $P_{XY} P_Z$ | $P_{XZ} P_Y$ | $P_{YZ} P_X$ | $P_X P_Y P_Z$
$P_{XYZ}$ | $(K \circ L \circ M)_{++}$ | $((K \circ L) M)_{++}$ | $((K \circ M) L)_{++}$ | $((M \circ L) K)_{++}$ | $\mathrm{tr}(K_+ \circ L_+ \circ M_+)$
$P_{XY} P_Z$ | | $(K \circ L)_{++} M_{++}$ | $(MKL)_{++}$ | $(KLM)_{++}$ | $(KL)_{++} M_{++}$
$P_{XZ} P_Y$ | | | $(K \circ M)_{++} L_{++}$ | $(KML)_{++}$ | $(KM)_{++} L_{++}$
$P_{YZ} P_X$ | | | | $(L \circ M)_{++} K_{++}$ | $(LM)_{++} K_{++}$
$P_X P_Y P_Z$ | | | | | $K_{++} L_{++} M_{++}$
Table: V-statistic estimators of $\langle \mu_\kappa \nu, \mu_\kappa \nu' \rangle_{\mathcal{H}_\kappa}$ (the table is symmetric; only the upper triangle is shown)
Proposition (Lancaster interaction statistic)
$\| \mu_\kappa(\hat{\Delta}_L P) \|^2_{\mathcal{H}_\kappa} = \frac{1}{n^2} (HKH \circ HLH \circ HMH)_{++}$.
Empirical joint central moment in the feature space
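
A minimal sketch of the statistic in the proposition, together with one possible permutation test. Several choices here are assumptions and not taken from the slides: the Gaussian kernels, the function names, and the permutation scheme, which permutes the Z sample to mimic the component hypothesis that (X, Y) is independent of Z. The data follow the sign(XY) example, again reading Exp(1/√2) as a scale parameter.

```python
import numpy as np

def gaussian_gram(a, bandwidth=1.0):
    """Gaussian kernel matrix of a 1-D sample (kernel choice is an assumption)."""
    a = np.asarray(a, dtype=float)
    d = a[:, None] - a[None, :]
    return np.exp(-d**2 / (2.0 * bandwidth**2))

def centered(K):
    """Return H K H with H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def lancaster_statistic(K, L, M):
    """(1/n^2) (HKH o HLH o HMH)_{++}."""
    n = K.shape[0]
    return float(np.sum(centered(K) * centered(L) * centered(M))) / n**2

def permutation_p_value(K, L, M, n_perm=200, seed=0):
    """p-value from permuting the Z sample (an assumed scheme, see lead-in)."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    kl = centered(K) * centered(L)
    mc = centered(M)
    observed = np.sum(kl * mc) / n**2
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Reordering rows and columns of the centred M equals centring the permuted M.
        permuted = np.sum(kl * mc[np.ix_(perm, perm)]) / n**2
        exceed += int(permuted >= observed)
    return (exceed + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
x = rng.standard_normal(150)
y = rng.standard_normal(150)
z = np.sign(x * y) * rng.exponential(scale=1.0 / np.sqrt(2.0), size=150)

K, L, M = gaussian_gram(x), gaussian_gram(y), gaussian_gram(z)
stat = lancaster_statistic(K, L, M)
p_value = permutation_p_value(K, L, M)
print(stat, p_value)
```

For the full factorisation null, the same statistic would be paired with permutations of X and of Y as well, and the three p-values combined with the HB correction.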
Example A: factorisation tests
[Figure: "V-structure discovery: Dataset A". Null acceptance rate (Type II error) against dimension (1 to 19); curves: "2var: Factor", "$\Delta_L$: Factor" and "CI: $X \perp\!\!\!\perp Y \mid Z$"]
Figure: Factorisation hypothesis: Lancaster statistic vs. a two-variable based test (both with HB correction); Test for $X \perp\!\!\!\perp Y \mid Z$ from Zhang et al (2011), n = 500
Example B: Joint dependence can be easier to detect
$X_1, Y_1 \overset{i.i.d.}{\sim} \mathcal{N}(0, 1)$
$Z_1 = \begin{cases} X_1^2 + \epsilon, & \text{w.p. } 1/3, \\ Y_1^2 + \epsilon, & \text{w.p. } 1/3, \\ X_1 Y_1 + \epsilon, & \text{w.p. } 1/3, \end{cases}$ where $\epsilon \sim \mathcal{N}(0, 0.1^2)$.
$X_{2:p}, Y_{2:p}, Z_{2:p} \overset{i.i.d.}{\sim} \mathcal{N}(0, I_{p-1})$
dependence of Z on pair (X, Y) is stronger than on X and Y individually
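
The generating process of Example B can be written as a short sampler. This is a sketch with a hypothetical function name `sample_dataset_b`; it follows the mixture above, with the first coordinate of Z carrying the signal and the remaining p − 1 coordinates of X, Y and Z being independent standard normal noise.

```python
import numpy as np

def sample_dataset_b(n, p=1, noise_sd=0.1, seed=0):
    """Draw n samples of (X, Y, Z), each of dimension p, following Example B."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, p))
    Z = rng.standard_normal((n, p))   # coordinates 2..p stay pure noise
    x1, y1 = X[:, 0], Y[:, 0]
    # Pick one of the three mechanisms with probability 1/3 each.
    branch = rng.integers(0, 3, size=n)
    signal = np.choose(branch, [x1**2, y1**2, x1 * y1])
    Z[:, 0] = signal + noise_sd * rng.standard_normal(n)
    return X, Y, Z

X, Y, Z = sample_dataset_b(n=1000, p=5)
print(X.shape, Y.shape, Z.shape)
```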
Example B: factorisation tests
[Figure: "V-structure discovery: Dataset B". Null acceptance rate (Type II error) against dimension (1 to 19); curves: "2var: Factor", "$\Delta_L$: Factor" and "CI: $X \perp\!\!\!\perp Y \mid Z$"]
Figure: Factorisation hypothesis: Lancaster statistic vs. a two-variable based test (both with HB correction); Test for $X \perp\!\!\!\perp Y \mid Z$ from Zhang et al (2011), n = 500
Interaction for D ≥ 4
Interaction measure valid for all D (Streitberg, 1990):
$\Delta_S P = \sum_{\pi} (-1)^{|\pi| - 1} (|\pi| - 1)! \, J_\pi P$
For a partition $\pi$, $J_\pi$ associates to the joint the corresponding factorisation, e.g., $J_{13|2|4} P = P_{X_1 X_3} P_{X_2} P_{X_4}$.
[Figure: "Bell numbers growth". Number of partitions of {1,...,D} (log scale, 1e+04 to 1e+19) against D (1 to 25)]
joint central moments (Lancaster interaction) vs. joint cumulants (Streitberg interaction)
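
The sum in the Streitberg measure runs over all partitions of {1,...,D}, so it has as many terms as the D-th Bell number. The sketch below (function name hypothetical) computes Bell numbers with the Bell triangle to show how quickly that count grows. For D = 3 there are 5 partitions with coefficients +1, −1, −1, −1, +2, which matches the Lancaster formula for D = 3.

```python
def bell_numbers(max_d):
    """Return [B_0, B_1, ..., B_max_d] computed with the Bell triangle."""
    row = [1]
    bells = [1]  # B_0
    for _ in range(max_d):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

bells = bell_numbers(25)
for d in (3, 4, 10, 25):
    print(d, bells[d])
```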
Summary
A nonparametric test for three-variable interaction and for total independence, using embeddings of signed measures into RKHSs
Test statistics are simple and easy to compute; corresponding permutation tests significantly outperform standard two-variable-based tests on V-structures with weak pairwise interactions
All forms of Lancaster three-variable interaction can be detected for a large family of reproducing kernels (ISPD)
Thank You!
Poster S6
References
B. Streitberg. Lancaster interactions revisited. Annals of Statistics 18(4): 1878-1885, 1990.
A. Kankainen. Consistent Testing of Total Independence Based on the Empirical Characteristic Function. PhD thesis, University of Jyväskylä, 1995.
A. Gretton, K. Fukumizu, C.-H. Teo, L. Song, B. Schölkopf and A. Smola. A kernel statistical test of independence. In Advances in Neural Information Processing Systems 20: 585-592, MIT Press, 2008.
B. Sriperumbudur, A. Gretton, K. Fukumizu, G. Lanckriet and B. Schölkopf. Hilbert space embeddings and metrics on probability measures. J. Mach. Learn. Res. 11: 1517-1561, 2010.
D. Sejdinovic, B. Sriperumbudur, A. Gretton and K. Fukumizu. Equivalence of distance-based and RKHS-based statistics in hypothesis testing. Annals of Statistics 41(5): 2263-2291, 2013.