On the Bethe approximation
Adrian Weller
Department of Statistics at Oxford University
September 12, 2014
Joint work with Tony Jebara
1 / 46

On the Bethe approximation - Columbia University

Cycle polytope

A relaxation of the marginal polytope
Inherits all constraints of the local polytope, hence at least as tight
In addition, enforces consistency around any cycle
Cycle inequalities [B93]
$\forall$ cycles $C$ and every subset of edges $F \subseteq C$ with $|F|$ odd:

$$\sum_{(i,j) \in F} \left( \mu_{ij}(0,0) + \mu_{ij}(1,1) \right) + \sum_{(i,j) \in C \setminus F} \left( \mu_{ij}(1,0) + \mu_{ij}(0,1) \right) \ge 1.$$

Cycle polytope = marginal polytope for symmetric planar MRFs [B93]
Cycle polytope = TRI for binary pairwise [S10]

42 / 46
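
The cycle inequalities above can be checked mechanically for a given set of pairwise marginals. The following is a minimal sketch, not taken from the slides: the function name `cycle_inequality_violations`, the dictionary layout of `mu`, and the tolerance `tol` are all assumptions made for illustration. It enumerates every odd-sized edge subset F of one cycle C and reports the subsets whose inequality fails.

```python
from itertools import combinations

def cycle_inequality_violations(cycle_edges, mu, tol=1e-9):
    """Return the odd-sized edge subsets F of a cycle whose inequality fails.

    cycle_edges: list of (i, j) edges forming the cycle C (hypothetical layout).
    mu: dict mapping each edge to a 2x2 table, mu[(i, j)][a][b] = mu_ij(a, b).
    """
    violations = []
    for size in range(1, len(cycle_edges) + 1, 2):  # only |F| odd
        for F in combinations(cycle_edges, size):
            in_F = set(F)
            total = 0.0
            for e in cycle_edges:
                table = mu[e]
                agree = table[0][0] + table[1][1]     # mu_ij(0,0) + mu_ij(1,1)
                disagree = table[1][0] + table[0][1]  # mu_ij(1,0) + mu_ij(0,1)
                # Edges in F contribute "agree", edges in C \ F "disagree".
                total += agree if e in in_F else disagree
            if total < 1 - tol:
                violations.append(F)
    return violations

# A frustrated triangle: every edge perfectly anti-correlated. These
# pseudo-marginals are locally consistent but correspond to no joint distribution.
triangle = [(0, 1), (1, 2), (0, 2)]
anti = [[0.0, 0.5], [0.5, 0.0]]
print(cycle_inequality_violations(triangle, {e: anti for e in triangle}))
```

In this example the only failing subset is F = C, since an odd cycle cannot have every edge disagree; this is the kind of point that the local polytope admits and the cycle polytope excludes.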

Threshold for attractive models ξij(qi, qj, Wij)

[Figure: plots against q of the Bethe free energy E − S_B, the Bethe entropy S_B and the energy E for K5, with panels labelled W = 1, W = 1.38, W = 1.75 and W = 4.5.]

43 / 46

Experiments: Attractive models θi ∼ [−0.1, 0.1]

[Figure: three plots against maximum coupling strength y: log partition error; singleton marginals, average ℓ1 error; pairwise marginals, average ℓ1 error (small scale). Curves: Bethe+local, Bethe+cycle, Bethe+marg, TRW+local, TRW+cycle, TRW+marg.]

For this distribution of models, the polytope appears to make no difference
Though recall we showed theoretically it can

44 / 46

Clamping variables: Attractive binary pairwise models

$Z_B$ = optimal Bethe partition function for original model
Clamp variable $X_i$, form new approximation
$Z_B^{(i)} = Z_B|_{X_i=0} + Z_B|_{X_i=1}$.

Theorem (WJ14 NIPS)
For an attractive binary pairwise model and any variable $X_i$, $Z_B \le Z_B^{(i)}$.

Corollary
For an attractive binary pairwise model, $Z_B \le Z$.
⇒ clamping only improves the estimate of the partition function.

45 / 46
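
For the exact partition function, clamping is an identity rather than an inequality: Z = Z|Xi=0 + Z|Xi=1. The sketch below checks this by brute force on a tiny model, which may help in reading the theorem as a statement about how the Bethe approximation departs from that identity. It does not compute Z_B. The parameterisation p(x) ∝ exp(Σ θi xi + Σ Wij xi xj) with xi ∈ {0, 1}, the helper name `partition_function`, and the numeric values are assumptions for illustration only.

```python
from itertools import product
from math import exp

def partition_function(theta, W, clamp=None):
    """Brute-force Z for an assumed binary pairwise model
    p(x) ∝ exp(sum_i theta[i]*x_i + sum_(i,j) W[(i,j)]*x_i*x_j), x_i in {0, 1}.

    clamp: optional dict {variable index: fixed value}; other states are skipped.
    """
    clamp = clamp or {}
    n = len(theta)
    Z = 0.0
    for x in product((0, 1), repeat=n):
        if any(x[i] != v for i, v in clamp.items()):
            continue  # configuration disagrees with a clamped variable
        score = sum(theta[i] * x[i] for i in range(n))
        score += sum(w * x[i] * x[j] for (i, j), w in W.items())
        Z += exp(score)
    return Z

# Hypothetical attractive model on a triangle (all couplings non-negative).
theta = [0.1, -0.1, 0.05]
W = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}

Z = partition_function(theta, W)
Z_clamped = partition_function(theta, W, {0: 0}) + partition_function(theta, W, {0: 1})
print(Z, Z_clamped)  # the two values agree up to floating-point rounding
```

Enumeration costs 2^n evaluations, so this is only a sanity check for very small n.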

Clamping variables: stronger result

For any $i \in V$, $x \in [0,1]$, let $\log Z_{Bi}(x) = \max_{q \in [0,1]^n : q_i = x} -F(q)$
Observe $\log Z_{Bi}(0) = \log Z_B|_{X_i=0}$, $\log Z_{Bi}(1) = \log Z_B|_{X_i=1}$
and $\log Z_B = \max_{q_i \in [0,1]} \log Z_{Bi}(q_i)$
Recall $S_i(x) = -x \log x - (1 - x) \log(1 - x)$, singleton entropy
Lemma: To prove clamping result, sufficient if
$\log Z_{Bi}(q_i) \le q_i \log Z_{Bi}(1) + (1 - q_i) \log Z_{Bi}(0) + S_i(q_i)$

Theorem (WJ14 NIPS)
For an attractive binary pairwise model, $\log Z_{Bi}(q_i) - S_i(q_i)$ is convex.

Uses earlier results on Hessian

46 / 46
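
The slide states the lemma and the convexity theorem without the connecting steps. The following is a sketch of one way to fill them in, using only the definitions on the slide plus the standard identity $\max_{q \in [0,1]} [\, q a + (1-q) b + S(q) \,] = \log(e^a + e^b)$ for the binary entropy $S$. The shorthand $g$ is introduced here for illustration and does not appear in the source.

```latex
% Step 1: convexity implies the lemma.
% Let g(q_i) = \log Z_{Bi}(q_i) - S_i(q_i). Since g is convex and S_i(0) = S_i(1) = 0,
\begin{align*}
g(q_i) &\le q_i\, g(1) + (1 - q_i)\, g(0)
        = q_i \log Z_{Bi}(1) + (1 - q_i) \log Z_{Bi}(0), \\
\text{hence}\quad
\log Z_{Bi}(q_i) &\le q_i \log Z_{Bi}(1) + (1 - q_i) \log Z_{Bi}(0) + S_i(q_i).
\end{align*}

% Step 2: the lemma implies the clamping result.
\begin{align*}
\log Z_B &= \max_{q_i \in [0,1]} \log Z_{Bi}(q_i) \\
         &\le \max_{q_i \in [0,1]} \left[ q_i \log Z_{Bi}(1) + (1 - q_i) \log Z_{Bi}(0) + S_i(q_i) \right] \\
         &= \log\left( Z_{Bi}(1) + Z_{Bi}(0) \right) \\
         &= \log\left( Z_B|_{X_i=1} + Z_B|_{X_i=0} \right).
\end{align*}
```

The last line is the logarithm of the clamped approximation $Z_B|_{X_i=0} + Z_B|_{X_i=1}$, so the chain gives the clamping inequality for attractive binary pairwise models.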

