Published by , 2017-01-23 21:40:03

Foundations of Statistical Natural Language Processing



劉成韋

OUTLINE

• Introduction
• Methodological Preliminaries
• Supervised Disambiguation
• Dictionary-based Disambiguation
• Unsupervised Disambiguation
• What is a Word Sense?
• Further Reading

Introduction (1/2)

• Problem
  • A word is assumed to have a finite number of discrete senses.
• Task
  • To make a forced choice between these senses for the meaning of each usage of an ambiguous word.
  • Based on the context of use.
• In fact
  • A word has various somewhat related senses, but it is unclear whether and where to draw lines between them.

Introduction (2/2)

• However, the senses are not always so well-defined.
• For example: bank
  • The rising ground bordering a lake, river, or sea... (slope, embankment)
  • An establishment for the custody (safekeeping), loan, exchange, or issue of money, for the extension of credit, and for facilitating the transmission of funds. (financial institution)

Methodological Preliminaries (1/3)

• Supervised learning
  • We know the actual status (here, the sense label) for each piece of data on which one trains.
  • Can usually be seen as a classification task.
• Unsupervised learning
  • We don't know the classification of the data in the training examples.
  • Can thus often be viewed as a clustering task.

Methodological Preliminaries (2/3)

Pseudo-words

• For testing the performance of these algorithms
  • A large number of occurrences has to be disambiguated by hand
    • Time intensive
    • Laborious task
• Pseudo-words
  • Conflating two or more words
  • For example, replacing all occurrences of banana and door in a corpus by banana-door
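
The pseudo-word construction can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the corpus is already tokenized, and the function name `make_pseudoword_corpus` is hypothetical. The original word at each replaced position is kept as a gold label, which is what removes the need for hand annotation.

```python
def make_pseudoword_corpus(tokens, words):
    """Replace every occurrence of any word in `words` by one conflated pseudo-word.

    Returns the rewritten token list and the original words at the replaced
    positions, which can serve as gold "sense" labels for evaluation.
    """
    pseudo = "-".join(words)          # e.g. "banana-door"
    targets = set(words)
    new_tokens, gold = [], []
    for tok in tokens:
        if tok in targets:
            new_tokens.append(pseudo)
            gold.append(tok)          # remember which word was really there
        else:
            new_tokens.append(tok)
    return new_tokens, gold


tokens = "she ate a banana and then closed the door".split()
new_tokens, gold = make_pseudoword_corpus(tokens, ["banana", "door"])
print(" ".join(new_tokens))
print(gold)
```

A disambiguation algorithm is then asked to recover `gold` from the contexts of `banana-door` alone.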

Methodological Preliminaries (3/3)

Upper and lower bounds on performance

• The estimation of upper and lower bounds
  • A way to make sense of performance figures
  • A good idea for tasks which have no standardized evaluation sets for comparing systems.
• The upper bound used is usually human performance
  • We can't expect an automatic procedure to do better
• The lower bound is the performance of the simplest possible algorithm
  • Assign all contexts to the most frequent sense
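
As a rough illustration of the lower bound, the most-frequent-sense baseline can be computed as below. The sense labels and the function name `most_frequent_sense_baseline` are made up for this sketch; they do not come from the slides.

```python
from collections import Counter


def most_frequent_sense_baseline(train_labels, test_labels):
    """Label every test context with the most frequent training sense.

    Returns that sense and the resulting accuracy on the test labels.
    """
    most_common_sense, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == most_common_sense)
    return most_common_sense, correct / len(test_labels)


# Toy sense labels for occurrences of "bank" (invented for illustration).
train = ["money", "money", "river", "money"]
test = ["money", "river", "money", "money", "river"]
sense, accuracy = most_frequent_sense_baseline(train, test)
print(sense, accuracy)
```

Any disambiguation system should at least beat this figure on the same test set.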

Supervised Disambiguation

• A disambiguated corpus is available for training
  • There is a training set where each occurrence of the ambiguous word is annotated with a semantic label
• Bayesian classification
  • <Gale et al. 1992>
  • Treats the context of occurrence as a bag of words without structure
• An information-theoretic approach
  • <Brown et al. 1991>
  • Looks at only one informative feature in the context, which may be sensitive to text structure

Supervised Disambiguation

Bayesian Classification (1/4)

• Each context word
  • Contributes potentially useful information about which sense of the ambiguous word is likely to be used with it
• A Bayes classifier applies the Bayes decision rule when choosing a class
  • For each case, choose the class with the highest probability.
  • The rule minimizes the probability of error
• Bayes decision rule
  • Decide s' if

\[ P(s' \mid c) > P(s_k \mid c) \quad \text{for } s_k \neq s' \]

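Supervised Disambiguation

Bayesian Classification (2/4)

• We want to assign the ambiguous word w to the sense s', given the context c

\[
\begin{aligned}
s' &= \arg\max_{s_k} P(s_k \mid c) \\
   &= \arg\max_{s_k} \frac{P(c \mid s_k)\, P(s_k)}{P(c)} && \text{(Bayes' rule)} \\
   &= \arg\max_{s_k} P(c \mid s_k)\, P(s_k) && \text{($P(c)$ is the same for every sense)} \\
   &= \arg\max_{s_k} \left[ \log P(c \mid s_k) + \log P(s_k) \right] && \text{(log)}
\end{aligned}
\]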

Supervised Disambiguation

Bayesian Classification (3/4)

• Gale et al.'s classifier, the Naïve Bayes classifier
  • An instance of a particular kind of Bayes classifier
• Naïve Bayes assumption

\[ P(c \mid s_k) = P(\{ v_j \mid v_j \text{ in } c \} \mid s_k) = \prod_{v_j \text{ in } c} P(v_j \mid s_k) \]

  • All the structure and linear ordering of words within the context is ignored
  • Each word is independent of another
    • Actually this is not true: a word such as "president" is not independent of the words around it
  • The simplifying assumption nevertheless makes the classifier effective in practice

Supervised Disambiguation

Bayesian Classification (4/4)

• With the Naïve Bayes assumption:
  • Decision rule for Naïve Bayes
  • Decide s' if

\[ s' = \arg\max_{s_k} \Big[ \log P(s_k) + \sum_{v_j \text{ in } c} \log P(v_j \mid s_k) \Big] \]

\[ P(v_j \mid s_k) = \frac{C(v_j, s_k)}{C(s_k)}, \qquad P(s_k) = \frac{C(s_k)}{C(w)} \]

  • Choose \( s' = \arg\max_{s_k} \mathrm{score}(s_k) \)
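
The decision rule above can be illustrated with a small Python sketch. It departs from the slide in one labeled way: instead of the plain relative frequency C(v_j, s_k)/C(s_k), it uses add-one smoothing over the context-word tokens seen with each sense, so that an unseen word does not produce log 0. The training data, sense names, and function names are all invented for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_contexts):
    """labeled_contexts: list of (sense, list_of_context_words) pairs."""
    sense_counts = Counter()             # C(s_k): training contexts per sense
    word_counts = defaultdict(Counter)   # C(v_j, s_k): word counts per sense
    vocab = set()
    for sense, words in labeled_contexts:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def disambiguate(context, sense_counts, word_counts, vocab):
    """Return argmax_sk [log P(sk) + sum_j log P(vj | sk)]."""
    total = sum(sense_counts.values())   # C(w): all occurrences of the word
    best_sense, best_score = None, -math.inf
    for sense in sense_counts:
        score = math.log(sense_counts[sense] / total)
        n_tokens = sum(word_counts[sense].values())
        for v in context:
            if v not in vocab:
                continue                 # assumption: ignore words never seen in training
            # Assumption: add-one smoothing instead of the raw ratio on the slide.
            score += math.log((word_counts[sense][v] + 1) / (n_tokens + len(vocab)))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


training = [
    ("finance", ["deposit", "money", "loan"]),
    ("finance", ["money", "account"]),
    ("river", ["water", "shore", "fishing"]),
]
model = train_naive_bayes(training)
print(disambiguate(["money", "loan"], *model))
print(disambiguate(["water"], *model))
```

Working in log space, as on the slide, avoids numerical underflow when contexts contain many words.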

Supervised Disambiguation

An information-theoretic approach (1/5)

• It tries to find a single contextual feature that reliably indicates which sense of the ambiguous word is being used.
• This contrasts with using information from all words in the context, as the Bayes classifier does

Supervised Disambiguation

An information-theoretic approach (2/5)

• Two senses of the word prendre
  • Prendre une mesure → take a measure
  • Prendre une décision → make a decision
• Flip-Flop algorithm <Brown et al.>
  • Let {t1,…,tm} be the translations of the ambiguous word
  • Let {x1,…,xn} be the possible values of the indicator
  • For prendre, {t1,…,tm} → {take, make, rise, speak}
  • For prendre, {x1,…,xn} → {mesure, note, exemple, décision, parole}

Supervised Disambiguation

An information-theoretic approach (3/5)

• Flip-Flop Algorithm:
    find a random partition P = {P1, P2} of {t1,…,tm}
    while (improving) do
        find partition Q = {Q1, Q2} of {x1,…,xn} that maximizes I(P;Q)
        find partition P = {P1, P2} of {t1,…,tm} that maximizes I(P;Q)
    end
• Each iteration of the algorithm increases the mutual information I(P;Q) monotonically.

\[ I(P;Q) = \sum_{t \in P} \sum_{x \in Q} p(t, x) \log \frac{p(t, x)}{p(t)\, p(x)} \]
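
A brute-force sketch of the Flip-Flop loop is given below, assuming the translation and indicator sets are small enough to enumerate every two-way partition (the original work uses a more efficient search). The co-occurrence counts are invented for illustration, the mutual information is computed between the cells of the two partitions, and the starting partition is passed in rather than drawn at random so that the run is reproducible.

```python
import math
from itertools import combinations


def two_way_partitions(items):
    """Yield every split of `items` into two non-empty sets (each split once)."""
    items = sorted(items)
    first, rest = items[0], items[1:]
    for r in range(len(rest)):            # r < len(rest) keeps the second set non-empty
        for combo in combinations(rest, r):
            left = {first, *combo}
            yield left, set(items) - left


def mutual_information(counts, P, Q):
    """I(P;Q) in bits between partition cells, from joint counts {(t, x): n}."""
    total = sum(counts.values())
    info = 0.0
    for p_cell in P:
        for q_cell in Q:
            joint = sum(n for (t, x), n in counts.items()
                        if t in p_cell and x in q_cell) / total
            if joint == 0:
                continue
            p_marg = sum(n for (t, _), n in counts.items() if t in p_cell) / total
            q_marg = sum(n for (_, x), n in counts.items() if x in q_cell) / total
            info += joint * math.log2(joint / (p_marg * q_marg))
    return info


def flip_flop(counts, initial_P):
    """Alternate between re-partitioning Q and P until I(P;Q) stops improving."""
    translations = {t for t, _ in counts}
    values = {x for _, x in counts}
    P, best = initial_P, -1.0
    while True:
        Q = max(two_way_partitions(values),
                key=lambda q: mutual_information(counts, P, q))
        P = max(two_way_partitions(translations),
                key=lambda p: mutual_information(counts, p, Q))
        score = mutual_information(counts, P, Q)
        if score <= best + 1e-12:
            return P, Q, score
        best = score


# Hypothetical counts of (English translation, French object of "prendre").
counts = {
    ("take", "mesure"): 5, ("take", "note"): 3, ("take", "exemple"): 2,
    ("make", "décision"): 4, ("speak", "parole"): 3, ("rise", "décision"): 1,
}
P, Q, score = flip_flop(counts, ({"take", "rise"}, {"make", "speak"}))
print(P, Q, round(score, 3))
```

With these toy counts the loop separates take from the other translations, and {mesure, note, exemple} from {décision, parole}.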

Supervised Disambiguation

An information-theoretic approach (4/5)

• The initial partition P
  • P1 = {take, rise}, P2 = {make, speak}
• Let's assume prendre is translated by take when its object is mesure, note, or exemple, so
  • Q1 = {mesure, note, exemple}
  • Q2 = {décision, parole}
  • Since this partition will maximize I(P;Q)
• The 2nd partition P
  • P1 = {take}, P2 = {make, speak, rise}

Supervised Disambiguation

An information-theoretic approach (5/5)

• Disambiguation
  • For each occurrence of the ambiguous word, determine the value of the indicator
  • If the value is in Q1, assign the occurrence to sense 1; if the value is in Q2, assign the occurrence to sense 2

Dictionary-Based Disambiguation

• If we have no information about the sense categorization of a word
  • Relying on the senses in dictionaries and thesauri.
• Disambiguation based on sense definitions
• Thesaurus-based disambiguation
• Disambiguation based on translations in a second-language corpus
• One sense per discourse, one sense per collocation

Dictionary-Based Disambiguation

Disambiguation based on sense definitions (1/3)

• D_1,...,D_K are the dictionary definitions of the senses s_1,...,s_K of the ambiguous word w, each represented as the bag of words occurring in the definition.
• v_j is a word occurring in the context c of w
• E_{v_j} is the dictionary definition of v_j (the union of all sense definitions of v_j)

Dictionary-Based Disambiguation

Disambiguation based on sense definitions (2/3)

• The algorithm:
    comment: given a context c for a word w
    for all senses s_1,…,s_K of w do
        score(s_k) = overlap(D_k, ∪_{v_j in c} E_{v_j})
    end
    choose the sense with the highest score.
• That is, overlap(word set of the dictionary definition of sense s_k, word set of the dictionary definitions of the words v_j in context c)

Dictionary-Based Disambiguation

Disambiguation based on sense definitions (3/3)

• Example (two senses of ash):

  Senses
    Sense               Definition
    s1  tree            a tree of the olive family
    s2  burned stuff    the solid residue left when combustible material is burned

  Scores
    s1   s2   Context
    0    1    This cigar burns slowly and creates a stiff ash
    1    0    The ash is one of the last trees to come into leaf.
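
The overlap scoring can be sketched as follows. The two sense definitions of ash are taken from the example; the definitions of the context words (`cigar`, `trees`, `leaf`) and the small stop-word list are invented stand-ins for a real dictionary, and the function name `lesk` is just a label for this sketch.

```python
# Assumption: a tiny stop-word list so that words like "the" do not count as overlap.
STOP_WORDS = {"a", "an", "the", "of", "is", "and", "that", "to", "in"}


def bag(text):
    """Lower-cased set of content words in a definition."""
    return {w for w in text.lower().split() if w not in STOP_WORDS}


def lesk(context_words, sense_definitions, dictionary):
    """Score each sense by the overlap between its definition D_k and the
    union of the definitions E_vj of the context words."""
    context_bag = set()
    for v in context_words:
        context_bag |= bag(dictionary.get(v, ""))   # words without an entry add nothing
    scores = {sense: len(bag(definition) & context_bag)
              for sense, definition in sense_definitions.items()}
    return max(scores, key=scores.get), scores


ash_senses = {
    "tree": "a tree of the olive family",
    "burned stuff": "the solid residue left when combustible material is burned",
}
# Hypothetical dictionary entries for the context words.
dictionary = {
    "cigar": "a roll of tobacco leaves that is burned and smoked",
    "trees": "a tree is a tall woody plant",
    "leaf": "a flat green part of a plant",
}
print(lesk(["cigar", "burns", "slowly", "creates", "stiff"], ash_senses, dictionary))
print(lesk(["one", "last", "trees", "come", "leaf"], ash_senses, dictionary))
```

With these toy entries the scores reproduce the 0/1 and 1/0 pattern of the example table; a real dictionary would need stemming so that, say, "burns" and "burned" can match.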

Dictionary-Based Disambiguation

Thesaurus-based disambiguation (1/2)

• This exploits the semantic categorization provided by a thesaurus like Roget's.
• Semantic categories of the words in a context
  → decide the semantic category of the context
  → then decide which word senses are used
• (Walker, 1987): Each word is assigned one or more subject codes which correspond to its different meanings.
• For each subject code, we count the number of words (from the context) having the same subject code.
• We select the subject code corresponding to the highest count.

Dictionary-Based Disambiguation

Thesaurus-based disambiguation (2/2)

• Walker's Algorithm
    comment: given context c
    for all senses s_k of w do
        score(s_k) = Σ_{v_j in c} δ(t(s_k), v_j)
    end
    choose s' s.t. s' = arg max_{s_k} score(s_k)
• Each term δ(t(s_k), v_j) is either 1 or 0: it is 1 if t(s_k), the subject code of sense s_k, is one of the subject codes of v_j, and 0 otherwise
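
Walker's counting scheme is short enough to sketch directly. The subject codes and the word-to-code table below are invented for illustration; a real system would read them from a thesaurus or a dictionary with subject codes.

```python
def walker(context_words, sense_codes, subject_codes):
    """Score each sense by how many context words carry its subject code.

    sense_codes:   {sense: subject code t(s_k)} for the ambiguous word
    subject_codes: {word: set of subject codes} for context words
    """
    scores = {}
    for sense, code in sense_codes.items():
        # delta(t(s_k), v_j) is 1 when the code is among v_j's codes, else 0
        scores[sense] = sum(1 for v in context_words
                            if code in subject_codes.get(v, set()))
    return max(scores, key=scores.get), scores


# Hypothetical subject codes for two senses of "bank" and some context words.
sense_codes = {"finance": "ECONOMICS", "river": "GEOGRAPHY"}
subject_codes = {
    "loan": {"ECONOMICS"},
    "interest": {"ECONOMICS", "PSYCHOLOGY"},
    "water": {"GEOGRAPHY"},
}
print(walker(["loan", "interest", "water"], sense_codes, subject_codes))
```

Here two context words carry the ECONOMICS code and one carries GEOGRAPHY, so the finance sense is chosen.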

Dictionary-Based Disambiguation

Disambiguation based on translations in a second-language corpus (1/3)

• This method makes use of word correspondences in a bilingual dictionary.
• First language
  • The one for which we want to do disambiguation
• Second language
  • Target language in the bilingual dictionary
• For example, if we want to disambiguate English based on a German corpus, then English is the 1st language, and German is the 2nd language.

Dictionary-Based Disambiguation

Disambiguation based on translations in a second-language corpus (2/3)

• For the word "interest":

                 Sense 1                 Sense 2
    Definition   legal share             attention, concern
    Translation  Beteiligung             Interesse
    Collocation  acquire an interest     show interest
    Translation  Beteiligung erwerben    Interesse zeigen

