The words you are searching are inside this book. To get more targeted content, please make full-text search by clicking here.

Collaborative Filtering: Models and Algorithms Andrea Montanari Jose Bento, ashY Deshpande, Adel Jaanmard,v Raghunandan Keshaan,v Sewoong Oh, Stratis Ioannidis, Nadia ...

Discover the best professional documents and content resources in AnyFlip Document Base.
Search
Published by , 2016-02-01 21:45:03

Collaborative Filtering: Models and Algorithms

Collaborative Filtering: Models and Algorithms Andrea Montanari Jose Bento, ashY Deshpande, Adel Jaanmard,v Raghunandan Keshaan,v Sewoong Oh, Stratis Ioannidis, Nadia ...

Accuracy guarantees: Vignette (m = n = 2000 rank-8)

low-rank matrix M sampled matrix ME

OptSpace output M^ squared error (M   M^ )2

Andrea Montanari (Stanford) 1:25% sampled September 15, 2012 31 / 58

Collaborative Filtering

Accuracy guarantees: Vignette (m = n = 2000 rank-8)

low-rank matrix M sampled matrix ME

OptSpace output M^ squared error (M   M^ )2

Andrea Montanari (Stanford) 1:50% sampled September 15, 2012 31 / 58

Collaborative Filtering

Accuracy guarantees: Vignette (m = n = 2000 rank-8)

low-rank matrix M sampled matrix ME

OptSpace output M^ squared error (M   M^ )2

Andrea Montanari (Stanford) 1:75% sampled September 15, 2012 31 / 58

Collaborative Filtering

Accuracy guarantees: Unstructured factors

Incoherence

k k hk k ii ;u 2 k k hk k ijv
:v¡
u¡ 2 2 2
av 2 av

[Candés, Recht 2008]

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 32 / 58

Accuracy guarantees

Theorem (Keshavan, M, Oh, 2009)

jM j MAssume ij max. Then, w.h.p., rank-r projection achieves

q

j j + k k p j jM =C max nr E
RMSE ZC H E 2 n r = E :

Theorem (Keshavan, M, Oh, 2009)

( ) ( ) = ( )MLet
be incoherent with 1 M =r M O 1 . If
j j ! f ( ) gE Cn min r log n 2; r 2 log n then, w.h.p., OptSpace achieves
p
ZC HH n r E 2;
j j k kRMSEE

( ( ) )logwith complexity O nr 3
n 2.

j jp

E.g. Gaussian noise: C HHz r n= E

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 33 / 58

Accuracy guarantees

Theorem (Keshavan, M, Oh, 2009)

jM j MAssume ij max. Then, w.h.p., rank-r projection achieves

q

j j + k k p j jM =C max nr E
RMSE ZC H E 2 n r = E :

Theorem (Keshavan, M, Oh, 2009)

( ) ( ) = ( )MLet
be incoherent with 1 M =r M O 1 . If
j j ! f ( ) gE Cn min r log n 2; r 2 log n then, w.h.p., OptSpace achieves
p
ZC HH n r E 2;
j j k kRMSEE

( ( ) )logwith complexity O nr 3
n 2.

j jp

E.g. Gaussian noise: C HHz r n= E

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 33 / 58

Two surprises

Can do much better than SVD !

Error = Noise = Sampling factor

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 34 / 58

Analogous guarantees for convex relaxations

Candés, Recht, 2008 £
Candés, Plan, 2009
Candés, Tao, 2009
Gross, 2010
Negahbahn, Wainwright, 2010
Koltchinskii, Lounici, Tsybakov, 2011
...

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 35 / 58

A noisy example

n = 500; r = 4; z = 1, example from [Candés, Plan, 2009]

1.4 rank-r projection
SDP
1.2
ADMiRA
1 OptSpace
Oracle Bound
RMSE 0.8

0.6

0.4

0.2

0

j j0 100 200 300 400 500

E =n

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 36 / 58

Challenge #1: Privacy

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 37 / 58

Research question
User ratings 3 User feature vector

User ratings 3 User private attributes ???

Let us try to do it!

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 38 / 58

Research question
User ratings 3 User feature vector

User ratings 3 User private attributes ???

Let us try to do it!

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 38 / 58

Research question
User ratings 3 User feature vector

User ratings 3 User private attributes ???

Let us try to do it!

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 38 / 58

Which attribute?

Number of persons in the household

Non-obvious [ no prize :-( we won it :-) ]
Recommender has incentive
2011 CAMRa Challenge

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 39 / 58

Which attribute?

Number of persons in the household

Non-obvious [ no prize :-( we won it :-) ]
Recommender has incentive
2011 CAMRa Challenge

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 39 / 58

CAMRa dataset

m % 2 ¡ 105 users, n % 2 ¡ 104 movies
jE j % 4:5 ¡ 106 ratings (p % 0:001)

272 households of size 2
14 households of size 3
4 households of size 4

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 40 / 58

CAMRa dataset

m % 2 ¡ 105 users, n % 2 ¡ 104 movies
jE j % 4:5 ¡ 106 ratings (p % 0:001)

272 households of size 2
(14(h(ou(se(ho(ld(s (of(si(ze((3
(4 (ho(us(eh(ol(ds(o(f s(iz(e(4

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 41 / 58

CAMRa dataset

m % 2 ¡ 105 users, n % 2 ¡ 104 movies
jE j % 4:5 ¡ 106 ratings (p % 0:001)

272 households of size 2

Can you identify whether two users shared an account?
Can you identify which user watched a movie?

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 42 / 58

Short answer

Yes: For a signicant fraction of the accounts.

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 43 / 58

Two examples (Netix dataset, no ground truth)

No movie info used!

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 44 / 58

How did you do it?

Rij $ hui ; vj i + "ij

( )  Pn Rr +1; j P WatchedBy(i)o Hyperplane

Rij ; vj

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 45 / 58

How did you do it?

Rij $ hui ; vj i + "ij

( )  Pn Rr +1; j P WatchedBy(i)o Hyperplane

Rij ; vj

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 45 / 58

How did you do it? Subspace clustering

One user 3 One hyperplane
Two users 3 Two hyperplanes

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 46 / 58

Example: A two-user household (CAMRa)

Number of movie rating eventsDistance Difference Distribution for household 172, with similarity score 0.898117
80
User 1
User 2
70 Normal fit (−)
Normal fit (+)

60

50

40

30

20

10

0 2
−2 −1.5 −1 −0.5 0 0.5 1 1.5

Difference of distances of xj from the two hyperplanes

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 47 / 58

Example: Another two-user household (CAMRa)

120 Distance Difference Distribution for household 124, with similarity score 0.500787
100
User 1
User 2
Normal fit in µ ± 1.5σ

Number of movie rating events 80

60

40

20

0 −2 −1.5 −1 −0.5 0 0.5 1 1.5
−2.5 Difference of distances of xj from the two hyperplanes

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 48 / 58

Classifying households Size 1, Empirical
Size 2, Empirical
0.08 Size 1, Fitted Gamma
MOVIEX Size 2, Fitted Gamma

0.06PDF

0.04

0.02

00 (M5S0E1−MSE120)0× m / lo1g5(m0) 200

MSE1 3 MSE using one hyperplane
MSE2 3 MSE using two hyperplanes

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 49 / 58

Challenge #2: Interactivity

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 50 / 58

We want to design the system

User Recommender

We took time out of the picture.

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 51 / 58

We want to design the system

User Recommender

We took time out of the picture.

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 51 / 58

Interactive system Recom.
Recom.
t = 1 User Recom.
t = 2 User
t = 3 User

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 52 / 58

Abstract from other users

At time t
Recommender suggests movie vt ;
User gives feedback Rt

Focus on user u

Rt $ hu; vt i + "t

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 53 / 58

Abstract from other users

At time t
Recommender suggests movie vt ;
User gives feedback Rt

Focus on user u

Rt $ hu; vt i + "t

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 53 / 58

Linear bandits

Decision:

vt

Observations:

Rt $ hu; vt i + "t

Reward: =yt h iˆt

Andrea Montanari (Stanford) u ; v`

`=1

[Rusmevichientong, Tsitsiklis, 2008]

Collaborative Filtering September 15, 2012 54 / 58

Important dierences

Number of observations $ Dimensions

Cannot explore completely at random.

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 55 / 58

Simulating an interactive system (Netix data)

10 8 8
8
6Cumulative Reward 6 6
4 Cumulative Reward
2 Cumulative Reward4 4
0
2 2
0
5 10 15 20 25 30 0 5 10 15 20 25 30 0 5 10 15 20 25 30
t 0 t 0 t

3 `typical' users
Constant-optimal policy

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 56 / 58

Conclusion

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 57 / 58

Conclusion

A crucial technology for modern information networks.
Only scratched the surface.

Thanks!

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 58 / 58

Conclusion

A crucial technology for modern information networks.
Only scratched the surface.

Thanks!

Andrea Montanari (Stanford) Collaborative Filtering September 15, 2012 58 / 58


Click to View FlipBook Version