Collaborative Filtering considered harmful

Tags: collaborative filtering, search

Published July 10, 2011

Much excellent work has been published on collaborative filtering, in particular on recovering missing entries in a matrix. The Netflix contest contributed a significant amount to progress in the field.

Alas, reality is not quite as simple as that. Very rarely will we ever be able to query a user about arbitrary movies, books, or other objects. Instead, user ratings are typically expressed as preferences rather than absolute statements: a preference for Die Hard, given a generic set of movies, only tells us that the user appreciates action movies; however, a preference for Die Hard over Terminator or Rocky suggests that the user might favor Bruce Willis over other action heroes. In other words, the context of user choice is vital when estimating user preferences.

If we attempt to estimate scores \(s[u,i]\) of user \(u\) regarding item \(i\), it is important to use the context within which the ratings have been obtained. For instance, if we are given a sequence of items \((i_1, \ldots, i_n)\) out of which item \(i^*\) was selected, we might want to consider a logistic model of the form:

\[-\log p(i^* \mid i_1, \ldots, i_n) = \log\left[\sum_i \exp(s[u,i])\right] - s[u,i^*]\]
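For concreteness, here is a minimal sketch of this loss in Python, with the sum taken over the items that were actually presented to the user. The function name, the array layout, and the use of `scipy.special.logsumexp` are my own choices for illustration, not anything from the original work.

```python
import numpy as np
from scipy.special import logsumexp

def choice_nll(scores_shown, chosen_idx):
    """Negative log-likelihood -log p(i* | i_1, ..., i_n) for a softmax
    choice model over the items that were actually presented.

    scores_shown : 1-D array of scores s[u, i] for the presented items
    chosen_idx   : index (into scores_shown) of the selected item i*
    """
    # log sum_i exp(s[u, i]) over the presented items, minus s[u, i*]
    return logsumexp(scores_shown) - scores_shown[chosen_idx]

# Example: three items were shown to the user; the second one was chosen.
scores = np.array([0.3, 1.2, -0.5])   # s[u, i] for the presented items
loss = choice_nll(scores, chosen_idx=1)
```

The gradient of this loss with respect to the scores is the softmax probability vector minus a one-hot indicator of the chosen item, so it plugs directly into whatever scorer (matrix factorization, feature-based, etc.) produces \(s[u,i]\).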

The option of no action is easy to add, simply by introducing a null score \(s[u,0]\), which captures the event that the user takes no action at all. Shuang Hong Yang tried out this idea and obtained a significant performance improvement on a number of collaborative filtering datasets. Bottom line - make sure that the problem you're solving is actually the one that a) generated the data and b) will help you in practice. That is, in many cases matrix completion is not the problem you want to solve, even though it might win you benchmarks. Obviously the above model is still a gross oversimplification, and you're best advised to use the actual interaction order for ranking. But that's a story for another day.
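As a rough sketch of the no-action extension (again my own illustration under the same assumptions as above, not code from the post): appending the null score \(s[u,0]\) to the presented scores simply makes "the user did nothing" one more outcome of the same softmax.

```python
import numpy as np
from scipy.special import logsumexp

def choice_nll_with_null(scores_shown, null_score, chosen_idx=None):
    """Softmax choice model over the presented items plus a 'no action'
    outcome scored by s[u, 0].

    chosen_idx : index into scores_shown of the selected item,
                 or None if the user took no action.
    """
    scores = np.append(scores_shown, null_score)  # null option is the last entry
    idx = len(scores) - 1 if chosen_idx is None else chosen_idx
    return logsumexp(scores) - scores[idx]

# Example: the user ignored all three presented items.
loss = choice_nll_with_null(np.array([0.3, 1.2, -0.5]), null_score=0.0)
```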