Slope One


Slope One

Collaborative filtering is a technique used by recommender systems to combine different users' opinions and tastes in order to achieve personalized recommendations. There are at least two classes of collaborative filtering: user-based techniques are derived from similarity measures between users and item-based techniques compare the ratings given by different users. Slope One is a family of algorithms used for Collaborative filtering introduced in [http://www.daniel-lemire.com/fr/abstracts/SDM2005.html Slope One Predictors for Online Rating-Based Collaborative Filtering] by Daniel Lemire and Anna Maclachlan. Arguably, it is the simplest form of non-trivial item-based collaborative filtering based on ratings. Their simplicity makes it especially easy to implement them efficiently while their accuracy is often on par with more complicated and expensive algorithms.

Item-based collaborative filtering of rated resources and overfitting

When ratings of items are available, such as is the case when people are given the option of ratings resources (between 1 and 5, for example), collaborative filtering aims to predict the ratings of one individual based on his past ratings and on a (large) database of ratings contributed by other users.

Example. Can we predict the rating an individual would give to the new Celine Dion album given that he gave the Beatles 5 out of 5?

In this context, item-based Collaborative Filtering [Slobodan Vucetic, Zoran Obradovic: Collaborative Filtering Using a Regression-Based Approach. Knowl. Inf. Syst. 7(1): 1-22 (2005)] [Badrul M. Sarwar, George Karypis, Joseph A. Konstan, John Riedl: Item-based collaborative filtering recommendation algorithms. WWW 2001: 285-295] predicts the ratings on one item based on the ratings on another item, typically using linear regression (f(x)=ax+b). Hence, if there are 1,000 items, there could be up to 1,000,000 linear regressions to be learned, and so, up to 2,000,000 regressors. This approach may suffer from severe overfittingDaniel Lemire, Anna Maclachlan, [http://arxiv.org/abs/cs/0702144 Slope One Predictors for Online Rating-Based Collaborative Filtering] , In SIAM Data Mining (SDM'05), Newport Beach, California, April 21-23, 2005.] unless we select only the pairs of items for which several users have rated both items.

A better alternative may be to learn a simpler predictor such as f(x)=x+b: experiments show that this simpler predictor (called Slope One) sometimes outperforms linear regression while having half the number of regressors. This simplified approach also reduces storage requirements and latency.

Item-based collaborative is just one form of collaborative filtering. Other alternatives include user-based collaborative filtering where relationships between users are of interest, instead. However, item-based collaborative filtering is especially scalable with respect to the number of users.

Item-based collaborative filtering of purchase statistics

We are not always given ratings: when the users provide only binary data (the item was purchased or not), then Slope One and other rating-based algorithm do not apply. Examples of binary item-based collaborative filtering include Amazon's [http://doi.ieeecomputersociety.org/10.1109/MIC.2003.1167344 item-to-item] patented algorithm [Greg Linden, Brent Smith, Jeremy York, "Amazon.com Recommendations: Item-to-Item Collaborative Filtering," IEEE Internet Computing, vol. 07, no. 1, pp. 76-80, Jan/Feb, 2003] which computes the cosine between binary vectors representing the purchases in a user-item matrix.

Being arguably simpler than even Slope One, the Item-to-Item algorithm offers an interesting point of reference. Let us consider an example.

In this case, the cosine between items 1 and 2 is

frac{(1,0,0)cdot (0,1,1) }{ Vert (1,0,0)Vert Vert (0,1,1)Vert }= 0,

the cosine betweenitems 1 and 3 is

frac{(1,0,0)cdot (1,1,0) }{ Vert (1,0,0)Vert Vert (1,1,0)Vert }= frac{1}{sqrt{2,

whereas the cosine between items 2 and 3 is

frac{(0,1,1)cdot (1,1,0)}{ Vert (0,1,1)Vert Vert (1,1,0)Vert }= frac{1}{2}.

Hence, a user visiting item 1 would receive item 3 as a recommendation, a user visiting item 2 would receive item 3 as a recommendation, and finally, a user visiting item 3 would receive item 1 (and then item 2) as a recommendation. The model uses a single parameter per pair of item (the cosine) to make the recommendation. Hence, if there are "n" items, up to "n(n-1)/2" cosines need to be computed and stored.

lope one collaborative filtering for rated resources

To drastically reduce overfitting, improve performance and ease implementation, the Slope One family of easily implemented Item-based Rating-Based Collaborative Filtering algorithms was proposed. Essentially, instead of using linear regression from one item's ratings to another item's ratings (f(x)=ax+b), it uses a simpler form of regression with a single free parameter (f(x)=x+b). The free parameter is then simply the average difference between the two items' ratings. It was shown to be much more accurate than linear regression in some instances, and it takes half the storage or less.

Example:

# Joe gave a 1 to Dion and an 1.5 to Lohan.
# Jill gave a 2 to Dion.
# How do you think Jill rated Lohan?
# The Slope One answer is to say 2.5 (1.5-1+2=2.5).

For a more realistic example, consider the following table.

In this case, the average difference in ratings between item 2 and 1 is (2+(-1))/2=0.5. Hence, on average, item 1 is rated above item 2 by 0.5. Similarly, the average difference between item 3 and 1 is 3. Hence, if we attempt to predict the rating of Lucy for item 1 using her rating for item 2, we get 2+0.5 = 2.5. Similarly, if we try to predict her rating for item 1 using her rating of item 3, we get 5+3=8.

If a user rated several items, the predictions are simply combined using a weighted average where a good choice for the weight is the number of users having rated both items. In the above example, we would predict the following rating for Lucy on item 1:

frac{2 imes 2.5 + 1 imes 8 }{2+1} = frac{13 }{3} = 4.33

Hence, given "n" items, to implement Slope One, all that is needed is to compute and store the average differences and the number of common ratings for each of the "n"2 pairs of items.

Algorithmic Complexity of Slope One

Suppose there are "n" items, "m" users, and "N" ratings. Computing the average rating differences for each pair of items requires up to "n(n-1)/2" units of storage, and up to "m n"2 time steps. This computational bound may be pessimistic: if we assume that users have rated up to "y" items, then it is possible to compute the differences in no more than "n"2+"my"2. If a user has entered "x" ratings, predicting a single rating requires "x" time steps, and predicting all of his missing ratings requires up to ("n-x")"x" time steps. Updating the database when a user has already entered "x" ratings, and enters a new one, requires "x" time steps.

It is possible to reduce storage requirements by partitionning the data (see Partition (database)) or by using sparse storage: pairs of items having no (or few) corating users can be omitted.

Recommender Systems using Slope One

* [http://hitflip.de hitflip] a DVD recommender system
* [http://www.indiscover.net inDiscover] an MP3 recommender system
* [http://iit-iti.nrc-cnrc.gc.ca/projects-projets/racofi-composer_e.html RACOFI Composer] a generic recommender system by the National Research Council
* [http://starfrosch.ch Starfrosch] an open MP3 blog community
* [http://www.valueinvestingnews.com/content-recommendations-upgrade Value Investing News] a stock market news site

Open Source software implementing Slope One

Python (programming language):
* A well documented [http://www.serpentine.com/blog/2006/12/12/collaborative-filtering-made-easy/ Python implementation] together with a tutorial

Java (programming language):
* [http://taste.sourceforge.net/ Taste] : a java-based collaborative library with support for Enterprise Java Beans ( [http://blogs.sun.com/plamere/entry/open_source_recommendation_engine_in code sample] )
* A standalone [http://www.daniel-lemire.com/fr/documents/publications/SlopeOne.java Java class] implementing Slope One.
* The [http://www.nongnu.org/cofi/ Cofi: A Java-Based Collaborative Filtering Library] supports Slope One algorithms (documentation is so-so)

PHP:
* The [http://www.vogoo-api.com Vogoo library] supports Slope One algorithms (PHP)
* The [http://www.aspedia.net Aspedia ECM API] supports Slope One algorithms (PHP)
* There is [http://www.daniel-lemire.com/fr/documents/publications/webpaper.txt PHP source code] accompanying a technical report [ Daniel Lemire, Sean McGrath, " [http://www.daniel-lemire.com/fr/abstracts/TRD01.html Implementing a Rating-Based Item-to-Item Recommender System in PHP/SQL] ", Technical Report D-01, January 2005.] on Slope One algorithms
* A module for [http://www.drupal.org/project/cre Drupal CMS] that implements Slope One.
* The [http://code.google.com/p/openslopeone OpenSlopeOne] supports Slope One algorithms, It's extremely fast, use pure php or mysql procedure (PHP)

Erlang (programming language):
* Philip Robinson [http://chlorophil.blogspot.com/2007/06/collaborative-filtering-weighted-slope.html implemented Slope One] in Erlang.

Scala (programming language):
* Steve Jenson [http://saladwithsteve.com/2007/08/weighted-slope-one-in-scala.html implemented Slope One] in Scala.

Haskell (programming language):
* Bryan O'Sullivan [http://www.serpentine.com/blog/2007/08/27/weighted-slope-one-in-haskell-collaborative-filtering-in-29-lines-of-code/ implemented Slope One] in Haskell.

Visual Basic for Applications:
* A Microsoft Excel spreadsheet [http://www.geocities.com/videogameboy76/WeightedSlopeOneDemo.xls Slope One implementation] using VBA (requires enabled macros).

C Sharp (programming language):
* Kuber [http://www.cnblogs.com/kuber/articles/SlopeOne_CSharp.html implemented Weighted Slope One] in C#.

T-SQL:
* Charlie Zhu [http://blog.charliezhu.com/2008/07/21/implementing-slope-one-in-t-sql/ implemented Weighted Slope One] in T-SQL.

Footnotes


Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Slope One — Slope One  семейство алгоритмов для коллаборативной фильтрации (используемой в рекомендательных системах) для анализа различных мнений и пожеланий пользователей и выработки персональных рекомендаций. Существует как минимум 2 класса… …   Википедия

  • Slope One — Este artículo está huérfano, pues pocos o ningún artículo enlazan aquí. Por favor, introduce enlaces hacia esta página desde otros artículos relacionados …   Wikipedia Español

  • Slope — is used to describe the steepness, incline, gradient, or grade of a straight line. A higher slope value indicates a steeper incline. The slope is defined as the ratio of the rise divided by the run between two points on a line, or in other words …   Wikipedia

  • Slope Day — is an annual day of celebration held at Cornell University during the last day of regular undergraduate classes. It usually falls on the first Friday of May and the official site of Slope Day is the Libe Slope, on the university campus. Though… …   Wikipedia

  • Slope side — is a term used in the North American ski lodging industry to describe any accommodation from which one can reasonably walk to the ski lifts. Such lodgings are usually at the bottom of, or right beside, the ski hill hence the term slope side . Due …   Wikipedia

  • Slope — Slope, n. [Formed (like abode fr. abide) from OE. slipen. See {Slip}, v. i.] 1. An oblique direction; a line or direction including from a horizontal line or direction; also, sometimes, an inclination, as of one line or surface to another. [1913… …   The Collaborative International Dictionary of English

  • Slope of a plane — Slope Slope, n. [Formed (like abode fr. abide) from OE. slipen. See {Slip}, v. i.] 1. An oblique direction; a line or direction including from a horizontal line or direction; also, sometimes, an inclination, as of one line or surface to another.… …   The Collaborative International Dictionary of English

  • slope — ► NOUN 1) a surface of which one end or side is at a higher level than another. 2) a part of the side of a hill or mountain, especially as a place for skiing. ► VERB 1) be inclined from a horizontal or vertical line; slant up or down. 2) informal …   English terms dictionary

  • One Hundred Famous Views of Edo — The Plum Garden in Kameido Artist Hiroshige Year 1856–58 Type ukiyo e One Hundred Famous Views of Edo (in Japanese 名所江戸百景 Meisho Edo Hyakkei ) is a series of …   Wikipedia

  • slope — [[t]slo͟ʊp[/t]] slopes, sloping, sloped 1) N COUNT: usu with supp A slope is the side of a mountain, hill, or valley. Saint Christo is perched on a mountain slope. ...the lower slopes of the Himalayas. 2) N COUNT: usu sing A slope is a surface… …   English dictionary