As the World Wide Web continues to grow at an exponential rate, the size and complexity of many web sites grow along with it. For the users of these web sites, it becomes increasingly difficult and time-consuming to find the information they are looking for. User interfaces could help users find information that matches their interests by personalizing a web site.
Some web sites present users with personalized information by letting them choose from a set of predefined topics of interest. Users, however, do not always know beforehand what they are interested in, and their interests may change over time, which would require them to change their selection frequently. Recommender systems provide personalized information by learning the user’s interests from traces of interaction with that user.
For a recommender system to make predictions about a user’s interests, it has to learn a user model. A user model contains data about the user and should be represented in such a way that the data can be matched to the items in the collection. The first question is what kind of data can be used to construct a user profile. Obviously, the items that users have seen in the past are important, but other information, such as the content of the items, the users’ perception of the items, or information about the users themselves, could also be used.
The next question is how to represent this data. When the items are text documents, the words in the texts should be represented in such a way that they can be used to differentiate between documents about different topics. Another important issue is how time influences the user profile. The interests of users usually do not remain the same but change over time. The data in the user model should therefore be adjusted continually so that it remains in accordance with the user’s interests.
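The idea of a word-based profile that is adjusted over time can be illustrated with a minimal sketch. It assumes a simple bag-of-words representation and an exponential decay of older term weights; the function name `update_profile` and the `decay` parameter are hypothetical choices made for illustration, not a representation prescribed here.

```python
from collections import Counter


def update_profile(profile, document_text, decay=0.9):
    """Return a new profile after the user views one more document.

    profile: dict mapping term -> weight (assumed bag-of-words profile).
    decay:   assumed factor in (0, 1]; older interests fade by this factor.
    """
    # Fade every existing term weight so that older interests count less.
    updated = {term: weight * decay for term, weight in profile.items()}
    # Add the raw term counts of the newly viewed document.
    for term, count in Counter(document_text.lower().split()).items():
        updated[term] = updated.get(term, 0.0) + count
    return updated


profile = update_profile({}, "football football news")
profile = update_profile(profile, "cooking recipes", decay=0.5)
```

After the second update, the weight of "football" has been halved while the newly seen terms enter the profile at full weight, so recent interests dominate older ones. Raw term counts are used only to keep the sketch short; a weighting scheme that discounts common words would differentiate topics better.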
Most recommender systems focus on the task of information filtering, which deals with the delivery of items, selected from a large collection, that the user is likely to find interesting or useful. Recommender systems are a special type of information filtering system that suggests items to users. Some of the largest e-commerce sites use recommender systems as part of a marketing strategy referred to as mass customization.
There are two main approaches to information filtering: collaborative filtering and content-based filtering. Collaborative filtering selects items based on the similarities between the preferences of different users. Content-based filtering selects items based on the similarities between the content description of an item and the user’s preferences. A hybrid approach, combining collaborative filtering and content-based filtering, also exists.
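The difference between the two approaches can be made concrete with a small sketch. It assumes that profiles, item descriptions, and user ratings are all stored as sparse dictionaries and that cosine similarity is the similarity measure; the function names and the sample data are hypothetical, and other similarity measures are equally possible.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def content_based_scores(user_profile, item_terms):
    """Score each item by comparing its term weights with the user's profile."""
    return {item: cosine(user_profile, terms)
            for item, terms in item_terms.items()}


def collaborative_scores(target, ratings):
    """Score items the target user has not rated, using similar users' ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], other_ratings)
        if sim <= 0:
            continue  # users with no overlapping preferences contribute nothing
        for item, rating in other_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return scores


# Hypothetical sample data.
item_terms = {"doc1": {"python": 1.0}, "doc2": {"cooking": 1.0}}
ratings = {
    "ann": {"a": 5, "b": 4},
    "bob": {"a": 5, "b": 5, "c": 4},
    "cat": {"d": 5},
}
by_content = content_based_scores({"python": 1.0, "web": 0.5}, item_terms)
by_users = collaborative_scores("ann", ratings)
```

In this sketch the content-based function needs only the item descriptions and one user's profile, whereas the collaborative function never looks at item content and relies entirely on the ratings of other users. A hybrid system could, for instance, combine the two score dictionaries.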
A content-based filtering system often uses many of the same techniques as an information retrieval system (such as a search engine), because both systems require a content description of the items in their domain. Unlike an information retrieval system, however, a recommender system also has to model the user’s preferences over a longer period of time. There are several techniques that can be used to improve recommender systems in different ways. These techniques fall into the category of web mining, a research field that is closely related to data mining. Web mining is the application of algorithms for extracting knowledge from internet data sources such as server log files and large document collections.