Relevance feedback was introduced by Rocchio as an information retrieval utility that implements retrieval in several passes. In Rocchio’s approach, a query is represented as a vector Q in the vector space model. This query retrieves a set of documents, each of which the user rates as either relevant or irrelevant. The ratings yield a set R that contains the relevant document vectors and a set S that contains the irrelevant document vectors. A new query Q' is constructed that refines the old query Q by using the following equation:

$$Q' = Q + \frac{1}{|R|} \sum_{D \in R} D - \frac{1}{|S|} \sum_{D \in S} D$$
All vector modifications are normalized by the number of relevant and irrelevant documents, respectively, to ensure that the new information does not completely override the original query. After the refinement, the new query is executed and the refinement process can be repeated.
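The refinement step can be illustrated with a minimal sketch. It assumes the unweighted form of the update, in which the mean of the relevant document vectors is added to the query and the mean of the irrelevant document vectors is subtracted. The names `mean_vector` and `rocchio_refine` are hypothetical, and negative term weights are left as they are, which is a simplification rather than part of the method described here.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]], dim: int) -> Vector:
    """Component-wise mean of the vectors; the zero vector if the set is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_refine(
    query: Sequence[float],
    relevant: Sequence[Sequence[float]],
    irrelevant: Sequence[Sequence[float]],
) -> Vector:
    """Return the refined query: Q plus mean of R minus mean of S.

    Dividing each sum by the size of its set is the normalization
    that keeps the feedback from swamping the original query.
    """
    dim = len(query)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(irrelevant, dim)
    return [q + p - n for q, p, n in zip(query, pos, neg)]


# Toy three-term vocabulary; the vectors are made-up illustration data.
query = [1.0, 0.0, 0.0]
relevant = [[0.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
irrelevant = [[1.0, 0.0, 1.0]]

refined = rocchio_refine(query, relevant, irrelevant)
print(refined)  # [0.0, 1.0, -0.5]
```

Feeding `refined` back into `rocchio_refine` together with the ratings from the next retrieval pass gives the repeated refinement loop.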
An example of an information filtering system that incorporates relevance feedback is LIRA [Balabanovic & Shoham 1995], an agent that autonomously searches the World Wide Web for interesting web pages. In LIRA, a user profile vector P describes the user’s interests. Every time a new document is rated, the profile vector P is adjusted by a simple addition:

$$P' = P + e \cdot D$$

where e is an integer in the range of -5 to +5 representing the evaluation the user has given to document D.
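The profile adjustment can be sketched in a few lines, assuming that the "simple addition" adds the rated document's vector, scaled by the user's rating, to the profile. The function name `update_profile` and the sample vectors are hypothetical; this is an illustration of the idea, not LIRA's actual implementation.

```python
from typing import List, Sequence


def update_profile(
    profile: Sequence[float], document: Sequence[float], rating: int
) -> List[float]:
    """Return the profile after adding the document vector scaled by the rating.

    The rating is assumed to be an integer from -5 to +5, so disliked
    documents push the profile away from their terms.
    """
    if not -5 <= rating <= 5:
        raise ValueError("rating must be between -5 and +5")
    return [p + rating * d for p, d in zip(profile, document)]


# Start from an empty profile over a toy three-term vocabulary.
profile = [0.0, 0.0, 0.0]
profile = update_profile(profile, [0.5, 0.0, 1.0], 4)   # liked document
profile = update_profile(profile, [1.0, 1.0, 0.0], -2)  # disliked document
print(profile)  # [0.0, -2.0, 4.0]
```

Unlike the Rocchio update, this sketch applies no normalization, so the profile's magnitude grows with the number of rated documents.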