Information retrieval is the selection of candidate items from a larger set of items. Examples include exact or approximate similarity search (as performed by a full-text or vector store) and ranking (as in recommendation systems).
Information retrieval techniques can be roughly broken down into two large classes, though considerable overlap exists:
- index-based information retrieval, in which some distilled representation of each candidate item is precomputed and subsequently queried; and
- learning-to-rank, in which a statistical model predicts a relevance score or ordering for the candidates each time a new query is given.
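
The two classes above can be contrasted with a minimal Python sketch. Everything in it is illustrative: the toy corpus `DOCS`, the function names, and the two hand-made features are assumptions, and the weights passed to `rank_with_model` are fixed by hand, whereas a real learning-to-rank system would fit them from training data.

```python
from collections import defaultdict

# Hypothetical toy corpus: document id -> text.
DOCS = {
    1: "fast vector search",
    2: "full text search engine",
    3: "learning to rank with gradient boosting",
}

def build_inverted_index(docs):
    """Index-based: precompute term -> set of document ids, once, ahead of queries."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def query_index(index, query):
    """Look up the precomputed index; return ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

def model_score(query, text, weights):
    """Stand-in for a learned ranker: a linear model over two simple features."""
    query_terms = set(query.lower().split())
    doc_terms = text.lower().split()
    overlap = len(query_terms & set(doc_terms))   # feature 1: shared terms
    brevity = 1.0 / len(doc_terms)                # feature 2: shorter documents score higher
    return weights[0] * overlap + weights[1] * brevity

def rank_with_model(docs, query, weights):
    """Learning-to-rank style: score every candidate at query time, then sort descending."""
    return sorted(docs, key=lambda d: model_score(query, docs[d], weights), reverse=True)

INDEX = build_inverted_index(DOCS)
```

Here `query_index(INDEX, "text search")` only consults the precomputed structure, while `rank_with_model(DOCS, "vector search", [1.0, 0.1])` evaluates the model on each candidate for that specific query.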
In practice, the two approaches are often combined sequentially (as in a cascade ranking system). Some procedures also depend inherently on both, as when items are ranked by retrieving the nearest neighbors of a learned embedding.
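
A two-stage cascade of this kind might look like the sketch below. It is a simplified illustration under stated assumptions: the embeddings and the popularity feature are made-up values, the first stage uses brute-force cosine similarity where a production system would more likely use an approximate nearest-neighbor index, and the second-stage score is a hand-weighted blend rather than a trained model.

```python
import math

# Hypothetical item embeddings; in practice these would come from a trained model.
ITEM_EMBEDDINGS = {
    "a": [1.0, 0.0],
    "b": [0.8, 0.6],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}
# Hypothetical per-item feature used only by the second stage.
ITEM_POPULARITY = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.7}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def nearest_neighbors(query_vec, embeddings, k):
    """Stage 1: cheap candidate selection, the k items most similar to the query."""
    ranked = sorted(embeddings, key=lambda i: cosine(query_vec, embeddings[i]), reverse=True)
    return ranked[:k]

def rerank(query_vec, candidates, embeddings, popularity, alpha=0.5):
    """Stage 2: rescore only the shortlist with a richer (here: blended) score."""
    def score(item):
        similarity = cosine(query_vec, embeddings[item])
        return alpha * similarity + (1 - alpha) * popularity[item]
    return sorted(candidates, key=score, reverse=True)

def cascade(query_vec, k=2):
    """Run both stages in sequence: shortlist by similarity, then rerank the shortlist."""
    shortlist = nearest_neighbors(query_vec, ITEM_EMBEDDINGS, k)
    return rerank(query_vec, shortlist, ITEM_EMBEDDINGS, ITEM_POPULARITY)
```

Note that the second stage never sees items the first stage discarded: for the query `[1.0, 0.0]` with `k=2`, item `"d"` is dropped in stage 1 despite its high popularity, and only `"a"` and `"b"` are reordered.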