One long-standing strategy for collaborative filtering is to create a sparse matrix $R \in \mathbb{R}^{m \times n}$ of interactions between $m$ users and $n$ items, and to approximate it as the product of two low-rank factor matrices, $R \approx U V^\top$, where $U \in \mathbb{R}^{m \times k}$ holds a $k$-dimensional latent vector per user and $V \in \mathbb{R}^{n \times k}$ one per item.
Such an approach encodes an assumption that interaction patterns can be adequately expressed as a linear combination of latent factors.
Finding the factor matrices $U$ and $V$ can be approached in two broad ways.
Numerical approximation approach
For relatively small matrices, a truncated singular value decomposition (SVD) yields the factorization directly: $R \approx U_k \Sigma_k V_k^\top$, where $U_k$ and $V_k$ contain the top $k$ left and right singular vectors and $\Sigma_k$ is the diagonal matrix of the corresponding singular values. Absorbing $\Sigma_k$ into one of the factors gives the user and item matrices.
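As a minimal sketch of this approach (the toy interaction matrix and the choice of torch.svd_lowrank are illustrative, not part of the original pipeline):

    import torch

    # Toy interaction matrix: 100 users x 50 items, mostly zeros
    R = (torch.rand(100, 50) < 0.05).float()

    k = 10  # number of latent factors
    U, S, V = torch.svd_lowrank(R, q=k)  # R ~= U @ diag(S) @ V.T

    # Fold the singular values into the user side so that R ~= user_factors @ item_factors.T
    user_factors = U * S   # (100, k)
    item_factors = V       # (50, k)

    reconstruction = user_factors @ item_factors.T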
Optimization approach
For larger matrices, computing a full decomposition is impractical, so we instead learn the factors by gradient-based optimization. In this case, we start with interaction data for $m$ users and $n$ items and fit a $k$-dimensional embedding for each user and each item by minimizing a reconstruction loss over the observed (and sampled unobserved) interactions.
In PyTorch, this would typically be done using nn.Embedding layers:
    import torch
    from torch import nn

    class MatrixFactorization(nn.Module):
        def __init__(self, num_users, num_items, num_factors):
            super(MatrixFactorization, self).__init__()
            # One learned k-dimensional latent vector per user and per item
            self.user_factors = nn.Embedding(num_users, num_factors)
            self.item_factors = nn.Embedding(num_items, num_factors)

        def forward(self, user, item):
            # Predicted affinity is the dot product of the user and item embeddings
            user_embedding = self.user_factors(user)
            item_embedding = self.item_factors(item)
            return (user_embedding * item_embedding).sum(1)
The training data would consist of an equal number of sampled positive and negative interaction examples; i.e., cases where the user did or did not interact with the product.
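A minimal training sketch under that setup (the batch tensors below are randomly generated stand-ins for real sampled positive/negative pairs, and the sizes and hyperparameters are arbitrary):

    model = MatrixFactorization(num_users=100, num_items=50, num_factors=10)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()  # forward() returns raw scores, not probabilities

    # Illustrative batch: (user, item) index pairs with 0/1 interaction labels
    users = torch.randint(0, 100, (64,))
    items = torch.randint(0, 50, (64,))
    labels = torch.randint(0, 2, (64,)).float()

    for epoch in range(10):
        optimizer.zero_grad()
        scores = model(users, items)
        loss = loss_fn(scores, labels)
        loss.backward()
        optimizer.step()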
Making recommendations for known users
The forward method gives you a pointwise interaction score. If we now wish to recommend products for a known user, we can project the user’s latent representation $u \in \mathbb{R}^{k}$ against every item embedding at once, $s = V u$, and pass the scores through a sigmoid to obtain an interaction probability per product.
The resulting product probabilities can be ranked to produce a candidate set for recommendation.
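A sketch of that projection, reusing the model trained above (the user index and the top-5 cutoff are arbitrary choices):

    user_id = torch.tensor([3])                    # a known user
    user_vec = model.user_factors(user_id)         # (1, k)

    # Score every product at once: (1, k) @ (k, num_items) -> (1, num_items)
    scores = user_vec @ model.item_factors.weight.T
    probs = torch.sigmoid(scores).squeeze(0)

    top_probs, top_items = probs.topk(5)           # best 5 candidate products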
Cold-start recommendation
Often, we must make recommendations for unseen users. New users have no learned latent representation, so we instead treat the user as a vector of interactions over the product catalogue. In this “warm-start” case, only the product matrix $V$ is required. Let’s say you have a new user for whom you have some interaction data. The user’s interaction data is represented as an $n$-dimensional vector $r$, with a nonzero entry for each product the user has interacted with. Solving $V u \approx r$ in the least-squares sense yields an estimated embedding $\hat{u} = (V^\top V)^{-1} V^\top r$ for the new user.
Having done so, we can now project this embedding back into the product space, resulting in a score for each product: given a vector of user interactions $r$, the product scores are $s = V \hat{u}$, which can again be passed through a sigmoid and ranked.
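A sketch of this fold-in step, using the trained item factors from the model above (the interaction vector r is illustrative; torch.linalg.lstsq performs the least-squares solve):

    V = model.item_factors.weight.detach()   # (num_items, k)

    # New user's interaction vector over the product catalogue
    r = torch.zeros(50)
    r[[2, 7, 19]] = 1.0                      # products the new user interacted with

    # Least-squares estimate of the user's latent vector: V @ u ~= r
    u_hat = torch.linalg.lstsq(V, r.unsqueeze(1)).solution.squeeze(1)  # (k,)

    # Project back into product space and rank
    probs = torch.sigmoid(V @ u_hat)
    top_probs, top_items = probs.topk(5)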
In a completely cold-start scenario, where no user interaction data exists, one can average over the set of known user embeddings to obtain a generic user vector $\bar{u}$, which can then be scored against the product matrix as before.
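A minimal sketch of that fallback, under the same assumptions as above:

    # Fallback for a user with no interactions: the mean of all learned user embeddings
    u_avg = model.user_factors.weight.detach().mean(dim=0)   # (k,)
    probs = torch.sigmoid(model.item_factors.weight.detach() @ u_avg)
    top_probs, top_items = probs.topk(5)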