Rethinking recommendation systems as a generative problem
Instead of searching the entire catalog, predict the next item the user will interact with.
Classic recommendation systems require searching entire catalogs to match users to items. As the item catalog grows, retrieving the right items becomes increasingly expensive.
In a new paper, researchers at Meta propose a generative approach to recommendation that keeps the cost of retrieval constant regardless of catalog size.
The standard recommendation approach relies on dense retrieval. First, the embedding of every item (products, documents, etc.) must be computed and stored. At the recommendation stage, the system searches the embedding store to find the item(s) whose embedding is most similar to that of the user. The problem with this approach is that every item embedding must be stored, and every recommendation operation requires comparing the user embedding against the entire item store.
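To make the cost concrete, here is a minimal sketch of dense retrieval over a toy catalog of random embeddings (all names and sizes are illustrative, not from the paper). Note that the scoring step touches every item in the store:

```python
import numpy as np

# Toy dense retrieval: item embeddings are precomputed and stored,
# and each query scans the whole store (hypothetical 4-item catalog).
rng = np.random.default_rng(0)
item_embeddings = rng.normal(size=(4, 8))  # one row per catalog item

def dense_retrieve(user_embedding, store, k=2):
    """Return indices of the k items most similar to the user (cosine)."""
    store_norm = store / np.linalg.norm(store, axis=1, keepdims=True)
    user_norm = user_embedding / np.linalg.norm(user_embedding)
    scores = store_norm @ user_norm      # one comparison per item
    return np.argsort(scores)[::-1][:k]  # cost grows with catalog size

user = rng.normal(size=8)
top_items = dense_retrieve(user, item_embeddings)
```

Because `scores` is computed against every row of the store, the per-query work scales linearly with the catalog (approximate nearest-neighbor indexes reduce but do not eliminate this dependence).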
In contrast, generative retrieval reformulates recommendations as predicting the next item in a sequence. For example, the model is given a list of products that the user has purchased and predicts the next product that the user will purchase. But how does the model predict the next item without looking at a vector store?
The key to making generative retrieval work is to compute “semantic IDs” (SIDs). These are unique identifiers that contain contextual information about each item. In generative retrieval, an encoder model is trained to take in the properties of an item and create a unique embedding value. These embedding values become the SIDs and are stored along with the item in the items catalog.
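In practice, a SID is often a short tuple of discrete codes rather than a raw embedding. The sketch below assigns a SID by residual quantization against fixed random codebooks; this is a simplified stand-in for the trained encoder described above (real systems learn the codebooks, e.g. with an RQ-VAE-style model), and all shapes are illustrative:

```python
import numpy as np

# Simplified semantic-ID assignment: quantize an item embedding against
# fixed codebooks (hypothetical; real systems learn these codebooks).
# The SID is the tuple of chosen codebook indices.
rng = np.random.default_rng(1)
codebooks = rng.normal(size=(3, 16, 8))  # 3 levels, 16 codes each, dim 8

def to_semantic_id(item_embedding, books):
    residual = item_embedding.copy()
    sid = []
    for book in books:  # residual quantization, one index per level
        idx = np.argmin(np.linalg.norm(book - residual, axis=1))
        sid.append(int(idx))
        residual = residual - book[idx]  # quantize what is left over
    return tuple(sid)

sid = to_semantic_id(rng.normal(size=8), codebooks)
```

A tuple like `(3, 14, 7)` then serves as the item's token sequence in the generative model's vocabulary, with similar items mapping to overlapping codes.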
In the second stage, a Transformer model is trained through classic next-token prediction to take in a sequence of SIDs and predict the next one. For example, in a product recommendation system, the model is trained on the SIDs of a user’s purchase history.
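The training setup is the same as for language models: each prefix of a user's SID history becomes a context and the following SID becomes the target. A minimal sketch of that data preparation, with hypothetical toy histories (the Transformer itself is omitted):

```python
# Training data for next-SID prediction: slide over each user's purchase
# history (already mapped to semantic IDs) to produce (context, target)
# pairs, exactly as in classic next-token prediction. Histories are toy.
histories = [
    [(3, 1), (7, 0), (2, 5)],          # user A's purchases as SIDs
    [(7, 0), (2, 5), (4, 4), (3, 1)],  # user B
]

def make_training_pairs(seqs):
    pairs = []
    for seq in seqs:
        for t in range(1, len(seq)):
            pairs.append((seq[:t], seq[t]))  # predict item t from prefix
    return pairs

pairs = make_training_pairs(histories)
# e.g. the first pair is ([(3, 1)], (7, 0)):
# given user A's first purchase, predict the second
```

At inference time the model simply decodes the next SID from the user's history, with no lookup against an item store.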
The key advantage of generative retrieval is that it does not require a separate vector store for item embeddings. More importantly, the retrieval operation does not change regardless of the size of the item catalog because the Transformer always performs the same operation.
Generative retrieval is not without limitations. It tends to overfit to its training corpus and has difficulty handling the "cold start problem": recommending items that were added after training, or understanding the preferences of new users who don't have a purchase history.
To address these shortcomings, Meta has developed LIGER, a hybrid recommendation system that combines the advantages of generative and dense retrieval.
LIGER uses generative retrieval to select several SIDs for recommendation. It then uses dense retrieval to supplement the initial recommendations with a few cold-start items. Finally, it ranks the items by comparing the cold-start and generative recommendations.
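The pipeline can be sketched as follows. This is a loose illustration of the three steps, not the paper's exact procedure: the function names, the toy embeddings, and the single dot-product ranking stage are all assumptions.

```python
import numpy as np

# Hypothetical sketch of a LIGER-style hybrid: take candidates from the
# generative model, add dense-retrieved cold-start items, then rank the
# union by similarity to the user embedding. Catalog and user are toy.
rng = np.random.default_rng(2)
embeddings = {i: rng.normal(size=8) for i in range(6)}  # toy catalog
user = rng.normal(size=8)

def hybrid_recommend(generative_ids, cold_start_ids, k=3):
    # union of candidates, preserving order and dropping duplicates
    candidates = list(dict.fromkeys(generative_ids + cold_start_ids))
    scores = {i: float(embeddings[i] @ user) for i in candidates}
    return sorted(candidates, key=scores.get, reverse=True)[:k]

# items 0, 2, 4 come from generative retrieval; item 5 is cold-start
recs = hybrid_recommend(generative_ids=[0, 2, 4], cold_start_ids=[5])
```

The key point is that the expensive dense scan is confined to a small cold-start pool, while the bulk of candidate generation stays constant-cost.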
The researchers note that “the fusion of dense and generative retrieval methods holds tremendous potential for advancing recommendation systems” and as the models evolve, “they will become increasingly practical for real-world applications, enabling more personalized and responsive user experiences.”
The efficiency of generative retrieval systems translates into immediate practical benefits, including reduced infrastructure costs and faster inference, making it useful for various applications including e-commerce recommendations and enterprise search.