Embeddings and latent factors
Magical vectors that blow my mind.
Let’s talk about embeddings, because they’re cool. The first time I came across embeddings was when I was building AskYC. For it, I converted the transcriptions of the videos on the YC YouTube channel into vectors called embeddings, using OpenAI’s API.
The magical thing here is that the distance between these vectors corresponds to the semantic difference between the sentences they represent. This is quite amazing. At that point, I didn’t really understand how any of this worked. Now, however, I think I have some intuition for it, thanks to this chapter of Deep Learning for Coders.
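That “distance tracks meaning” property is usually measured with cosine similarity. Here’s a minimal sketch of the arithmetic; the vectors below are made-up 4-dimensional stand-ins (real embeddings from an API have on the order of a thousand dimensions), so only the computation is the point:

```python
import numpy as np

def cosine_similarity(a, b):
    """Close to 1.0 means same direction (similar meaning); near 0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up "embeddings" for illustration; real ones come from a model.
cat = np.array([0.9, 0.1, 0.05, 0.0])
kitten = np.array([0.85, 0.15, 0.1, 0.0])
spaceship = np.array([0.0, 0.1, 0.9, 0.4])

print(cosine_similarity(cat, kitten))     # high: semantically similar
print(cosine_similarity(cat, spaceship))  # low: semantically unrelated
```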
An embedding is a mapping of a discrete variable to a point in a vector space. Let’s think about this in the context of collaborative filtering, a technique used to build recommender systems. At a high level, here’s how it works:
We have a dataset that contains users and their ratings for particular items (for the sake of this post, we’ll assume items are movies).
From this dataset, we try to learn which users are similar to each other and then use that to recommend new items to users.
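That definition of an embedding, a mapping from a discrete variable to a point in a vector space, is concretely just a table lookup: each discrete value indexes a row in a matrix of vectors. A tiny sketch with made-up numbers:

```python
import numpy as np

# The discrete variable: movie titles, mapped to row indices.
movie_ids = {"The Matrix": 0, "Toy Story": 1, "Heat": 2}

# One row per movie; each row is that movie's vector (values made up here,
# learned during training in a real system).
embedding_table = np.array([
    [0.9, 0.2],
    [0.1, 0.8],
    [0.7, 0.1],
])

def embed(title):
    """Map a discrete value (a movie title) to its point in vector space."""
    return embedding_table[movie_ids[title]]

print(embed("The Matrix"))  # the row for that movie: [0.9 0.2]
```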
In this case, we create a vector for each user and each item, and we get to choose how many values (dimensions) each vector has. The process of creating these vectors, i.e. training, is plain gradient descent:
Initialize vectors with random values.
Calculate a predicted rating for each user-item pair, typically as the dot product of the user vector and the item vector.
Use these predictions and the actual values to calculate a loss.
Use the gradient of this loss to improve the vectors.
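The four steps above can be sketched end-to-end in NumPy. This is a hand-rolled version on a made-up ratings matrix (the book’s chapter uses PyTorch/fastai); the predicted rating is assumed to be the dot product of a user vector and a movie vector:

```python
import numpy as np

# Made-up ratings: rows are users, columns are movies, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])
rated = ratings > 0                      # only real ratings contribute to the loss
n_users, n_movies, n_factors = 4, 4, 2   # we chose 2 values per vector

rng = np.random.default_rng(0)
# 1. Initialize vectors with random values.
users = rng.normal(scale=0.5, size=(n_users, n_factors))
movies = rng.normal(scale=0.5, size=(n_movies, n_factors))

lr = 0.01
for _ in range(2000):
    # 2. Predict every rating as the dot product user · movie.
    preds = users @ movies.T
    # 3. Mean squared error on the ratings we actually have.
    err = (preds - ratings) * rated
    loss = (err ** 2).sum() / rated.sum()
    # 4. Step both sets of vectors down the gradient of the loss.
    grad_users = err @ movies
    grad_movies = err.T @ users
    users -= lr * grad_users
    movies -= lr * grad_movies

print(round(loss, 4))  # the loss shrinks toward 0 as the vectors learn
```

After training, each row of `users` and `movies` is an embedding, and nothing in the loop told the model what the dimensions should mean.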
Each value in these vectors will eventually represent a particular latent factor. For example, the second value in each movie vector might be directly proportional to how much action there is in the movie. These latent factors are picked up by our model during the training process, which is amazing. At this point, we can calculate the distance between two movie vectors and it’s a good estimate of how similar the two movies are.
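To make the latent-factor idea concrete, here are hand-written movie vectors in which the second value plays the “amount of action” role. In a real model these values are learned, not chosen by hand; this only illustrates why distance between vectors estimates similarity:

```python
import numpy as np

# Hand-crafted vectors: [how sci-fi it is, how much action it has].
# A trained model would discover factors like these on its own.
movies = {
    "Die Hard": np.array([0.1, 0.9]),
    "Mad Max": np.array([0.3, 0.95]),
    "Before Sunrise": np.array([0.05, 0.05]),
}

def distance(a, b):
    """Euclidean distance: smaller means more similar movies."""
    return float(np.linalg.norm(movies[a] - movies[b]))

print(distance("Die Hard", "Mad Max"))         # small: both are action movies
print(distance("Die Hard", "Before Sunrise"))  # large: very different movies
```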
So, at a high level, this is how embeddings work. As a software engineer who’s mostly dealt with concrete problems and solutions, I find it a little hard to internalise that we know the process by which we create these embeddings, yet the latent factors just emerge automatically.