Netflix has done some of the most relevant work in ML-based recommendation systems.
A new paper, based on internal research on recommendation systems at Netflix, highlights the limits of using cosine similarity in measuring the proximity of objects.
Cosine similarity is the dot product of two vectors after each has been normalized to unit length. Normalizing discards the vectors' magnitudes, so only their directions are compared. How much does this affect similarity results?
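To make the effect of normalization concrete, here is a minimal sketch in plain Python. The vectors `a`, `b`, and `c` and the helper names are made up for illustration; they are not taken from the paper.

```python
import math

def dot_product(u, v):
    """Plain dot product; sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(u, v))

def cosine_similarity(u, v):
    """Dot product of u and v after scaling each to unit length.

    Assumes neither vector is all zeros (the norm would be 0).
    """
    norm_u = math.sqrt(dot_product(u, u))
    norm_v = math.sqrt(dot_product(v, v))
    return dot_product(u, v) / (norm_u * norm_v)

# Hypothetical embeddings: b points the same way as a but is 10x longer.
a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]
c = [3.0, 2.0, 1.0]

sim_ab = cosine_similarity(a, b)  # same direction: magnitude gap is invisible
sim_ac = cosine_similarity(a, c)  # different direction: lower score
dot_aa = dot_product(a, a)        # 14.0
dot_ab = dot_product(a, b)        # 140.0, ten times larger than dot_aa
```

Here `a` and `b` get a cosine similarity of essentially 1 even though `b` is ten times longer, while the unnormalized dot product still reflects that difference.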
The study suggests that in some applications, cosine similarity can yield arbitrary and therefore meaningless similarities, and that the results can be opaque and non-unique.
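One way such non-uniqueness can arise is sketched below, as an illustration under stated assumptions rather than as the paper's own derivation. Assume a model that scores user-item pairs only by the dot product of their embeddings. Multiplying each latent dimension of the user vectors by some factor and dividing the same dimension of the item vectors by that factor leaves every score unchanged, yet it changes the cosine similarities between users. All vectors and the scaling factors `d` are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def scale(v, factors):
    """Multiply each latent dimension of v by its own factor."""
    return [x * f for x, f in zip(v, factors)]

# Hypothetical 2-dimensional embeddings for two users and two items.
users = [[1.0, 1.0], [1.0, 3.0]]
items = [[2.0, 1.0], [1.0, 2.0]]

# Arbitrary per-dimension rescaling: users get d, items get 1/d.
d = [10.0, 1.0]
inv_d = [1.0 / f for f in d]
users_scaled = [scale(u, d) for u in users]
items_scaled = [scale(i, inv_d) for i in items]

# User-item scores are the same before and after rescaling.
scores = [[dot(u, i) for i in items] for u in users]
scores_scaled = [[dot(u, i) for i in items_scaled] for u in users_scaled]

# The cosine similarity between the two users is not.
cos_before = cosine(users[0], users[1])
cos_after = cosine(users_scaled[0], users_scaled[1])
```

Both sets of embeddings make identical predictions, so nothing in the scores alone picks one over the other, but they disagree about how similar the two users are.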
The researchers caution against “blindly using cosine-similarity” and suggest several remedies to get better proximity measures from embedding models.
Read the full analysis on TechTalks.
It sounds crazy. The standard Pearson's correlation coefficient is just the cosine of the angle between two mean-centered vectors, and it by no means implies causality.