Machine Learning in Java
上QQ阅读APP看书,第一时间看更新

Finding similar items

Many problems can be formulated as finding similar sets of elements, for example, customers who purchased similar products, web pages with similar content, images with similar objects, users who visited similar websites, and so on.

Two items are considered similar if they are a small distance apart. The main questions are how each item is represented and how the distance between the items is defined. There are two main classes of distance measures:

  • Euclidean distances
  • Non-Euclidean distances