5/30/2017

k-nearest neighbor classification

Today, I read some materials about the k-nearest neighbor (KNN) classification algorithm.

Summary:

In last Thursday's group meeting, I was asked about the number of iterations used for KNN in my paper. I remembered that I had not set one in my code. Today, I looked into it and read some materials online.
Actually, KNN does not need an iteration count. The reasons are as follows.
In a KNN model, we first set k, for example k=3. Then we select 80% of all data as the training set. "Training" only stores these labeled points; nothing is fitted or optimized. To classify a new point, we calculate the distance between it and every training point, take the 3 nearest ones, and assign the class that most of them belong to. As a result, there is no need to set an iteration count because we do not iterate, and there is no goal function to minimize. The procedure I first had in mind (select k points randomly, assign every other point to the nearest one as a cluster, and compare the sum of all distances) is actually k-means clustering. That is a different, unsupervised algorithm; it does iterate, and it is only guaranteed to reach a local minimum of the total distance, not the global one.
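To make the prediction step concrete, here is a minimal sketch of KNN classification in plain Python. It is only an illustration, not the code from my paper: the function name knn_predict, the Euclidean distance, the tie-breaking rule, and the toy data are all assumptions made up for this example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query to every stored training point (Euclidean, by assumption).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only and keep the k closest neighbors.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote; on a tie, the label seen first among the neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data, invented for illustration: two well-separated groups.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.1, 1.0)))  # query near the "a" group
print(knn_predict(train_points, train_labels, (5.1, 5.0)))  # query near the "b" group
```

Note that there is no loop over epochs and no stopping condition anywhere: all the work happens in a single pass over the stored points at prediction time.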
So what is the difference between ANN and KNN? I think one big difference is that every time you train an ANN model, the result will not be exactly the same, because training usually starts from random initial weights. For KNN, as long as you have the same k value and the same training data, you will obtain exactly the same model and the same predictions every time (apart from how ties between equally near neighbors are broken).
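The reproducibility point can be checked with a small sketch. The helper names knn_predict and init_weights, the seeds, and the toy data are hypothetical; init_weights only mimics the random starting weights of an ANN and is not a real network.

```python
import math
import random
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def init_weights(n_inputs, seed):
    """Stand-in for ANN initialization: random starting weights from a seeded generator."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]

# Toy data, invented for illustration.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9), (2.9, 3.2)]
labels = ["low", "low", "low", "high", "high", "high"]
queries = [(0.3, 0.3), (2.8, 3.0), (1.0, 1.0)]

# KNN: same k and same training data give the same predictions on every run.
run_1 = [knn_predict(points, labels, q) for q in queries]
run_2 = [knn_predict(points, labels, q) for q in queries]
print(run_1 == run_2)

# ANN-style initialization: different seeds give different starting weights,
# so two training runs begin from different places.
print(init_weights(2, seed=0) != init_weights(2, seed=1))
```

Fixing the seed makes the starting weights repeatable as well, which is the usual way to get comparable ANN runs.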

Tomorrow, I will continue reading for my research and discuss future work with you.


