This code provides an implementation of the k-Nearest Neighbors (k-NN) classification algorithm. The goal of the algorithm is to assign classes to unlabeled examples based on the classes of similar known examples.

- read_examples(file_name): Reads the file of classified examples, 'iris.data.csv', and returns a matrix containing these examples, excluding the header.
- read_unclassified(file_name): Reads the provided file of unlabeled examples and returns a matrix containing these examples.
- euclidean_distance(known_example, unknown_example): Calculates the Euclidean distance between a known example and an unlabeled example based on their numerical features.
- calculate_distances(classified_examples, unclassified_examples, k): Receives the matrices of classified and unclassified examples, along with a value k, and calculates the distances between each unclassified example and the classified examples. Returns a matrix containing the k nearest instances for each unclassified example, along with their respective classes.
- determine_class(nearest_instances): Receives the matrix generated by calculate_distances and determines the most frequent class among the k nearest instances for each unclassified example. In case of a tie, the class is selected at random. The function also writes the determined classes to the 'iris.data.csv' file.
- main(): Runs the program. It prompts the user for a value of k and the name of the file containing the unclassified examples, then calls the functions above to classify the examples and determine their classes.

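
The distance and nearest-neighbor steps described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: it assumes each classified row stores its class label as the last element and each unlabeled row holds only features, and the helper name k_nearest is hypothetical.

```python
import math


def euclidean_distance(known_example, unknown_example):
    """Euclidean distance over the numeric features.

    Assumption: the class label is the last element of known_example,
    so only the first len(unknown_example) values are compared.
    """
    n_features = len(unknown_example)
    return math.sqrt(sum(
        (float(known_example[i]) - float(unknown_example[i])) ** 2
        for i in range(n_features)
    ))


def k_nearest(classified_examples, unknown_example, k):
    """Return the k classified rows closest to unknown_example (hypothetical helper)."""
    ranked = sorted(
        classified_examples,
        key=lambda row: euclidean_distance(row, unknown_example),
    )
    return ranked[:k]


# Made-up rows with two features and a class label.
known = [
    [1.0, 1.0, "A"],
    [1.2, 0.8, "A"],
    [5.0, 5.0, "B"],
]
print(k_nearest(known, [1.1, 1.0], k=2))
```

Sorting every classified row is the simplest approach and is adequate for a dataset the size of Iris; larger datasets may call for a partial selection or a spatial index instead.
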
The code uses the math library for mathematical calculations and the choice function from the random module to perform random selection in case of a tie in class determination.
At the end, the determined classes for the unclassified examples are printed on the screen and added to the 'iris.data.csv' file.
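
One way the majority vote with random tie-breaking could look is sketched below. The function name majority_class and the use of collections.Counter are assumptions made for illustration; the script itself is only known to use choice from the random module.

```python
from collections import Counter
from random import choice


def majority_class(nearest_labels):
    """Return the most frequent label; ties are broken at random.

    nearest_labels is assumed to be a non-empty list holding the class
    labels of the k nearest instances of one unlabeled example.
    """
    counts = Counter(nearest_labels)
    top = max(counts.values())
    # Every label that reaches the highest count is a candidate.
    tied = [label for label, count in counts.items() if count == top]
    # With a single candidate, choice simply returns it.
    return choice(tied)


print(majority_class(["A", "A", "B"]))
```

Because ties are resolved randomly, repeated runs on the same input can produce different classes whenever a tie occurs. Choosing an odd k reduces ties in two-class problems but does not rule them out when there are three classes, as in the Iris data.
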

- Make sure you have the 'iris.data.csv' file in the same directory as the script.
- Execute the script.
- Enter the value of k when prompted.
- Enter the name of the file containing the unclassified examples when prompted.
- The determined classes for the unclassified examples will be displayed on the screen and added to the 'iris.data.csv' file.

Note: Make sure the unclassified examples file is properly formatted and contains the same number of features as the classified examples in the 'iris.data.csv' file.
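
A quick check along these lines can catch a mismatched file before classification. It is only a sketch: the function check_feature_counts is hypothetical and not part of the script, and it assumes the class label is the last column of each classified row.

```python
def check_feature_counts(classified_examples, unclassified_examples):
    """Raise ValueError if an unlabeled row has the wrong number of features.

    Assumption: each classified row ends with its class label, so it has
    one more column than an unlabeled row.
    """
    expected = len(classified_examples[0]) - 1
    for index, row in enumerate(unclassified_examples, start=1):
        if len(row) != expected:
            raise ValueError(
                f"row {index}: expected {expected} features, found {len(row)}"
            )
    return expected


# Illustrative rows only; real data would come from the two input files.
classified = [[5.1, 3.5, 1.4, 0.2, "Iris-setosa"]]
unclassified = [[5.0, 3.4, 1.5, 0.2]]
print(check_feature_counts(classified, unclassified))
```
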
Please feel free to modify the code according to your specific needs.