\documentclass[11pt]{article}
\usepackage{amsfonts}
\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage{wrapfig}
\usepackage[english]{babel}
\usepackage{listings}
\usepackage{indentfirst}
\usepackage{multirow}
\usepackage{subcaption}
\usepackage[a4paper, left=3cm, right=3cm, top=3cm]{geometry}
\graphicspath{{diagrams/}}
\begin{document}
\begin{center}
\Huge\textbf{Using Self-Organizing Maps to Solve the Travelling Salesman
Problem}\\
\vspace{1cm}
\large\textsc{Maximilian Leonard Kleinhans \& Diego Vicente Martín}
\end{center}
\section{System Overview}
The goal of this system is to use Self-Organizing Maps to solve the Travelling
Salesman Problem. To achieve that, we will create a population of neurons that
is spatially distributed on a ring structure and as the algorithm progresses,
neurons are pulled towards the cities. By defining a 1D neighborhood in this 2D
space, we will make sure that the population behaves as an elastic ring that
stretches until it occupies all the cities, while it tries to keep the perimeter
as short as possible. Once we achieve it, we only have to traverse the cities
following the ring to obtain a TSP solution.\\
The main component of the system is located in \texttt{som.py}. First we scale
down the data so that every city coordinate $c = (x, y)$ satisfies
$0 \le x, y \le 1$. This file also contains the main execution loop: in each
iteration we pick a random city and compute its best matching unit, that is,
the neuron closest to this city. That neuron, and its neighborhood, will be
updated following the formula:
$$
w_v(s+1) \; \leftarrow \;
w_v(s) + \alpha (s) \cdot \eta (u, v, s) \cdot (D(t) - w_v(s))
$$
where $s$ is the iteration in which the neuron $w_u$ has been chosen as the
winner for the city $D(t)$, and the formula updates all the neurons $w_v$
located in its neighborhood. The parameters in the formula are the learning
rate $\alpha(s)$ and the neighborhood function $\eta(u, v, s)$; note that both
of them decay over time. The weights associated with each neuron are
initialized to random values, since they are easily corrected during the first
iterations; there is no point in spending extra time or computation on more
elaborate ways to initialize the network.\\
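The update step can be sketched as follows. This is a minimal sketch for
illustration: the function name, the NumPy representation, and the choice of a
Gaussian neighborhood over the ring topology are assumptions, not necessarily
the exact implementation in \texttt{som.py}.

```python
import numpy as np

def update_network(network, city, alpha, radius):
    """Pull the winner neuron and its ring neighborhood towards a city.

    Sketch only: names and the Gaussian neighborhood are assumptions,
    not necessarily what som.py implements.
    """
    n = network.shape[0]
    # Best matching unit: index of the neuron closest to the city.
    winner = np.linalg.norm(network - city, axis=1).argmin()
    # 1D distance of every neuron to the winner on the circular ring.
    deltas = np.abs(np.arange(n) - winner)
    ring_dist = np.minimum(deltas, n - deltas)
    # Neighborhood function eta(u, v, s), here a Gaussian bump.
    eta = np.exp(-(ring_dist ** 2) / (2 * radius ** 2))
    # w_v(s+1) = w_v(s) + alpha(s) * eta(u, v, s) * (D(t) - w_v(s))
    return network + alpha * eta[:, np.newaxis] * (city - network)
```

Here \texttt{alpha} and \texttt{radius} are the already-decayed values of the
learning rate and neighborhood size for the current iteration.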
To implement this decay, we use different types of decay functions, found in
the \texttt{decay.py} file: static decay (which keeps the parameter at the
same value through all the iterations), linear decay (which decreases the
value of the parameter linearly), and exponential decay (which decreases it
exponentially). For convenience, these decay strategies have been implemented
as classes that can be passed to the computation methods.\\
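A sketch of how such decay classes might look (the class and attribute names
here are hypothetical, not necessarily those used in \texttt{decay.py}):

```python
class StaticDecay:
    """Keeps the parameter at its initial value in every iteration."""
    def __init__(self, initial):
        self.initial = initial
    def __call__(self, iteration):
        return self.initial

class LinearDecay:
    """Decreases the parameter linearly, clamped at zero."""
    def __init__(self, initial, rate):
        self.initial, self.rate = initial, rate
    def __call__(self, iteration):
        return max(self.initial - self.rate * iteration, 0.0)

class ExponentialDecay:
    """Multiplies the parameter by a constant factor < 1 each iteration."""
    def __init__(self, initial, factor):
        self.initial, self.factor = initial, factor
    def __call__(self, iteration):
        return self.initial * self.factor ** iteration
```

Because all three expose the same call interface, the main loop can evaluate
$\alpha(s)$ or the radius without knowing which decay strategy was chosen.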
The calculation of a TSP solution from the list of neurons is done as
described in chapter 4 of the compendium: for each city we select the closest
neuron, and then define the city order through the order of the selected
neurons along the ring.\\
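The extraction step can be sketched like this (a hypothetical sketch of that
procedure; the function name is an assumption):

```python
import numpy as np

def route_from_network(cities, network):
    """Build a TSP tour from the trained ring of neurons.

    Each city is assigned its closest neuron; visiting the cities in
    the order of their neurons along the ring yields the tour.
    Sketch only, not necessarily the exact code of the system.
    """
    closest = [np.linalg.norm(network - c, axis=1).argmin() for c in cities]
    # Sort city indices by the ring position of their closest neuron.
    return sorted(range(len(cities)), key=lambda i: closest[i])
```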
Also relevant to the system is that we have implemented two different
neighborhood functions: \texttt{bubble}, which is constant within the whole
neighborhood and 0 elsewhere, and \texttt{gaussian}, which creates a Gaussian
distribution around the winner. Both implementations can be found in the file
\texttt{neighborhood.py}.\\
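In essence, the two functions could look like this, under the assumption that
they receive the ring distances of all neurons to the winner (the real
signatures in \texttt{neighborhood.py} may differ):

```python
import numpy as np

def bubble(distances, radius):
    """Constant weight 1 for neurons within the radius, 0 elsewhere."""
    return (distances <= radius).astype(float)

def gaussian(distances, radius):
    """Gaussian weight centered on the winner, fading towards the edges."""
    return np.exp(-(distances ** 2) / (2 * radius ** 2))
```

The gradient of \texttt{gaussian} towards the edges is what moves the network
more smoothly, at the cost of evaluating an exponential per neuron.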
Apart from these computation files, we have also implemented helper methods in
\texttt{helper.py} for reading data sets from files and managing user input,
as well as methods for plotting and saving snapshots of the system at
different moments of the execution, in order to see how the system arrives at
a certain solution.\\
\section{Tuning the System}
Once the system was implemented, we still had to work with it to get good
results. Different combinations of parameters yield different results, and we
had to be thorough when testing them.\\
As a general conclusion, static decay is usually not a good approach: by
keeping the parameters at a fixed value through the whole execution, it is
much harder to make the system converge to a solution. It is still possible
to get decent results, but overall static decay is not worth using. If, on
the other hand, we use linear decay, we can see the system converging much
better to a certain solution: in the beginning the system tries different
strategies, and then focuses on fixing the small details of the chosen one.
Roughly the same behavior can be seen with exponential decay, but thanks to
its nature it is able to stabilize much faster than linear decay. That is
why, overall, the best decay strategy is exponential, although linear decay
can also yield great results (especially when applied to the learning rate
alone). Another conclusion is that linear decay performs worse as the number
of cities grows; in the case of the \texttt{uruguay} data set with 734
cities, we could not get decent results using linear decay.\\
Regarding neighborhood functions, the one that yields the best results is the
Gaussian function. Although bubble does not yield bad results, we can see how
Gaussian moves the network in a much smoother fashion, due to its gradient
towards the edges versus the 0 or 1 values of bubble. Nevertheless, bubble is
much faster to compute, and for a growing number of cities it might be more
suitable in terms of runtime.\\
Other parameters to be tuned are the initial values of the learning rate and
the neighborhood size, although these depend in great part on the data set
and the decay function we are going to use: static decay needs lower values,
while a decaying strategy allows us to use higher values at the beginning. We
also have to think about the size of the neuron population. As a general
rule, we need more neurons than cities so that we can model the path through
all the cities on the map. In our tests, the optimal population size for a
properly sized network is about 4--8 times the number of cities in the data
set.\\
\newpage
After this tuning, we concluded that the best parameters for each of the decay strategies were the following:
\begin{table}[ht]
\begin{center}
\begin{tabular}{| p{1.8cm}| c | c | c | c | c | c | c |}
\hline
Map & Decay & Neurons & $N$
& $\alpha_0$ (decay) & $r_0$ (decay)& TSP Dist.\\
\hline
\hline
\multirow{3}{*}{\textbf{\parbox{1.7cm}{Western Sahara}}}
& Static & 174 & 5000 & 0.3 & 5 & 34814.67 \\
& Linear & 232 & 500 & 0.7 (0.001) & 23 (0.04) & 27620.78 \\
& Exponential & 232 & 500 & 0.7 (0.999) & 23 (0.995)& 27601.17 \\
\hline
\multirow{3}{*}{\textbf{Djibouti}}
& Static & 356 & 5000 & 0.2 & 10 & 7692.15\\
& Linear & 712 & 5000 & 0.7 (0.0001)& 71 (0.014)& 6659.91 \\
& Exponential & 712 & 1000 & 0.7 (0.999)& 71 (0.995)& 6659.91 \\
\hline
\multirow{3}{*}{\textbf{Qatar}}
& Static & 1164 & 5000 & 0.4 & 50 & 14678.38 \\
& Linear & 1552 & 10000 & 0.7 (0.000065) & 155 (0.015) &10284.41 \\
& Exponential & 1552 & 5000 & 0.9 (0.9999)& 155 (0.997)& 10195.32 \\
\hline
\multirow{3}{*}{\textbf{Uruguay}}
& Static & 5872 & 10000 & 0.4 & 500 & 357201.05 \\
& Linear & 5872 & 10000 & 0.9 (0.000089)& 587 (0.055)& 141536.50 \\
& Exponential & 5872 & 10000 & 0.7 (0.9999)& 587 (0.999) & 86819.03 \\
\hline
\end{tabular}
\end{center}
\caption{Chosen parameters and results for TSP-SOM -- $N$ is the number of iterations, $\alpha_0$ the initial learning rate, $r_0$ the initial radius}
\label{tab:params}
\end{table}
Figures \ref{fig:sahara}, \ref{fig:djibouti}, \ref{fig:qatar} and \ref{fig:uruguay} show the TSP paths obtained with the different decay functions. The algorithm was executed with the parameters shown in table \ref{tab:params}. Red dots illustrate cities and green dots neuron positions. We can see that we consistently obtained the best results with exponential decay, followed by linear decay. Apparently, when not using exponential decay the algorithm is not able to refine the path at the end, resulting in inferior solutions. \\
Finally, figures \ref{fig:bsahara}, \ref{fig:bdjibouti}, \ref{fig:bqatar} and \ref{fig:buruguay} illustrate the best runs, showing the beginning, the middle, and the final result of each run.
\begin{figure}
\centering
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,
width=\linewidth]{w_s.png}
\caption{Static decay}
\label{fig:ws}
\end{subfigure}%
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,width=\linewidth]{w_l.png}
\caption{Linear decay}
\label{fig:wl}
\end{subfigure}
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,width=\linewidth]{w_e.png}
\caption{Exponential decay}
\label{fig:we}
\end{subfigure}
\caption{Results on \texttt{western\_sahara}}
\label{fig:sahara}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,
width=\linewidth]{d_s.png}
\caption{Static decay}
\end{subfigure}%
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,width=\linewidth]{d_l.png}
\caption{Linear decay}
\end{subfigure}
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,width=\linewidth]{d_e.png}
\caption{Exponential decay}
\end{subfigure}
\caption{Results on \texttt{djibouti}}
\label{fig:djibouti}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,
width=\linewidth]{q_s.png}
\caption{Static decay}
\end{subfigure}%
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,width=\linewidth]{q_l.png}
\caption{Linear decay}
\end{subfigure}
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,width=\linewidth]{q_e.png}
\caption{Exponential decay}
\end{subfigure}
\caption{Results on \texttt{qatar}}
\label{fig:qatar}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,
width=\linewidth]{u_s.png}
\caption{Static decay}
\end{subfigure}%
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,width=\linewidth]{u_l.png}
\caption{Linear decay}
\end{subfigure}
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,width=\linewidth]{u_e.png}
\caption{Exponential decay}
\end{subfigure}
\caption{Results on \texttt{uruguay}}
\label{fig:uruguay}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,
width=\linewidth]{w_0.png}
\caption{Initial state}
\end{subfigure}%
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,width=\linewidth]{w_250.png}
\caption{After 250 iterations}
\end{subfigure}
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,width=\linewidth]{w_500.png}
\caption{After 500 iterations}
\end{subfigure}
\caption{Best results on \texttt{western\_sahara} with exponential decay}
\label{fig:bsahara}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 1cm 4cm 1cm}, clip=true,
width=\linewidth]{d_0.png}
\caption{Initial state}
\end{subfigure}%
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,width=\linewidth]{d_500.png}
\caption{After 500 iterations}
\end{subfigure}
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,width=\linewidth]{d_1000.png}
\caption{After 1000 iterations}
\end{subfigure}
\caption{Best results on \texttt{djibouti} with exponential decay}
\label{fig:bdjibouti}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={3cm 1.5cm 3cm 1.5cm}, clip=true,
width=\linewidth]{q_0.png}
\caption{Initial state}
\end{subfigure}%
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,width=\linewidth]{q_2500.png}
\caption{After 2500 iterations}
\end{subfigure}
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={6cm 2cm 6cm 2cm}, clip=true,width=\linewidth]{q_5000.png}
\caption{After 5000 iterations}
\end{subfigure}
\caption{Best results on \texttt{qatar} with exponential decay}
\label{fig:bqatar}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,
width=\linewidth]{u_0.png}
\caption{Initial state}
\end{subfigure}%
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,width=\linewidth]{u_5000.png}
\caption{After 5000 iterations}
\end{subfigure}
\begin{subfigure}{.33\textwidth}
\centering
\includegraphics[trim={4cm 2cm 4cm 2cm}, clip=true,width=\linewidth]{u_10000.png}
\caption{After 10000 iterations}
\end{subfigure}
\caption{Best results on \texttt{uruguay} with exponential decay}
\label{fig:buruguay}
\end{figure}
\end{document}