# TP3

## 72.27 - Sistemas de Inteligencia Artificial - 2nd term 2022

### Instituto Tecnológico de Buenos Aires (ITBA)

## Authors

- [Sicardi, Julián Nicolas](https://github.com/Jsicardi) - Student ID 60347
- [Quintairos, Juan Ignacio](https://github.com/juaniq99) - Student ID 59715
- [Zavalia Pángaro, Salustiano Jose](https://github.com/szavalia) - Student ID 60312

## Index
- [Authors](#authors)
- [Index](#index)
- [Description](#description)
- [Requirements](#requirements)
- [Execution](#execution)
- [Initial configuration](#initial-configuration)
- [Parameters](#parameters)
- [Running the project](#running-the-project)
- [Example configurations](#example-configurations)

## Description

The project was developed in Python and centers on the implementation of the simple perceptron algorithm (in its step, linear and non-linear variants) and of the multilayer perceptron. The goal is to use these perceptrons to solve specific problems, listed below (a short illustrative sketch of the step variant follows the list):
- Step perceptron:
  - XOR
  - AND
- Linear and non-linear perceptron:
  - Function approximation based on the `training_ej2.txt` and `zeta_ej2.txt` datasets
- Multilayer perceptron:
  - XOR
  - Identifying whether a number is even, taking its image from the `training_ej3.txt` dataset (5x7 pixel images)
  - Identifying a digit, taking its image from the `training_ej3.txt` dataset (5x7 pixel images)
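
For reference, this is a minimal, self-contained sketch of the classic step-perceptron training rule applied to AND. Names and structure are illustrative only and are not taken from this repository's code (which does, however, use the same bias input of -1):

```python
import numpy as np

def train_step_perceptron(X, y, learning_rate=0.1, max_epochs=100):
    # Prepend a bias input of -1 to every entry
    X = np.insert(X, 0, -1, axis=1)
    w = np.random.randn(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for xi, target in zip(X, y):
            output = 1 if np.dot(w, xi) >= 0 else -1
            if output != target:
                # Classic perceptron rule: w <- w + eta * (target - output) * x
                w = w + learning_rate * (target - output) * xi
                errors += 1
        if errors == 0:  # converged: every entry classified correctly
            break
    return w

# Logical AND with inputs and outputs in {-1, 1}
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
y = np.array([-1, -1, -1, 1])
w = train_step_perceptron(X, y)
```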

## Requirements

- Python 3
- NumPy
- SciPy

## Execution
### Initial configuration

Once the project has been cloned into the folder of your choice, the initial parameters are configured through the `config.json` file. This file supports a number of parameters, discussed below.

#### Parameters
- "perceptron_type": type of perceptron to use:
  - "step": uses the simple step perceptron
  - "linear": uses the simple linear perceptron
  - "non_linear": uses the simple non-linear perceptron with the specified sigmoid (see "sigmoid_type")
  - "multilayer": uses the multilayer perceptron with the specified layers (see "hidden_layers") and the specified sigmoid (see "sigmoid_type")
- "hidden_layers": vector where each position specifies the number of neurons in that hidden layer (for example, [2,3] represents a network with 2 hidden layers, the first with 2 neurons and the second with 3). The output layer is defined by the problem to solve (see "problem"). If the multilayer perceptron is not used, this parameter is ignored
- "entry_file": path of the input dataset file, for problems that require one (see "problem")
- "output_file": path of the output dataset file, for problems that require one (see "problem")
- "problem": problem for the perceptron to solve:
  - "AND": logical AND problem (use with type "step")
  - "XOR": logical XOR problem (use with types "step" and "multilayer")
  - "odd_number": identifying whether a number is even or odd (use with type "multilayer")
  - "numbers": identifying which digit an image represents (use with type "multilayer")
- "learning_rate": learning rate used by the perceptron
- "max_epochs": maximum number of epochs, used as a stopping criterion. Defaults to 100 if not specified (see the loader sketch after this list)
- "min_error": error threshold used as a stopping criterion. Defaults to 0 if not specified (the run then stops on max epochs)
- "sigmoid_type": sigmoid function to use (use with types "multilayer" and "non_linear"):
  - "tanh": hyperbolic tangent
  - "logistic": logistic function
- "beta": beta parameter used by the sigmoid functions
- "alpha": parameter used as the momentum constant
- "cross_validate": whether to use cross-validation when solving the problem
- "test_proportion": when cross_validate is used, proportion of the set reserved for testing
- "softmax": whether to apply the softmax function on the output layer (use with type "multilayer")

### Running the project

To run the project, positioned at its base directory and with the initial parameters configured, simply execute:

```bash
$ python3 main.py
```
When it finishes, the program prints the initial execution parameters, along with the training error and the number of epochs iterated. If cross-validation is used, it prints the training error and the test error corresponding to the best weights found.

### Example configurations

If, for example, you want to solve the logical XOR problem using the simple step perceptron, with a learning rate of 0.1, an error bound of 1e-2 and a 500-epoch cutoff, the file looks like this:

```json
{
    "perceptron_type" : "step",
    "problem": "XOR",
    "learning_rate" : 0.1,
    "max_epochs" : 500,
    "min_error": 1e-2
}
```

Now suppose you want to solve the even-numbers problem with a multilayer perceptron that has two hidden layers (the first with 2 neurons, the second with 3), keeping the learning rate and error bound from the previous example, but now with 1000 epochs and cross-validation with a 20% test / 80% training split, and using the hyperbolic tangent function with a beta of 0.8. The file then looks like this:

```json
{
    "perceptron_type" : "multilayer",
    "hidden_layers": [2,3],
    "entry_file": "resources/training_ej3.txt",
    "problem": "odd_number",
    "learning_rate" : 0.1,
    "max_epochs" : 1000,
    "min_error": 1e-2,
    "sigmoid_type" : "tanh",
    "beta": 0.8,
    "cross_validate": "True",
    "test_proportion": 0.2
}
```
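
With "test_proportion" set to 0.2, the cross-validation routine splits the dataset into 1/0.2 = 5 segments, training on four of them and testing on the remaining one in each round.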

If you wanted to solve the same problem but now using momentum with an alpha of 0.8, it is enough to change the file as follows:

```json
{
    "perceptron_type" : "multilayer",
    "hidden_layers": [2,3],
    "entry_file": "resources/training_ej3.txt",
    "problem": "odd_number",
    "learning_rate" : 0.1,
    "max_epochs" : 1000,
    "min_error": 1e-2,
    "sigmoid_type" : "tanh",
    "beta": 0.8,
    "cross_validate": "True",
    "test_proportion": 0.2,
    "alpha": 0.8
}
```
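"""Multilayer perceptron routines: incremental backpropagation training,
error calculation, testing (including noise tests) and k-fold cross-validation."""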
import sys
import numpy as np
from scipy.special import softmax
import random
from models import Observables, Properties, Perceptron, Neuron, Layer, ThresholdNeuron
from algorithms.problems import generate_noise_test_set

def execute(properties:Properties):
    (perceptron, layers) = build_perceptron(properties)
    BIAS = -1
    # Add threshold (bias) input to every entry of the training set
    training_set = np.insert(properties.training_set, 0, BIAS, axis=1)

    error = sys.maxsize
    min_error = sys.maxsize
    min_w = []
    i = len(training_set)
    pos = 0
    epochs = -1
    indexes = []
    while error > perceptron.min_error and epochs < perceptron.max_epochs:
        # Reshuffle once the whole training set has been covered, so every
        # epoch visits each entry exactly once, in random order
        if i == len(training_set):
            epochs += 1
            indexes = random.sample(range(len(training_set)), len(training_set))
            i = 0

        pos = indexes[i]
        entry = training_set[pos]

        # Forward pass: calculate activations (and save them)
        activations = [entry]
        for (idx, layer) in enumerate(layers):
            activations.append(layer.get_activations(activations[idx]))

        # If softmax was requested then apply it on the output layer activations
        if Properties.softmax:
            activations[-1] = softmax(activations[-1]).tolist()

        # Backward pass: deltas for the output layer and the layer just below it
        deltas = []
        deltas.append(layers[-1].get_deltas(properties.output_set[pos], None, activations[-1], False, properties))
        deltas.insert(0, layers[-2].get_deltas(deltas[0], layers[-1], None, True))

        # Propagate deltas through the remaining hidden layers
        for (idx, layer) in reversed(list(enumerate(layers[:-2]))):
            deltas.insert(0, layer.get_deltas(deltas[0], layers[idx+1]))

        # Update all weights (incremental/online update)
        for (idx, layer) in enumerate(layers):
            isOutput = (idx == len(layers) - 1)
            layer.update_neurons(deltas[idx], activations[idx], isOutput)

        # Recalculate the error over the whole training set
        error = calculate_error(properties, training_set, properties.output_set, layers)
        i += 1
        if error < min_error:
            # Keep a copy of the best weights seen so far (skipping the
            # threshold neuron of hidden layers)
            min_error = error
            min_w = []
            for (idx, layer) in enumerate(layers):
                min_w.append([])
                if idx == len(layers) - 1:
                    for neuron in layer.neurons:
                        min_w[-1].append(neuron.w.copy())
                else:
                    for neuron in layer.neurons[1:]:
                        min_w[-1].append(neuron.w.copy())

    return Observables(min_w, min_error, epochs)

def calculate_error(properties:Properties, training_set, output_set, layers):
    # E = (1/2) * sum over entries i and output components k of
    # (expected_ik - scaled_activation_ik)^2
    error = 0
    for (i, entry) in enumerate(training_set):
        activations = [entry]
        for (j, layer) in enumerate(layers):
            activations.append(layer.get_activations(activations[j]))

        # If softmax was requested then apply it on the output layer activations
        if Properties.softmax:
            activations[-1] = softmax(activations[-1]).tolist()

        for (idx, output_value) in enumerate(output_set[i]):
            # Scale the activation from the sigmoid's image back to the output
            # set's range before comparing against the expected value
            scaled = properties.normalized_function(properties.sigmoid_max, properties.sigmoid_min,
                                                    properties.output_max, properties.output_min,
                                                    activations[-1][idx])
            error += (output_value - scaled)**2

    return error / 2

def build_perceptron(properties:Properties, test_w=None):
    perceptron:Perceptron = properties.perceptron
    BIAS = -1
    Neuron.function = perceptron.function
    Neuron.d_function = perceptron.d_function
    layers = []

    # Record the sigmoid's image and the output set's range, used later to
    # scale activations back to output values when computing errors
    if perceptron.sigmoid_type == "tanh" and not Properties.softmax:
        properties.sigmoid_max = 1
        properties.sigmoid_min = -1
    elif perceptron.sigmoid_type == "logistic" or Properties.softmax:
        properties.sigmoid_max = 1
        properties.sigmoid_min = 0
    properties.output_max = np.max(properties.output_set)
    properties.output_min = np.min(properties.output_set)

    # Add hidden layers
    for layer_index, neurons_count in enumerate(perceptron.neurons_per_layer):
        neurons = [ThresholdNeuron()]
        for index in range(0, neurons_count):
            if test_w is None:
                if layer_index == 0:
                    # First layer uses the length of an entry plus the bias
                    w = np.random.randn(len(properties.training_set[0]) + 1)
                else:
                    w = np.random.randn(len(layers[layer_index-1].neurons))
                w[0] = BIAS
            else:
                w = test_w[layer_index][index]
            neurons.append(Neuron(w, perceptron.learning_rate))
        layers.append(Layer(neurons))

    # Add output layer; its number of neurons depends on the number of
    # fields in the expected output values
    neurons = []
    for i in range(len(properties.output_set[0])):
        if test_w is None:
            w = np.random.randn(len(layers[-1].neurons))
            w[0] = BIAS
        else:
            w = test_w[-1][i]
        neurons.append(Neuron(w, perceptron.learning_rate))
    layers.append(Layer(neurons))

    return (perceptron, layers)

def get_results(properties:Properties, w):
    (perceptron, layers) = build_perceptron(properties, w)
    BIAS = -1
    # Add the same threshold (bias) input used during training
    input_set = np.insert(properties.training_set, 0, BIAS, axis=1)
    activations = []
    results = []
    error = 0
    for i, entry in enumerate(input_set):
        # Forward pass: calculate activations (and save them)
        activations.append(entry)
        for (idx, layer) in enumerate(layers):
            activations.append(layer.get_activations(activations[idx]))

        # If softmax was requested then apply it on the output layer activations
        if Properties.softmax:
            activations[-1] = softmax(activations[-1]).tolist()
        results.append(activations[-1])
        for (idx, output_value) in enumerate(properties.output_set[i]):
            scaled = properties.normalized_function(properties.sigmoid_max, properties.sigmoid_min,
                                                    properties.output_max, properties.output_min,
                                                    activations[-1][idx])
            error += (1/2) * (output_value - scaled)**2

        activations.clear()

    return (results, error)


def test(properties:Properties, w, metrics_function):
    (results, error) = get_results(properties, w)
    metrics = metrics_function(properties.output_set, results, properties.perceptron.problem)
    return (metrics, error)

def noise_test(properties:Properties, observables:Observables):
    # Sweep noise probabilities from 0% to 10% in 1% steps
    probabilities = np.arange(0.0, 0.11, 0.01)

    original_training_set = properties.training_set.copy()
    errors = []
    for prob in probabilities:
        if prob == 0:
            # No noise: reuse the error obtained in training
            errors.append(observables.training_error)
            continue
        noise_test_set = generate_noise_test_set(original_training_set, prob)
        properties.training_set = noise_test_set
        (results, error) = get_results(properties, observables.w)
        errors.append(error)
    # Restore the original set before returning
    properties.training_set = original_training_set
    return (errors, probabilities)

def cross_validate(properties:Properties):
    # Split input into chunks
    # ATTENTION! This product should be an integer in order not to lose entries
    segment_members = int(len(properties.training_set) * properties.test_proportion)
    segment_count = int(1 / properties.test_proportion)
    sets = np.array_split(properties.training_set, segment_count)

    max_accuracy = -1
    best_run = None
    original_input = properties.training_set.copy()
    original_output = properties.output_set.copy()

    for k in range(0, segment_count):
        # Build datasets by splitting into testing and training segments
        test_set = sets[k]
        test_output_set = properties.output_set[k*segment_members:(k+1)*segment_members]
        training_set = []
        training_output_set = []
        for i in range(0, segment_members*segment_count):
            if not (k*segment_members <= i < (k+1)*segment_members):
                training_set.append(properties.training_set[i])
                training_output_set.append(properties.output_set[i])

        # Train the neural network
        properties.training_set = training_set
        properties.output_set = training_output_set
        observables = execute(properties)

        # Test the neural network
        properties.training_set = test_set
        properties.output_set = test_output_set
        (observables.metrics, observables.test_error) = test(properties, observables.w, properties.metrics_function)

        # Update best run
        if observables.metrics[0].accuracy > max_accuracy:
            max_accuracy = observables.metrics[0].accuracy
            best_run = observables

        # Reset data
        properties.training_set = original_input
        properties.output_set = original_output

    return best_run
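
# Hypothetical usage (assumed, not taken from this repository's main.py):
# after building a Properties object from config.json, a driver would run either
#
#   best_run = cross_validate(properties)   # when "cross_validate" is set
#   observables = execute(properties)       # otherwise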