several improvements #80
base: master
Conversation
Before merging this, I think the three new activation functions should be documented in fann_data.h. Also, corresponding variants should be added to the C++ enum.
src/include/CMakeLists.txt (outdated)
@@ -1,4 +1,4 @@
 ########### install files ###############

-INSTALL_FILES( /include FILES fann.h doublefann.h fann_internal.h floatfann.h fann_data.h fixedfann.h compat_time.h fann_activation.h fann_cascade.h fann_error.h fann_train.h fann_io.h fann_cpp.h )
+INSTALL_FILES( /include/fann FILES fann.h doublefann.h fann_internal.h floatfann.h fann_data.h fixedfann.h fann_activation.h fann_cascade.h fann_error.h fann_train.h fann_io.h fann_cpp.h )
@bukka Removing compat_time.h seems fine, but do we really want headers to be installed in a "fann" subdirectory?
@bukka @andersfylling @steffennissen I've gone through the commits on this PR and added my comments where appropriate. Since I'm still new to this repository, I wouldn't mind another pair of eyes going over this. Feedback on this review would be great as well, if possible. (:
Overall, most of the changes seem alright to me, though they lack doc comments. I'm somewhat worried about commit a3863a70, which adds labelled weights to irpropm training, and commit cc684290, which adds irpropm training with a custom error function. They copy-paste a lot of code in order to make minuscule changes very deep in the call stack. That seems like a maintenance hazard to me. Is this fine to merge, or do we prefer a more general solution, maybe based on new options on the struct fann?
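For illustration, here is one hypothetical shape such a general solution could take; none of these names exist in the current code base, and this is only a sketch of the idea, not a proposal for this PR:

/* Hypothetical sketch only: instead of duplicating fann_train_epoch_* variants,
 * per-sample weights and a custom error callback could be optional settings
 * that default to NULL and are checked deep in the existing training code.
 * Neither this struct nor these field names exist in FANN today. */
struct fann_train_options
{
    /* optional per-sample weights; NULL means "weight every sample by 1.0" */
    fann_type *label_weight;
    /* optional custom error function; NULL means "use the built-in MSE" */
    fann_type (*error_function)(fann_type *desired, fann_type *actual,
                                int num_output, void *user_data);
    void *user_data;
};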
@@ -212,6 +212,7 @@ FANN_EXTERNAL void FANN_API fann_train_on_file(struct fann *ann, const char *fil
 	This function appears in FANN >= 1.2.0.
 */
 FANN_EXTERNAL float FANN_API fann_train_epoch(struct fann *ann, struct fann_train_data *data);
+FANN_EXTERNAL float FANN_API fann_train_epoch_lw(struct fann *ann, struct fann_train_data *data, fann_type* label_weight);
This function and the other fann_*_lw() functions should get documentation in the same style as the rest.
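As a non-binding sketch of what that could look like, following the NaturalDocs style used elsewhere in fann_train.h (wording is my reading of the change, and the version number is a placeholder):

/* Function: fann_train_epoch_lw

   Train one epoch with a set of training data, like <fann_train_epoch>, but
   scale each sample's contribution to the reported MSE by the corresponding
   entry of label_weight.

   See also:
   	<fann_train_epoch>

   This function appears in FANN >= ?.?.?
*/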
We likely also want to add train_epoch_lw() in some way to the FANN::neural_net class in fann_cpp.h.
		case FANN_TRAIN_RPROP:
			return fann_train_epoch_irpropm_lw(ann, data, label_weight);
		default:
			printf("FANN : fann_train_epoch_lw not implemented with others algo\n");
I think it makes more sense to call fann_error((struct fann_error *) ann, FANN_E_CANT_USE_TRAIN_ALG); here and fall through to the return statement.
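Roughly, assuming the surrounding switch from the PR (a sketch of the suggestion, not the actual patch):

		default:
			fann_error((struct fann_error *) ann, FANN_E_CANT_USE_TRAIN_ALG);
			break; /* then control reaches the function's existing return statement */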
			neuron_diff2 = (float) (neuron_diff * neuron_diff);
#endif

			ann->MSE_value += neuron_diff2 * label_weight;
I've got no experience with the FIXEDFANN mode: do we have to divide label_weight by (float) ann->multiplier here?
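Purely to illustrate the question (this is an assumption about fixed-point scaling, not a confirmed fix):

#ifdef FIXEDFANN
			/* in fixed-point builds the outputs are scaled by ann->multiplier,
			 * so the extra label_weight factor might need the same compensation */
			ann->MSE_value += neuron_diff2 * label_weight / (float) ann->multiplier;
#endif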
src/include/fann_train.h (outdated)
@@ -213,6 +213,9 @@ FANN_EXTERNAL void FANN_API fann_train_on_file(struct fann *ann, const char *fil
 */
 FANN_EXTERNAL float FANN_API fann_train_epoch(struct fann *ann, struct fann_train_data *data);
 FANN_EXTERNAL float FANN_API fann_train_epoch_lw(struct fann *ann, struct fann_train_data *data, fann_type* label_weight);
+FANN_EXTERNAL float FANN_API fann_train_epoch_irpropm_gradient(struct fann *ann, struct fann_train_data *data, fann_type (*errorFunction)(fann_type*,fann_type*,int,void*),void*);
+FANN_EXTERNAL void FANN_API fann_compute_MSE_gradient(struct fann *, fann_type *, fann_type (*errorFunction)(fann_type*,fann_type*,int,void*), void*);
+FANN_EXTERNAL void FANN_API fann_backpropagate_MSE_firstlayer(struct fann *);
fann_train_epoch_irpropm_gradient() needs documentation. fann_compute_MSE_gradient() is an internal function and should not be mentioned in this header. fann_backpropagate_MSE_firstlayer() does not seem to serve any purpose at all. Do we really need it?
					last_layer_begin->activation_steepness, neuron_value,
					last_layer_begin->sum) * neuron_diff;
			//printf("DEBUG _mse_grad %lf %lf %lf %lf\n", *error_it, neuron_diff, last_layer_begin->activation_function, last_layer_begin->activation_steepness );
			//desired_output++;
These commented-out sections should likely be removed before merging the code.
@@ -443,6 +444,7 @@ struct fann *fann_create_from_fd(FILE * conf, const char *configuration_file)
 	fann_scanf("%u", "network_type", &tmpVal);
 	ann->network_type = (enum fann_nettype_enum)tmpVal;
 	fann_scanf("%f", "learning_momentum", &ann->learning_momentum);
+	fann_scanf("%f", "learning_l2_norm", &ann->learning_l2_norm);
Same as for fann_save_internal_fd() further up.
@@ -769,7 +769,8 @@ FANN_EXTERNAL float FANN_API fann_get_learning_momentum(struct fann *ann);
 	This function appears in FANN >= 2.0.0.
 */
 FANN_EXTERNAL void FANN_API fann_set_learning_momentum(struct fann *ann, float learning_momentum);

+FANN_EXTERNAL float FANN_API fann_get_learning_l2_norm(struct fann *ann);
+FANN_EXTERNAL void FANN_API fann_set_learning_l2_norm(struct fann *ann, float learning_l2_norm);
These functions are external and should be documented.
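For instance, something along these lines in the NaturalDocs style used elsewhere in this header (suggested wording only; the default value and version number are placeholders):

/* Function: fann_get_learning_l2_norm

   Return the coefficient of the L2 weight-decay term applied during training
   (see the naming discussion further down).  The default is presumably 0.0,
   i.e. no regularization.

   This function appears in FANN >= ?.?.?
*/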
@@ -491,6 +491,7 @@ struct fann

 	/* The learning momentum used for backpropagation algorithm. */
 	float learning_momentum;
+	float learning_l2_norm;
Is learning_l2_norm the best name for this parameter? It's actually not the L2 norm, but rather the weight of the L2 norm relative to the error function.
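To make that concrete (my reading of the change, not the author's wording): if the regularized objective is E_reg(w) = E(w) + (λ/2)·‖w‖₂², then ∂E_reg/∂w_i = ∂E/∂w_i + λ·w_i, which matches the extra term in the slope below. So learning_l2_norm plays the role of the coefficient λ (a weight-decay strength), not of the norm ‖w‖₂ itself; a name like l2_regularization or weight_decay might be closer to what it does.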
@@ -978,7 +978,7 @@ void fann_update_weights_irpropm(struct fann *ann, unsigned int first_weight, un
 	for(; i != past_end; i++)
 	{
 		prev_step = fann_max(prev_steps[i], (fann_type) 0.0001);	/* prev_step may not be zero because then the training will stop */
-		slope = train_slopes[i];
+		slope = train_slopes[i] - ann->learning_l2_norm * weights[i];
This only adds the L2 regularization to the irpropm algorithm. The same term should be added to the other fann_update_weights*() functions.
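As a sketch of the pattern only, assuming the other update functions read a per-weight slope in the same way (variable names copied from the irpropm loop above, not from their actual bodies):

		/* same decay term, applied wherever the accumulated slope for weight i is read */
		slope = train_slopes[i] - ann->learning_l2_norm * weights[i];
		/* ...then use slope in the quickprop / batch / sarprop update as before... */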
src/include/CMakeLists.txt (outdated)
	SET(PARALLEL_INCLUDES parallel_fann.h parallel_fann.hpp)
ENDIF(NOT OPENMP_FOUND OR DISABLE_PARALLEL_FANN)

install (FILES fann.h doublefann.h fann_internal.h floatfann.h fann_data.h fixedfann.h fann_activation.h fann_cascade.h fann_error.h fann_train.h fann_io.h fann_cpp.h fann_data_cpp.h fann_training_data_cpp.h ${PARALLEL_INCLUDES} DESTINATION ${INCLUDE_INSTALL_DIR})
For consistency's sake, install should be INSTALL here.
This INSTALL command was added in CMake 2.4, which I think is old enough for us to assume that everyone has it. The CMakeLists.txt file itself does not seem to require any minimum CMake version.
@troiganto Nice review! I agree that the docs should be updated and C++ headers added. I think it would also be good to have some tests (googletest) or at least some examples before it gets merged.
Affects issue #72
Is there anyone keeping this alive?
Hi @mrrstrat, I'm still here! I've been writing my doctoral thesis this past year and have been so swamped with it that I didn't really have time for side projects like this one. I also haven't heard from the other maintainers in a while, unfortunately … I'd really like to get back to FANN once this is over. (It should be soon. If you haven't heard from me in ~2 months, feel free to ping again.) Regarding this particular PR, it seems the submitter abandoned it after I requested some changes. I'm not particularly sure what the best process is here. I guess someone else has to incorporate the changes and resubmit the diff as a new PR?
I have worked with FANN since about 2004. A couple of years ago I manually put in some of the ReLU changes discussed, but did not run many comparative regression tests against other transfer function types. It's probably wishful thinking to rely 100% on a rectified linear function to completely solve a dying-gradient problem. Handling this in the past involved changing the network architecture, network and system parameters, sampling, scope, and training, and restructuring the problem and the desired solution. Indeterminate slope derivation can be a funny animal to corral :-). It looks like enough time has elapsed that another PR is needed to re-introduce the changes (the main branch and the changes are no longer contemporary with one another, but I might be wrong).
fix: rprop learning
new layer type (lecun, relu, leaky relu)
L2/L1 norm & regularization
possibility to define own error function
possibility to weight each sample