From 59fbebe72997b48377e12eff152dad62a658d3ee Mon Sep 17 00:00:00 2001 From: Sarattha-BAG Date: Tue, 23 Mar 2021 15:21:28 +0700 Subject: [PATCH 1/7] Delete unnecessary codes --- README.md | 56 +----- src/calculate_filtering_metrics.py | 8 +- src/classifier.py | 8 +- src/compare.py | 12 +- src/facenet.py | 83 ++++---- src/freeze_graph.py | 12 +- src/train_softmax.py | 68 +++---- src/validate_on_lfw.py | 16 +- tmp/align_dataset.m | 178 ----------------- tmp/detect_face_v1.m | 253 ------------------------ tmp/detect_face_v2.m | 288 --------------------------- util/plot_learning_curves.m | 300 ----------------------------- 12 files changed, 105 insertions(+), 1177 deletions(-) delete mode 100644 tmp/align_dataset.m delete mode 100644 tmp/detect_face_v1.m delete mode 100644 tmp/detect_face_v2.m delete mode 100644 util/plot_learning_curves.m diff --git a/README.md b/README.md index 9220d20e4..8e38aab05 100644 --- a/README.md +++ b/README.md @@ -1,55 +1 @@ -# Face Recognition using Tensorflow [![Build Status][travis-image]][travis] - -[travis-image]: http://travis-ci.org/davidsandberg/facenet.svg?branch=master -[travis]: http://travis-ci.org/davidsandberg/facenet - -This is a TensorFlow implementation of the face recognizer described in the paper -["FaceNet: A Unified Embedding for Face Recognition and Clustering"](http://arxiv.org/abs/1503.03832). The project also uses ideas from the paper ["Deep Face Recognition"](http://www.robots.ox.ac.uk/~vgg/publications/2015/Parkhi15/parkhi15.pdf) from the [Visual Geometry Group](http://www.robots.ox.ac.uk/~vgg/) at Oxford. - -## Compatibility -The code is tested using Tensorflow r1.7 under Ubuntu 14.04 with Python 2.7 and Python 3.5. The test cases can be found [here](https://github.com/davidsandberg/facenet/tree/master/test) and the results can be found [here](http://travis-ci.org/davidsandberg/facenet). - -## News -| Date | Update | -|----------|--------| -| 2018-04-10 | Added new models trained on Casia-WebFace and VGGFace2 (see below). Note that the models uses fixed image standardization (see [wiki](https://github.com/davidsandberg/facenet/wiki/Training-using-the-VGGFace2-dataset)). | -| 2018-03-31 | Added a new, more flexible input pipeline as well as a bunch of minor updates. | -| 2017-05-13 | Removed a bunch of older non-slim models. Moved the last bottleneck layer into the respective models. Corrected normalization of Center Loss. | -| 2017-05-06 | Added code to [train a classifier on your own images](https://github.com/davidsandberg/facenet/wiki/Train-a-classifier-on-own-images). Renamed facenet_train.py to train_tripletloss.py and facenet_train_classifier.py to train_softmax.py. | -| 2017-03-02 | Added pretrained models that generate 128-dimensional embeddings.| -| 2017-02-22 | Updated to Tensorflow r1.0. Added Continuous Integration using Travis-CI.| -| 2017-02-03 | Added models where only trainable variables has been stored in the checkpoint. These are therefore significantly smaller. | -| 2017-01-27 | Added a model trained on a subset of the MS-Celeb-1M dataset. The LFW accuracy of this model is around 0.994. | -| 2017‑01‑02 | Updated to run with Tensorflow r0.12. Not sure if it runs with older versions of Tensorflow though. 
| - -## Pre-trained models -| Model name | LFW accuracy | Training dataset | Architecture | -|-----------------|--------------|------------------|-------------| -| [20180408-102900](https://drive.google.com/open?id=1R77HmFADxe87GmoLwzfgMu_HY0IhcyBz) | 0.9905 | CASIA-WebFace | [Inception ResNet v1](https://github.com/davidsandberg/facenet/blob/master/src/models/inception_resnet_v1.py) | -| [20180402-114759](https://drive.google.com/open?id=1EXPBSXwTaqrSC0OhUdXNmKSh9qJUQ55-) | 0.9965 | VGGFace2 | [Inception ResNet v1](https://github.com/davidsandberg/facenet/blob/master/src/models/inception_resnet_v1.py) | - -NOTE: If you use any of the models, please do not forget to give proper credit to those providing the training dataset as well. - -## Inspiration -The code is heavily inspired by the [OpenFace](https://github.com/cmusatyalab/openface) implementation. - -## Training data -The [CASIA-WebFace](http://www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html) dataset has been used for training. This training set consists of total of 453 453 images over 10 575 identities after face detection. Some performance improvement has been seen if the dataset has been filtered before training. Some more information about how this was done will come later. -The best performing model has been trained on the [VGGFace2](https://www.robots.ox.ac.uk/~vgg/data/vgg_face2/) dataset consisting of ~3.3M faces and ~9000 classes. - -## Pre-processing - -### Face alignment using MTCNN -One problem with the above approach seems to be that the Dlib face detector misses some of the hard examples (partial occlusion, silhouettes, etc). This makes the training set too "easy" which causes the model to perform worse on other benchmarks. -To solve this, other face landmark detectors has been tested. One face landmark detector that has proven to work very well in this setting is the -[Multi-task CNN](https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html). A Matlab/Caffe implementation can be found [here](https://github.com/kpzhang93/MTCNN_face_detection_alignment) and this has been used for face alignment with very good results. A Python/Tensorflow implementation of MTCNN can be found [here](https://github.com/davidsandberg/facenet/tree/master/src/align). This implementation does not give identical results to the Matlab/Caffe implementation but the performance is very similar. - -## Running training -Currently, the best results are achieved by training the model using softmax loss. Details on how to train a model using softmax loss on the CASIA-WebFace dataset can be found on the page [Classifier training of Inception-ResNet-v1](https://github.com/davidsandberg/facenet/wiki/Classifier-training-of-inception-resnet-v1) and . - -## Pre-trained models -### Inception-ResNet-v1 model -A couple of pretrained models are provided. They are trained using softmax loss with the Inception-Resnet-v1 model. The datasets has been aligned using [MTCNN](https://github.com/davidsandberg/facenet/tree/master/src/align). - -## Performance -The accuracy on LFW for the model [20180402-114759](https://drive.google.com/open?id=1EXPBSXwTaqrSC0OhUdXNmKSh9qJUQ55-) is 0.99650+-0.00252. A description of how to run the test can be found on the page [Validate on LFW](https://github.com/davidsandberg/facenet/wiki/Validate-on-lfw). Note that the input images to the model need to be standardized using fixed image standardization (use the option `--use_fixed_image_standardization` when running e.g. `validate_on_lfw.py`). 
+# Face Recognition using Tensorflow v2 [![Build Status][travis-image]][travis] \ No newline at end of file diff --git a/src/calculate_filtering_metrics.py b/src/calculate_filtering_metrics.py index f60b9ae4d..9916fa726 100644 --- a/src/calculate_filtering_metrics.py +++ b/src/calculate_filtering_metrics.py @@ -54,15 +54,15 @@ def main(args): model_exp = os.path.expanduser(args.model_file) with gfile.FastGFile(model_exp,'rb') as f: - graph_def = tf.GraphDef() + graph_def = tf.compat.v1.GraphDef() graph_def.ParseFromString(f.read()) input_map={'input':image_batch, 'phase_train':False} tf.import_graph_def(graph_def, input_map=input_map, name='net') - embeddings = tf.get_default_graph().get_tensor_by_name("net/embeddings:0") + embeddings = tf.compat.v1.get_default_graph().get_tensor_by_name("net/embeddings:0") - with tf.Session() as sess: - tf.train.start_queue_runners(sess=sess) + with tf.compat.v1.Session() as sess: + tf.compat.v1.train.start_queue_runners(sess=sess) embedding_size = int(embeddings.get_shape()[1]) nrof_batches = int(math.ceil(nrof_images / args.batch_size)) diff --git a/src/classifier.py b/src/classifier.py index 749db4d6b..844b55e78 100644 --- a/src/classifier.py +++ b/src/classifier.py @@ -40,7 +40,7 @@ def main(args): with tf.Graph().as_default(): - with tf.Session() as sess: + with tf.compat.v1.Session() as sess: np.random.seed(seed=args.seed) @@ -69,9 +69,9 @@ def main(args): facenet.load_model(args.model) # Get input and output tensors - images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") - embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") - phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") + images_placeholder = tf.compat.v1.get_default_graph().get_tensor_by_name("input:0") + embeddings = tf.compat.v1.get_default_graph().get_tensor_by_name("embeddings:0") + phase_train_placeholder = tf.compat.v1.get_default_graph().get_tensor_by_name("phase_train:0") embedding_size = embeddings.get_shape()[1] # Run forward pass to calculate embeddings diff --git a/src/compare.py b/src/compare.py index bc53cc421..6b10f4e5d 100644 --- a/src/compare.py +++ b/src/compare.py @@ -41,15 +41,15 @@ def main(args): images = load_and_align_data(args.image_files, args.image_size, args.margin, args.gpu_memory_fraction) with tf.Graph().as_default(): - with tf.Session() as sess: + with tf.compat.v1.Session() as sess: # Load the model facenet.load_model(args.model) # Get input and output tensors - images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") - embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") - phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") + images_placeholder = tf.compat.v1.get_default_graph().get_tensor_by_name("input:0") + embeddings = tf.compat.v1.get_default_graph().get_tensor_by_name("embeddings:0") + phase_train_placeholder = tf.compat.v1.get_default_graph().get_tensor_by_name("phase_train:0") # Run forward pass to calculate embeddings feed_dict = { images_placeholder: images, phase_train_placeholder:False } @@ -84,8 +84,8 @@ def load_and_align_data(image_paths, image_size, margin, gpu_memory_fraction): print('Creating networks and loading parameters') with tf.Graph().as_default(): - gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction) - sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) + gpu_options = 
tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction)
+        sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
 
         with sess.as_default():
             pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None)
   
diff --git a/src/facenet.py b/src/facenet.py
index 0e056765a..d36ac4b55 100644
--- a/src/facenet.py
+++ b/src/facenet.py
@@ -52,12 +52,12 @@ def triplet_loss(anchor, positive, negative, alpha):
       Returns:
         the triplet loss according to the FaceNet paper as a float tensor.
     """
-    with tf.variable_scope('triplet_loss'):
-        pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), 1)
-        neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), 1)
+    with tf.compat.v1.variable_scope('triplet_loss'):
+        pos_dist = tf.reduce_sum(input_tensor=tf.square(tf.subtract(anchor, positive)), axis=1)
+        neg_dist = tf.reduce_sum(input_tensor=tf.square(tf.subtract(anchor, negative)), axis=1)
         
         basic_loss = tf.add(tf.subtract(pos_dist,neg_dist), alpha)
-        loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), 0)
+        loss = tf.reduce_mean(input_tensor=tf.maximum(basic_loss, 0.0), axis=0)
       
     return loss
   
@@ -66,14 +66,14 @@ def center_loss(features, label, alfa, nrof_classes):
        (http://ydwen.github.io/papers/WenECCV16.pdf)
     """
     nrof_features = features.get_shape()[1]
-    centers = tf.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32,
-        initializer=tf.constant_initializer(0), trainable=False)
+    centers = tf.compat.v1.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32, use_resource=False,
+        initializer=tf.compat.v1.constant_initializer(0), trainable=False)
     label = tf.reshape(label, [-1])
     centers_batch = tf.gather(centers, label)
     diff = (1 - alfa) * (centers_batch - features)
-    centers = tf.scatter_sub(centers, label, diff)
+    centers = tf.compat.v1.scatter_sub(centers, label, diff)
     with tf.control_dependencies([centers]):
-        loss = tf.reduce_mean(tf.square(features - centers_batch))
+        loss = tf.reduce_mean(input_tensor=tf.square(features - centers_batch))
     return loss, centers
 
 def get_image_paths_and_labels(dataset):
@@ -106,29 +106,30 @@ def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batc
         filenames, label, control = input_queue.dequeue()
         images = []
         for filename in tf.unstack(filenames):
-            file_contents = tf.read_file(filename)
+            file_contents = tf.io.read_file(filename)
             image = tf.image.decode_image(file_contents, 3)
-            image = tf.cond(get_control_flag(control[0], RANDOM_ROTATE),
-                            lambda:tf.py_func(random_rotate_image, [image], tf.uint8),
-                            lambda:tf.identity(image))
-            image = tf.cond(get_control_flag(control[0], RANDOM_CROP), 
-                            lambda:tf.random_crop(image, image_size + (3,)), 
-                            lambda:tf.image.resize_image_with_crop_or_pad(image, image_size[0], image_size[1]))
-            image = tf.cond(get_control_flag(control[0], RANDOM_FLIP),
-                            lambda:tf.image.random_flip_left_right(image),
-                            lambda:tf.identity(image))
-            image = tf.cond(get_control_flag(control[0], FIXED_STANDARDIZATION),
-                            lambda:(tf.cast(image, tf.float32) - 127.5)/128.0,
-                            lambda:tf.image.per_image_standardization(image))
-            image = tf.cond(get_control_flag(control[0], FLIP),
-                            lambda:tf.image.flip_left_right(image),
-                            lambda:tf.identity(image))
+            image = tf.cond(pred=get_control_flag(control[0], RANDOM_ROTATE),
+                            true_fn=lambda:tf.compat.v1.py_func(random_rotate_image, [image], tf.uint8),
+                            false_fn=lambda:tf.identity(image))
+            image = tf.cond(pred=get_control_flag(control[0], RANDOM_CROP), 
+                            true_fn=lambda:tf.image.random_crop(image, 
image_size + (3,)), + false_fn=lambda:tf.image.resize_with_crop_or_pad(image, image_size[0], image_size[1])) + image = tf.cond(pred=get_control_flag(control[0], RANDOM_FLIP), + true_fn=lambda:tf.image.random_flip_left_right(image), + false_fn=lambda:tf.identity(image)) + image = tf.cast(image, tf.float32) + image = tf.cond(pred=get_control_flag(control[0], FIXED_STANDARDIZATION), + true_fn=lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, + false_fn=lambda:tf.image.per_image_standardization(image)) + image = tf.cond(pred=get_control_flag(control[0], FLIP), + true_fn=lambda:tf.image.flip_left_right(image), + false_fn=lambda:tf.identity(image)) #pylint: disable=no-member image.set_shape(image_size + (3,)) images.append(image) images_and_labels_list.append([images, label]) - image_batch, label_batch = tf.train.batch_join( + image_batch, label_batch = tf.compat.v1.train.batch_join( images_and_labels_list, batch_size=batch_size_placeholder, shapes=[image_size + (3,), ()], enqueue_many=True, capacity=4 * nrof_preprocess_threads * 100, @@ -137,7 +138,7 @@ def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batc return image_batch, label_batch def get_control_flag(control, field): - return tf.equal(tf.mod(tf.floor_div(control, field), 2), 1) + return tf.equal(tf.math.mod(tf.math.floordiv(control, field), 2), 1) def _add_loss_summaries(total_loss): """Add summaries for losses. @@ -152,7 +153,7 @@ def _add_loss_summaries(total_loss): """ # Compute the moving average of all individual losses and the total loss. loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg') - losses = tf.get_collection('losses') + losses = tf.compat.v1.get_collection('losses') loss_averages_op = loss_averages.apply(losses + [total_loss]) # Attach a scalar summmary to all individual losses and the total loss; do the @@ -160,8 +161,8 @@ def _add_loss_summaries(total_loss): for l in losses + [total_loss]: # Name each loss as '(raw)' and name the moving average version of the loss # as the original loss name. - tf.summary.scalar(l.op.name +' (raw)', l) - tf.summary.scalar(l.op.name, loss_averages.average(l)) + tf.compat.v1.summary.scalar(l.op.name +' (raw)', l) + tf.compat.v1.summary.scalar(l.op.name, loss_averages.average(l)) return loss_averages_op @@ -172,15 +173,15 @@ def train(total_loss, global_step, optimizer, learning_rate, moving_average_deca # Compute gradients. 
with tf.control_dependencies([loss_averages_op]): if optimizer=='ADAGRAD': - opt = tf.train.AdagradOptimizer(learning_rate) + opt = tf.compat.v1.train.AdagradOptimizer(learning_rate) elif optimizer=='ADADELTA': - opt = tf.train.AdadeltaOptimizer(learning_rate, rho=0.9, epsilon=1e-6) + opt = tf.compat.v1.train.AdadeltaOptimizer(learning_rate, rho=0.9, epsilon=1e-6) elif optimizer=='ADAM': - opt = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999, epsilon=0.1) + opt = tf.compat.v1.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999, epsilon=0.1) elif optimizer=='RMSPROP': - opt = tf.train.RMSPropOptimizer(learning_rate, decay=0.9, momentum=0.9, epsilon=1.0) + opt = tf.compat.v1.train.RMSPropOptimizer(learning_rate, decay=0.9, momentum=0.9, epsilon=1.0) elif optimizer=='MOM': - opt = tf.train.MomentumOptimizer(learning_rate, 0.9, use_nesterov=True) + opt = tf.compat.v1.train.MomentumOptimizer(learning_rate, 0.9, use_nesterov=True) else: raise ValueError('Invalid optimization algorithm') @@ -191,19 +192,19 @@ def train(total_loss, global_step, optimizer, learning_rate, moving_average_deca # Add histograms for trainable variables. if log_histograms: - for var in tf.trainable_variables(): - tf.summary.histogram(var.op.name, var) + for var in tf.compat.v1.trainable_variables(): + tf.compat.v1.summary.histogram(var.op.name, var) # Add histograms for gradients. if log_histograms: for grad, var in grads: if grad is not None: - tf.summary.histogram(var.op.name + '/gradients', grad) + tf.compat.v1.summary.histogram(var.op.name + '/gradients', grad) # Track the moving averages of all trainable variables. variable_averages = tf.train.ExponentialMovingAverage( moving_average_decay, global_step) - variables_averages_op = variable_averages.apply(tf.trainable_variables()) + variables_averages_op = variable_averages.apply(tf.compat.v1.trainable_variables()) with tf.control_dependencies([apply_gradient_op, variables_averages_op]): train_op = tf.no_op(name='train') @@ -368,7 +369,7 @@ def load_model(model, input_map=None): if (os.path.isfile(model_exp)): print('Model filename: %s' % model_exp) with gfile.FastGFile(model_exp,'rb') as f: - graph_def = tf.GraphDef() + graph_def = tf.compat.v1.GraphDef() graph_def.ParseFromString(f.read()) tf.import_graph_def(graph_def, input_map=input_map, name='') else: @@ -378,8 +379,8 @@ def load_model(model, input_map=None): print('Metagraph file: %s' % meta_file) print('Checkpoint file: %s' % ckpt_file) - saver = tf.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map) - saver.restore(tf.get_default_session(), os.path.join(model_exp, ckpt_file)) + saver = tf.compat.v1.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map) + saver.restore(tf.compat.v1.get_default_session(), os.path.join(model_exp, ckpt_file)) def get_model_filenames(model_dir): files = os.listdir(model_dir) diff --git a/src/freeze_graph.py b/src/freeze_graph.py index 3584c186e..c59517df1 100644 --- a/src/freeze_graph.py +++ b/src/freeze_graph.py @@ -37,7 +37,7 @@ def main(args): with tf.Graph().as_default(): - with tf.Session() as sess: + with tf.compat.v1.Session() as sess: # Load the model metagraph and checkpoint print('Model directory: %s' % args.model_dir) meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.model_dir)) @@ -46,10 +46,10 @@ def main(args): print('Checkpoint file: %s' % ckpt_file) model_dir_exp = os.path.expanduser(args.model_dir) - saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, 
meta_file), clear_devices=True) - tf.get_default_session().run(tf.global_variables_initializer()) - tf.get_default_session().run(tf.local_variables_initializer()) - saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file)) + saver = tf.compat.v1.train.import_meta_graph(os.path.join(model_dir_exp, meta_file), clear_devices=True) + tf.compat.v1.get_default_session().run(tf.compat.v1.global_variables_initializer()) + tf.compat.v1.get_default_session().run(tf.compat.v1.local_variables_initializer()) + saver.restore(tf.compat.v1.get_default_session(), os.path.join(model_dir_exp, ckpt_file)) # Retrieve the protobuf graph definition and fix the batch norm nodes input_graph_def = sess.graph.as_graph_def() @@ -58,7 +58,7 @@ def main(args): output_graph_def = freeze_graph_def(sess, input_graph_def, 'embeddings,label_batch') # Serialize and dump the output graph to the filesystem - with tf.gfile.GFile(args.output_file, 'wb') as f: + with tf.io.gfile.GFile(args.output_file, 'wb') as f: f.write(output_graph_def.SerializeToString()) print("%d ops in the final graph: %s" % (len(output_graph_def.node), args.output_file)) diff --git a/src/train_softmax.py b/src/train_softmax.py index 6b0b28b58..2e4a7d928 100644 --- a/src/train_softmax.py +++ b/src/train_softmax.py @@ -39,7 +39,7 @@ import lfw import h5py import math -import tensorflow.contrib.slim as slim +# import tensorflow.contrib.slim as slim from tensorflow.python.ops import data_flow_ops from tensorflow.python.framework import ops from tensorflow.python.ops import array_ops @@ -95,7 +95,7 @@ def main(args): lfw_paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs) with tf.Graph().as_default(): - tf.set_random_seed(args.seed) + tf.compat.v1.set_random_seed(args.seed) global_step = tf.Variable(0, trainable=False) # Get a list of image paths and their labels @@ -107,17 +107,17 @@ def main(args): # Create a queue that produces indices into the image_list and label_list labels = ops.convert_to_tensor(label_list, dtype=tf.int32) range_size = array_ops.shape(labels)[0] - index_queue = tf.train.range_input_producer(range_size, num_epochs=None, + index_queue = tf.compat.v1.train.range_input_producer(range_size, num_epochs=None, shuffle=True, seed=None, capacity=32) index_dequeue_op = index_queue.dequeue_many(args.batch_size*args.epoch_size, 'index_dequeue') - learning_rate_placeholder = tf.placeholder(tf.float32, name='learning_rate') - batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size') - phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train') - image_paths_placeholder = tf.placeholder(tf.string, shape=(None,1), name='image_paths') - labels_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='labels') - control_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='control') + learning_rate_placeholder = tf.compat.v1.placeholder(tf.float32, name='learning_rate') + batch_size_placeholder = tf.compat.v1.placeholder(tf.int32, name='batch_size') + phase_train_placeholder = tf.compat.v1.placeholder(tf.bool, name='phase_train') + image_paths_placeholder = tf.compat.v1.placeholder(tf.string, shape=(None,1), name='image_paths') + labels_placeholder = tf.compat.v1.placeholder(tf.int32, shape=(None,1), name='labels') + control_placeholder = tf.compat.v1.placeholder(tf.int32, shape=(None,1), name='control') nrof_preprocess_threads = 4 input_queue = data_flow_ops.FIFOQueue(capacity=2000000, @@ -143,57 +143,57 @@ def main(args): prelogits, _ = network.inference(image_batch, 
args.keep_probability, 
            phase_train=phase_train_placeholder, bottleneck_layer_size=args.embedding_size, 
            weight_decay=args.weight_decay)
-        logits = slim.fully_connected(prelogits, len(train_set), activation_fn=None, 
-                weights_initializer=slim.initializers.xavier_initializer(), 
-                weights_regularizer=slim.l2_regularizer(args.weight_decay),
-                scope='Logits', reuse=False)
+        logits = tf.keras.layers.Dense(len(train_set), activation=None, 
+                kernel_initializer=tf.keras.initializers.glorot_uniform(), 
+                kernel_regularizer=tf.keras.regularizers.l2(0.5 * (args.weight_decay)),
+                name='Logits')(prelogits)
 
         embeddings = tf.nn.l2_normalize(prelogits, 1, 1e-10, name='embeddings')
 
         # Norm for the prelogits
         eps = 1e-4
-        prelogits_norm = tf.reduce_mean(tf.norm(tf.abs(prelogits)+eps, ord=args.prelogits_norm_p, axis=1))
-        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, prelogits_norm * args.prelogits_norm_loss_factor)
+        prelogits_norm = tf.reduce_mean(input_tensor=tf.norm(tensor=tf.abs(prelogits)+eps, ord=args.prelogits_norm_p, axis=1))
+        tf.compat.v1.add_to_collection(tf.compat.v1.GraphKeys.REGULARIZATION_LOSSES, prelogits_norm * args.prelogits_norm_loss_factor)
 
         # Add center loss
         prelogits_center_loss, _ = facenet.center_loss(prelogits, label_batch, args.center_loss_alfa, nrof_classes)
-        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, prelogits_center_loss * args.center_loss_factor)
+        tf.compat.v1.add_to_collection(tf.compat.v1.GraphKeys.REGULARIZATION_LOSSES, prelogits_center_loss * args.center_loss_factor)
 
-        learning_rate = tf.train.exponential_decay(learning_rate_placeholder, global_step,
+        learning_rate = tf.compat.v1.train.exponential_decay(learning_rate_placeholder, global_step,
             args.learning_rate_decay_epochs*args.epoch_size, args.learning_rate_decay_factor, staircase=True)
-        tf.summary.scalar('learning_rate', learning_rate)
+        tf.compat.v1.summary.scalar('learning_rate', learning_rate)
 
         # Calculate the average cross entropy loss across the batch
         cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
             labels=label_batch, logits=logits, name='cross_entropy_per_example')
-        cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
-        tf.add_to_collection('losses', cross_entropy_mean)
+        cross_entropy_mean = tf.reduce_mean(input_tensor=cross_entropy, name='cross_entropy')
+        tf.compat.v1.add_to_collection('losses', cross_entropy_mean)
         
-        correct_prediction = tf.cast(tf.equal(tf.argmax(logits, 1), tf.cast(label_batch, tf.int64)), tf.float32)
-        accuracy = tf.reduce_mean(correct_prediction)
+        correct_prediction = tf.cast(tf.equal(tf.argmax(input=logits, axis=1), tf.cast(label_batch, tf.int64)), tf.float32)
+        accuracy = tf.reduce_mean(input_tensor=correct_prediction)
         
         # Calculate the total losses
-        regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
+        regularization_losses = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.REGULARIZATION_LOSSES)
         total_loss = tf.add_n([cross_entropy_mean] + regularization_losses, name='total_loss')
 
         # Build a Graph that trains the model with one batch of examples and updates the model parameters
         train_op = facenet.train(total_loss, global_step, args.optimizer, 
-            learning_rate, args.moving_average_decay, tf.global_variables(), args.log_histograms)
+            learning_rate, args.moving_average_decay, tf.compat.v1.global_variables(), args.log_histograms)
         
         # Create a saver
-        saver = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)
+        saver = tf.compat.v1.train.Saver(tf.compat.v1.trainable_variables(), max_to_keep=3)
 
         # Build the summary operation 
based on the TF collection of Summaries.
-        summary_op = tf.summary.merge_all()
+        summary_op = tf.compat.v1.summary.merge_all()
 
         # Start running operations on the Graph.
-        gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction)
-        sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
-        sess.run(tf.global_variables_initializer())
-        sess.run(tf.local_variables_initializer())
-        summary_writer = tf.summary.FileWriter(log_dir, sess.graph)
+        gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction)
+        sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
+        sess.run(tf.compat.v1.global_variables_initializer())
+        sess.run(tf.compat.v1.local_variables_initializer())
+        summary_writer = tf.compat.v1.summary.FileWriter(log_dir, sess.graph)
         coord = tf.train.Coordinator()
-        tf.train.start_queue_runners(coord=coord, sess=sess)
+        tf.compat.v1.train.start_queue_runners(coord=coord, sess=sess)
 
         with sess.as_default():
 
@@ -347,7 +347,7 @@ def train(args, sess, epoch, image_list, label_list, index_dequeue_op, enqueue_o
             batch_number += 1
             train_time += duration
         # Add validation loss and accuracy to summary
-        summary = tf.Summary()
+        summary = tf.compat.v1.Summary()
         #pylint: disable=maybe-no-member
         summary.value.add(tag='time/total', simple_value=train_time)
         summary_writer.add_summary(summary, global_step=step_)
@@ -443,7 +443,7 @@ def evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phas
     print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
     lfw_time = time.time() - start_time
     # Add validation loss and accuracy to summary
-    summary = tf.Summary()
+    summary = tf.compat.v1.Summary()
     #pylint: disable=maybe-no-member
     summary.value.add(tag='lfw/accuracy', simple_value=np.mean(accuracy))
     summary.value.add(tag='lfw/val_rate', simple_value=val)
@@ -470,7 +470,7 @@ def save_variables_and_metagraph(sess, saver, summary_writer, model_dir, model_n
     saver.export_meta_graph(metagraph_filename)
     save_time_metagraph = time.time() - start_time
     print('Metagraph saved in %.2f seconds' % save_time_metagraph)
-    summary = tf.Summary()
+    summary = tf.compat.v1.Summary()
     #pylint: disable=maybe-no-member
     summary.value.add(tag='time/save_variables', simple_value=save_time_variables)
     summary.value.add(tag='time/save_metagraph', simple_value=save_time_metagraph)
diff --git a/src/validate_on_lfw.py b/src/validate_on_lfw.py
index ac456c5f6..f3eea4427 100644
--- a/src/validate_on_lfw.py
+++ b/src/validate_on_lfw.py
@@ -45,7 +45,7 @@ def main(args):
 
     with tf.Graph().as_default():
       
-        with tf.Session() as sess:
+        with tf.compat.v1.Session() as sess:
             
             # Read the file containing the pairs used for testing
             pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
@@ -53,11 +53,11 @@ def main(args):
             # Get the paths for the corresponding images
             paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs)
             
-            image_paths_placeholder = tf.placeholder(tf.string, shape=(None,1), name='image_paths')
-            labels_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='labels')
-            batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size')
-            control_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='control')
-            phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train')
+            image_paths_placeholder = tf.compat.v1.placeholder(tf.string, shape=(None,1), name='image_paths')
+            labels_placeholder = 
tf.compat.v1.placeholder(tf.int32, shape=(None,1), name='labels') + batch_size_placeholder = tf.compat.v1.placeholder(tf.int32, name='batch_size') + control_placeholder = tf.compat.v1.placeholder(tf.int32, shape=(None,1), name='control') + phase_train_placeholder = tf.compat.v1.placeholder(tf.bool, name='phase_train') nrof_preprocess_threads = 4 image_size = (args.image_size, args.image_size) @@ -73,10 +73,10 @@ def main(args): facenet.load_model(args.model, input_map=input_map) # Get output tensor - embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") + embeddings = tf.compat.v1.get_default_graph().get_tensor_by_name("embeddings:0") # coord = tf.train.Coordinator() - tf.train.start_queue_runners(coord=coord, sess=sess) + tf.compat.v1.train.start_queue_runners(coord=coord, sess=sess) evaluate(sess, eval_enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, embeddings, label_batch, paths, actual_issame, args.lfw_batch_size, args.lfw_nrof_folds, args.distance_metric, args.subtract_mean, diff --git a/tmp/align_dataset.m b/tmp/align_dataset.m deleted file mode 100644 index 1e1fce089..000000000 --- a/tmp/align_dataset.m +++ /dev/null @@ -1,178 +0,0 @@ -# MIT License -# -# Copyright (c) 2016 David Sandberg -# -# Permission is hereby granted, free of charge, to any person obtaining a copy -# of this software and associated documentation files (the "Software"), to deal -# in the Software without restriction, including without limitation the rights -# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -# copies of the Software, and to permit persons to whom the Software is -# furnished to do so, subject to the following conditions: -# -# The above copyright notice and this permission notice shall be included in all -# copies or substantial portions of the Software. -# -# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -# SOFTWARE. 
- -% LFW -% source_path = '/home/david/datasets/lfw/raw'; -% target_path = '/home/david/datasets/lfw/lfw_mtcnnalign_160'; -% image_size = 160 + 0; -% margin = round(image_size*0.2) + 0; - -% FaceScrub -% source_path = '/home/david/datasets/facescrub/facescrub/'; -% target_path = '/home/david/datasets/facescrub/facescrub_mtcnnalign_182_160'; -% failed_images_list = '/home/david/datasets/facescrub/facescrub_mtcnnalign_182_160/failed_images.txt'; -% image_size = 160 + 12; -% margin = round(image_size*0.2) + 12; - -source_path = '/home/david/datasets/casia/CASIA-maxpy-clean/'; -target_path = '/home/david/datasets/casia/casia_maxpy_mtcnnalign_182_160'; -failed_images_list = '/home/david/datasets/casia/casia_maxpy_mtcnnalign_182_160/failed_images.txt'; -image_size = 160 + 12; -margin = round(image_size*0.2) + 12; - -image_extension = 'png'; -minsize=20; %minimum size of face -use_new = 0; - -caffe_path='/home/david/repo2/caffe/matlab'; -pdollar_toolbox_path='/home/david/repo2/toolbox'; -if use_new - caffe_model_path='/home/david/repo2/MTCNN_face_detection_alignment/code/codes/MTCNNv2/model'; -else - caffe_model_path='/home/david/repo2/MTCNN_face_detection_alignment/code/codes/MTCNNv1/model'; -end; -addpath(genpath(caffe_path)); -addpath(genpath(pdollar_toolbox_path)); - -caffe.set_mode_gpu(); -caffe.set_device(0); - -%three steps's threshold -threshold=[0.6 0.7 0.7]; - -%scale factor -factor=0.709; - -%load caffe models -if use_new - prototxt_dir = strcat(caffe_model_path,'/det4.prototxt'); - model_dir = strcat(caffe_model_path,'/det4.caffemodel'); -end; -%faces=cell(0); - -k = 0; -classes = dir(source_path); -%classes = classes(randperm(length(classes))); -for i=1:length(classes), - if classes(i).name(1)~='.' - source_class_path = sprintf('%s/%s', source_path, classes(i).name); - target_class_path = sprintf('%s/%s', target_path, classes(i).name); - imgs = dir(source_class_path); - %imgs = imgs(randperm(length(imgs))); - if ~exist(target_class_path, 'dir'), - mkdir(target_class_path); - end; - for j=1:length(imgs), - if imgs(j).isdir==0 - [pathstr,name,ext] = fileparts(imgs(j).name); - target_img_path = sprintf('%s/%s.%s', target_class_path, name, image_extension); - if ~exist(target_img_path,'file') && any([ strcmpi(ext,'.jpg') strcmpi(ext,'.jpeg') strcmpi(ext,'.png') strcmpi(ext,'.gif') ]) - if mod(k,1000)==0 - fprintf('Resetting GPU\n'); - caffe.reset_all(); - caffe.set_mode_gpu(); - caffe.set_device(0); - prototxt_dir = strcat(caffe_model_path,'/det1.prototxt'); - model_dir = strcat(caffe_model_path,'/det1.caffemodel'); - PNet=caffe.Net(prototxt_dir,model_dir,'test'); - prototxt_dir = strcat(caffe_model_path,'/det2.prototxt'); - model_dir = strcat(caffe_model_path,'/det2.caffemodel'); - RNet=caffe.Net(prototxt_dir,model_dir,'test'); - prototxt_dir = strcat(caffe_model_path,'/det3.prototxt'); - model_dir = strcat(caffe_model_path,'/det3.caffemodel'); - ONet=caffe.Net(prototxt_dir,model_dir,'test'); - if use_new - prototxt_dir = strcat(caffe_model_path,'/det4.prototxt'); - model_dir = strcat(caffe_model_path,'/det4.caffemodel'); - LNet=caffe.Net(prototxt_dir,model_dir,'test'); - end; - end; - - source_img_path = sprintf('%s/%s', source_class_path, imgs(j).name); - % source_img_path = '/home/david/datasets/facescrub/facescrub//Billy_Zane/095f83fefdf1dc493c013edb1ef860001193e8d9.jpg' - try - img = imread(source_img_path); - catch exception - fprintf('Unexpected error (%s): %s\n', exception.identifier, exception.message); - continue; - end; - fprintf('%6d: %s\n', k, source_img_path); - if 
length(size(img))<3 - img = repmat(img,[1,1,3]); - end; - img_size = size(img); % [height, width, channels] - img_size = fliplr(img_size(1:2)); % [x,y] - if use_new - [boundingboxes, points]=detect_face_v2(img,minsize,PNet,RNet,ONet,LNet,threshold,false,factor); - else - [boundingboxes, points]=detect_face_v1(img,minsize,PNet,RNet,ONet,threshold,false,factor); - end; - nrof_faces = size(boundingboxes,1); - det = boundingboxes; - if nrof_faces>0 - if nrof_faces>1 - % select the faces with the largest bounding box - % closest to the image center - bounding_box_size = (det(:,3)-det(:,1)).*(det(:,4)-det(:,2)); - img_center = img_size / 2; - offsets = [ (det(:,1)+det(:,3))/2 (det(:,2)+det(:,4))/2 ] - ones(nrof_faces,1)*img_center; - offset_dist_squared = sum(offsets.^2,2); - [a, index] = max(bounding_box_size-offset_dist_squared*2.0); % some extra weight on the centering - det = det(index,:); - points = points(:,index); - end; -% if nrof_faces>0 -% figure(1); clf; -% imshow(img); -% hold on; -% plot(points(1:5,1),points(6:10,1),'g.','MarkerSize',10); -% bb = round(det(1:4)); -% rectangle('Position',[bb(1) bb(2) bb(3)-bb(1) bb(4)-bb(2)],'LineWidth',2,'LineStyle','-') -% xxx = 1; -% end; - det(1) = max(det(1)-margin/2, 1); - det(2) = max(det(2)-margin/2, 1); - det(3) = min(det(3)+margin/2, img_size(1)); - det(4) = min(det(4)+margin/2, img_size(2)); - det(1:4) = round(det(1:4)); - - img = img(det(2):det(4),det(1):det(3),:); - img = imresize(img, [image_size, image_size]); - - imwrite(img, target_img_path); - k = k + 1; - else - fprintf('Detection failed: %s\n', source_img_path); - fid = fopen(failed_images_list,'at'); - if fid>=0 - fprintf(fid, '%s\n', source_img_path); - fclose(fid); - end; - end; - if mod(k,100)==0 - xxx = 1; - end; - end; - end; - end; - end; -end; diff --git a/tmp/detect_face_v1.m b/tmp/detect_face_v1.m deleted file mode 100644 index 4aeb66239..000000000 --- a/tmp/detect_face_v1.m +++ /dev/null @@ -1,253 +0,0 @@ -% MIT License -% -% Copyright (c) 2016 Kaipeng Zhang -% -% Permission is hereby granted, free of charge, to any person obtaining a copy -% of this software and associated documentation files (the "Software"), to deal -% in the Software without restriction, including without limitation the rights -% to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -% copies of the Software, and to permit persons to whom the Software is -% furnished to do so, subject to the following conditions: -% -% The above copyright notice and this permission notice shall be included in all -% copies or substantial portions of the Software. -% -% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -% AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -% LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -% OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -% SOFTWARE. 
- -function [total_boxes, points] = detect_face_v1(img,minsize,PNet,RNet,ONet,threshold,fastresize,factor) - %im: input image - %minsize: minimum of faces' size - %pnet, rnet, onet: caffemodel - %threshold: threshold=[th1 th2 th3], th1-3 are three steps's threshold - %fastresize: resize img from last scale (using in high-resolution images) if fastresize==true - factor_count=0; - total_boxes=[]; - points=[]; - h=size(img,1); - w=size(img,2); - minl=min([w h]); - img=single(img); - if fastresize - im_data=(single(img)-127.5)*0.0078125; - end - m=12/minsize; - minl=minl*m; - %creat scale pyramid - scales=[]; - while (minl>=12) - scales=[scales m*factor^(factor_count)]; - minl=minl*factor; - factor_count=factor_count+1; - end - %first stage - for j = 1:size(scales,2) - scale=scales(j); - hs=ceil(h*scale); - ws=ceil(w*scale); - if fastresize - im_data=imResample(im_data,[hs ws],'bilinear'); - else - im_data=(imResample(img,[hs ws],'bilinear')-127.5)*0.0078125; - end - PNet.blobs('data').reshape([hs ws 3 1]); - out=PNet.forward({im_data}); - boxes=generateBoundingBox(out{2}(:,:,2),out{1},scale,threshold(1)); - %inter-scale nms - pick=nms(boxes,0.5,'Union'); - boxes=boxes(pick,:); - if ~isempty(boxes) - total_boxes=[total_boxes;boxes]; - end - end - numbox=size(total_boxes,1); - if ~isempty(total_boxes) - pick=nms(total_boxes,0.7,'Union'); - total_boxes=total_boxes(pick,:); - regw=total_boxes(:,3)-total_boxes(:,1); - regh=total_boxes(:,4)-total_boxes(:,2); - total_boxes=[total_boxes(:,1)+total_boxes(:,6).*regw total_boxes(:,2)+total_boxes(:,7).*regh total_boxes(:,3)+total_boxes(:,8).*regw total_boxes(:,4)+total_boxes(:,9).*regh total_boxes(:,5)]; - total_boxes=rerec(total_boxes); - total_boxes(:,1:4)=fix(total_boxes(:,1:4)); - [dy edy dx edx y ey x ex tmpw tmph]=pad(total_boxes,w,h); - end - numbox=size(total_boxes,1); - if numbox>0 - %second stage - tempimg=zeros(24,24,3,numbox); - for k=1:numbox - tmp=zeros(tmph(k),tmpw(k),3); - tmp(dy(k):edy(k),dx(k):edx(k),:)=img(y(k):ey(k),x(k):ex(k),:); - if size(tmp,1)>0 && size(tmp,2)>0 || size(tmp,1)==0 && size(tmp,2)==0 - tempimg(:,:,:,k)=imResample(tmp,[24 24],'bilinear'); - else - total_boxes = []; - return; - end; - end - tempimg=(tempimg-127.5)*0.0078125; - RNet.blobs('data').reshape([24 24 3 numbox]); - out=RNet.forward({tempimg}); - score=squeeze(out{2}(2,:)); - pass=find(score>threshold(2)); - total_boxes=[total_boxes(pass,1:4) score(pass)']; - mv=out{1}(:,pass); - if size(total_boxes,1)>0 - pick=nms(total_boxes,0.7,'Union'); - total_boxes=total_boxes(pick,:); - total_boxes=bbreg(total_boxes,mv(:,pick)'); - total_boxes=rerec(total_boxes); - end - numbox=size(total_boxes,1); - if numbox>0 - %third stage - total_boxes=fix(total_boxes); - [dy edy dx edx y ey x ex tmpw tmph]=pad(total_boxes,w,h); - tempimg=zeros(48,48,3,numbox); - for k=1:numbox - tmp=zeros(tmph(k),tmpw(k),3); - tmp(dy(k):edy(k),dx(k):edx(k),:)=img(y(k):ey(k),x(k):ex(k),:); - if size(tmp,1)>0 && size(tmp,2)>0 || size(tmp,1)==0 && size(tmp,2)==0 - tempimg(:,:,:,k)=imResample(tmp,[48 48],'bilinear'); - else - total_boxes = []; - return; - end; - end - tempimg=(tempimg-127.5)*0.0078125; - ONet.blobs('data').reshape([48 48 3 numbox]); - out=ONet.forward({tempimg}); - score=squeeze(out{3}(2,:)); - points=out{2}; - pass=find(score>threshold(3)); - points=points(:,pass); - total_boxes=[total_boxes(pass,1:4) score(pass)']; - mv=out{1}(:,pass); - w=total_boxes(:,3)-total_boxes(:,1)+1; - h=total_boxes(:,4)-total_boxes(:,2)+1; - points(1:5,:)=repmat(w',[5 
1]).*points(1:5,:)+repmat(total_boxes(:,1)',[5 1])-1; - points(6:10,:)=repmat(h',[5 1]).*points(6:10,:)+repmat(total_boxes(:,2)',[5 1])-1; - if size(total_boxes,1)>0 - total_boxes=bbreg(total_boxes,mv(:,:)'); - pick=nms(total_boxes,0.7,'Min'); - total_boxes=total_boxes(pick,:); - points=points(:,pick); - end - end - end -end - -function [boundingbox] = bbreg(boundingbox,reg) - %calibrate bouding boxes - if size(reg,2)==1 - reg=reshape(reg,[size(reg,3) size(reg,4)])'; - end - w=[boundingbox(:,3)-boundingbox(:,1)]+1; - h=[boundingbox(:,4)-boundingbox(:,2)]+1; - boundingbox(:,1:4)=[boundingbox(:,1)+reg(:,1).*w boundingbox(:,2)+reg(:,2).*h boundingbox(:,3)+reg(:,3).*w boundingbox(:,4)+reg(:,4).*h]; -end - -function [boundingbox reg] = generateBoundingBox(map,reg,scale,t) - %use heatmap to generate bounding boxes - stride=2; - cellsize=12; - boundingbox=[]; - map=map'; - dx1=reg(:,:,1)'; - dy1=reg(:,:,2)'; - dx2=reg(:,:,3)'; - dy2=reg(:,:,4)'; - [y x]=find(map>=t); - a=find(map>=t); - if size(y,1)==1 - y=y';x=x';score=map(a)';dx1=dx1';dy1=dy1';dx2=dx2';dy2=dy2'; - else - score=map(a); - end - reg=[dx1(a) dy1(a) dx2(a) dy2(a)]; - if isempty(reg) - reg=reshape([],[0 3]); - end - boundingbox=[y x]; - boundingbox=[fix((stride*(boundingbox-1)+1)/scale) fix((stride*(boundingbox-1)+cellsize-1+1)/scale) score reg]; -end - -function pick = nms(boxes,threshold,type) - %NMS - if isempty(boxes) - pick = []; - return; - end - x1 = boxes(:,1); - y1 = boxes(:,2); - x2 = boxes(:,3); - y2 = boxes(:,4); - s = boxes(:,5); - area = (x2-x1+1) .* (y2-y1+1); - [vals, I] = sort(s); - pick = s*0; - counter = 1; - while ~isempty(I) - last = length(I); - i = I(last); - pick(counter) = i; - counter = counter + 1; - xx1 = max(x1(i), x1(I(1:last-1))); - yy1 = max(y1(i), y1(I(1:last-1))); - xx2 = min(x2(i), x2(I(1:last-1))); - yy2 = min(y2(i), y2(I(1:last-1))); - w = max(0.0, xx2-xx1+1); - h = max(0.0, yy2-yy1+1); - inter = w.*h; - if strcmp(type,'Min') - o = inter ./ min(area(i),area(I(1:last-1))); - else - o = inter ./ (area(i) + area(I(1:last-1)) - inter); - end - I = I(find(o<=threshold)); - end - pick = pick(1:(counter-1)); -end - -function [dy edy dx edx y ey x ex tmpw tmph] = pad(total_boxes,w,h) - %compute the padding coordinates (pad the bounding boxes to square) - tmpw=total_boxes(:,3)-total_boxes(:,1)+1; - tmph=total_boxes(:,4)-total_boxes(:,2)+1; - numbox=size(total_boxes,1); - - dx=ones(numbox,1);dy=ones(numbox,1); - edx=tmpw;edy=tmph; - - x=total_boxes(:,1);y=total_boxes(:,2); - ex=total_boxes(:,3);ey=total_boxes(:,4); - - tmp=find(ex>w); - edx(tmp)=-ex(tmp)+w+tmpw(tmp);ex(tmp)=w; - - tmp=find(ey>h); - edy(tmp)=-ey(tmp)+h+tmph(tmp);ey(tmp)=h; - - tmp=find(x<1); - dx(tmp)=2-x(tmp);x(tmp)=1; - - tmp=find(y<1); - dy(tmp)=2-y(tmp);y(tmp)=1; -end - -function [bboxA] = rerec(bboxA) - %convert bboxA to square - bboxB=bboxA(:,1:4); - h=bboxA(:,4)-bboxA(:,2); - w=bboxA(:,3)-bboxA(:,1); - l=max([w h]')'; - bboxA(:,1)=bboxA(:,1)+w.*0.5-l.*0.5; - bboxA(:,2)=bboxA(:,2)+h.*0.5-l.*0.5; - bboxA(:,3:4)=bboxA(:,1:2)+repmat(l,[1 2]); -end - - diff --git a/tmp/detect_face_v2.m b/tmp/detect_face_v2.m deleted file mode 100644 index 3ed07b17e..000000000 --- a/tmp/detect_face_v2.m +++ /dev/null @@ -1,288 +0,0 @@ -% MIT License -% -% Copyright (c) 2016 Kaipeng Zhang -% -% Permission is hereby granted, free of charge, to any person obtaining a copy -% of this software and associated documentation files (the "Software"), to deal -% in the Software without restriction, including without limitation the rights -% to use, copy, modify, 
merge, publish, distribute, sublicense, and/or sell -% copies of the Software, and to permit persons to whom the Software is -% furnished to do so, subject to the following conditions: -% -% The above copyright notice and this permission notice shall be included in all -% copies or substantial portions of the Software. -% -% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -% AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -% LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -% OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -% SOFTWARE. - -function [total_boxes, points] = detect_face_v2(img,minsize,PNet,RNet,ONet,LNet,threshold,fastresize,factor) - %im: input image - %minsize: minimum of faces' size - %pnet, rnet, onet: caffemodel - %threshold: threshold=[th1 th2 th3], th1-3 are three steps's threshold - %fastresize: resize img from last scale (using in high-resolution images) if fastresize==true - factor_count=0; - total_boxes=[]; - points=[]; - h=size(img,1); - w=size(img,2); - minl=min([w h]); - img=single(img); - if fastresize - im_data=(single(img)-127.5)*0.0078125; - end - m=12/minsize; - minl=minl*m; - %creat scale pyramid - scales=[]; - while (minl>=12) - scales=[scales m*factor^(factor_count)]; - minl=minl*factor; - factor_count=factor_count+1; - end - %first stage - for j = 1:size(scales,2) - scale=scales(j); - hs=ceil(h*scale); - ws=ceil(w*scale); - if fastresize - im_data=imResample(im_data,[hs ws],'bilinear'); - else - im_data=(imResample(img,[hs ws],'bilinear')-127.5)*0.0078125; - end - PNet.blobs('data').reshape([hs ws 3 1]); - out=PNet.forward({im_data}); - boxes=generateBoundingBox(out{2}(:,:,2),out{1},scale,threshold(1)); - %inter-scale nms - pick=nms(boxes,0.5,'Union'); - boxes=boxes(pick,:); - if ~isempty(boxes) - total_boxes=[total_boxes;boxes]; - end - end - numbox=size(total_boxes,1); - if ~isempty(total_boxes) - pick=nms(total_boxes,0.7,'Union'); - total_boxes=total_boxes(pick,:); - bbw=total_boxes(:,3)-total_boxes(:,1); - bbh=total_boxes(:,4)-total_boxes(:,2); - total_boxes=[total_boxes(:,1)+total_boxes(:,6).*bbw total_boxes(:,2)+total_boxes(:,7).*bbh total_boxes(:,3)+total_boxes(:,8).*bbw total_boxes(:,4)+total_boxes(:,9).*bbh total_boxes(:,5)]; - total_boxes=rerec(total_boxes); - total_boxes(:,1:4)=fix(total_boxes(:,1:4)); - [dy edy dx edx y ey x ex tmpw tmph]=pad(total_boxes,w,h); - end - numbox=size(total_boxes,1); - if numbox>0 - %second stage - tempimg=zeros(24,24,3,numbox); - for k=1:numbox - tmp=zeros(tmph(k),tmpw(k),3); - tmp(dy(k):edy(k),dx(k):edx(k),:)=img(y(k):ey(k),x(k):ex(k),:); - tempimg(:,:,:,k)=imResample(tmp,[24 24],'bilinear'); - end - tempimg=(tempimg-127.5)*0.0078125; - RNet.blobs('data').reshape([24 24 3 numbox]); - out=RNet.forward({tempimg}); - score=squeeze(out{2}(2,:)); - pass=find(score>threshold(2)); - total_boxes=[total_boxes(pass,1:4) score(pass)']; - mv=out{1}(:,pass); - if size(total_boxes,1)>0 - pick=nms(total_boxes,0.7,'Union'); - total_boxes=total_boxes(pick,:); - total_boxes=bbreg(total_boxes,mv(:,pick)'); - total_boxes=rerec(total_boxes); - end - numbox=size(total_boxes,1); - if numbox>0 - %third stage - total_boxes=fix(total_boxes); - [dy edy dx edx y ey x ex tmpw tmph]=pad(total_boxes,w,h); - tempimg=zeros(48,48,3,numbox); - for k=1:numbox - 
tmp=zeros(tmph(k),tmpw(k),3); - tmp(dy(k):edy(k),dx(k):edx(k),:)=img(y(k):ey(k),x(k):ex(k),:); - tempimg(:,:,:,k)=imResample(tmp,[48 48],'bilinear'); - end - tempimg=(tempimg-127.5)*0.0078125; - ONet.blobs('data').reshape([48 48 3 numbox]); - out=ONet.forward({tempimg}); - score=squeeze(out{3}(2,:)); - points=out{2}; - pass=find(score>threshold(3)); - points=points(:,pass); - total_boxes=[total_boxes(pass,1:4) score(pass)']; - mv=out{1}(:,pass); - bbw=total_boxes(:,3)-total_boxes(:,1)+1; - bbh=total_boxes(:,4)-total_boxes(:,2)+1; - points(1:5,:)=repmat(bbw',[5 1]).*points(1:5,:)+repmat(total_boxes(:,1)',[5 1])-1; - points(6:10,:)=repmat(bbh',[5 1]).*points(6:10,:)+repmat(total_boxes(:,2)',[5 1])-1; - if size(total_boxes,1)>0 - total_boxes=bbreg(total_boxes,mv(:,:)'); - pick=nms(total_boxes,0.7,'Min'); - total_boxes=total_boxes(pick,:); - points=points(:,pick); - end - end - numbox=size(total_boxes,1); - %extended stage - if numbox>0 - tempimg=zeros(24,24,15,numbox); - patchw=max([total_boxes(:,3)-total_boxes(:,1)+1 total_boxes(:,4)-total_boxes(:,2)+1]'); - patchw=fix(0.25*patchw); - tmp=find(mod(patchw,2)==1); - patchw(tmp)=patchw(tmp)+1; - pointx=ones(numbox,5); - pointy=ones(numbox,5); - for k=1:5 - tmp=[points(k,:);points(k+5,:)]; - x=fix(tmp(1,:)-0.5*patchw); - y=fix(tmp(2,:)-0.5*patchw); - [dy edy dx edx y ey x ex tmpw tmph]=pad([x' y' x'+patchw' y'+patchw'],w,h); - for j=1:numbox - tmpim=zeros(tmpw(j),tmpw(j),3); - tmpim(dy(j):edy(j),dx(j):edx(j),:)=img(y(j):ey(j),x(j):ex(j),:); - tempimg(:,:,(k-1)*3+1:(k-1)*3+3,j)=imResample(tmpim,[24 24],'bilinear'); - end - end - LNet.blobs('data').reshape([24 24 15 numbox]); - tempimg=(tempimg-127.5)*0.0078125; - out=LNet.forward({tempimg}); - score=squeeze(out{3}(2,:)); - for k=1:5 - tmp=[points(k,:);points(k+5,:)]; - %do not make a large movement - temp=find(abs(out{k}(1,:)-0.5)>0.35); - if ~isempty(temp) - l=length(temp); - out{k}(:,temp)=ones(2,l)*0.5; - end - temp=find(abs(out{k}(2,:)-0.5)>0.35); - if ~isempty(temp) - l=length(temp); - out{k}(:,temp)=ones(2,l)*0.5; - end - pointx(:,k)=(tmp(1,:)-0.5*patchw+out{k}(1,:).*patchw)'; - pointy(:,k)=(tmp(2,:)-0.5*patchw+out{k}(2,:).*patchw)'; - end - for j=1:numbox - points(:,j)=[pointx(j,:)';pointy(j,:)']; - end - end - end -end - -function [boundingbox] = bbreg(boundingbox,reg) - %calibrate bouding boxes - if size(reg,2)==1 - reg=reshape(reg,[size(reg,3) size(reg,4)])'; - end - w=[boundingbox(:,3)-boundingbox(:,1)]+1; - h=[boundingbox(:,4)-boundingbox(:,2)]+1; - boundingbox(:,1:4)=[boundingbox(:,1)+reg(:,1).*w boundingbox(:,2)+reg(:,2).*h boundingbox(:,3)+reg(:,3).*w boundingbox(:,4)+reg(:,4).*h]; -end - -function [boundingbox reg] = generateBoundingBox(map,reg,scale,t) - %use heatmap to generate bounding boxes - stride=2; - cellsize=12; - boundingbox=[]; - map=map'; - dx1=reg(:,:,1)'; - dy1=reg(:,:,2)'; - dx2=reg(:,:,3)'; - dy2=reg(:,:,4)'; - [y x]=find(map>=t); - a=find(map>=t); - if size(y,1)==1 - y=y';x=x';score=map(a)';dx1=dx1';dy1=dy1';dx2=dx2';dy2=dy2'; - else - score=map(a); - end - reg=[dx1(a) dy1(a) dx2(a) dy2(a)]; - if isempty(reg) - reg=reshape([],[0 3]); - end - boundingbox=[y x]; - boundingbox=[fix((stride*(boundingbox-1)+1)/scale) fix((stride*(boundingbox-1)+cellsize-1+1)/scale) score reg]; -end - -function pick = nms(boxes,threshold,type) - %NMS - if isempty(boxes) - pick = []; - return; - end - x1 = boxes(:,1); - y1 = boxes(:,2); - x2 = boxes(:,3); - y2 = boxes(:,4); - s = boxes(:,5); - area = (x2-x1+1) .* (y2-y1+1); - [vals, I] = sort(s); - pick = s*0; - counter = 1; - while 
~isempty(I) - last = length(I); - i = I(last); - pick(counter) = i; - counter = counter + 1; - xx1 = max(x1(i), x1(I(1:last-1))); - yy1 = max(y1(i), y1(I(1:last-1))); - xx2 = min(x2(i), x2(I(1:last-1))); - yy2 = min(y2(i), y2(I(1:last-1))); - w = max(0.0, xx2-xx1+1); - h = max(0.0, yy2-yy1+1); - inter = w.*h; - if strcmp(type,'Min') - o = inter ./ min(area(i),area(I(1:last-1))); - else - o = inter ./ (area(i) + area(I(1:last-1)) - inter); - end - I = I(find(o<=threshold)); - end - pick = pick(1:(counter-1)); -end - -function [dy edy dx edx y ey x ex tmpw tmph] = pad(total_boxes,w,h) - %compute the padding coordinates (pad the bounding boxes to square) - tmpw=total_boxes(:,3)-total_boxes(:,1)+1; - tmph=total_boxes(:,4)-total_boxes(:,2)+1; - numbox=size(total_boxes,1); - - dx=ones(numbox,1);dy=ones(numbox,1); - edx=tmpw;edy=tmph; - - x=total_boxes(:,1);y=total_boxes(:,2); - ex=total_boxes(:,3);ey=total_boxes(:,4); - - tmp=find(ex>w); - edx(tmp)=-ex(tmp)+w+tmpw(tmp);ex(tmp)=w; - - tmp=find(ey>h); - edy(tmp)=-ey(tmp)+h+tmph(tmp);ey(tmp)=h; - - tmp=find(x<1); - dx(tmp)=2-x(tmp);x(tmp)=1; - - tmp=find(y<1); - dy(tmp)=2-y(tmp);y(tmp)=1; -end - -function [bboxA] = rerec(bboxA) - %convert bboxA to square - bboxB=bboxA(:,1:4); - h=bboxA(:,4)-bboxA(:,2); - w=bboxA(:,3)-bboxA(:,1); - l=max([w h]')'; - bboxA(:,1)=bboxA(:,1)+w.*0.5-l.*0.5; - bboxA(:,2)=bboxA(:,2)+h.*0.5-l.*0.5; - bboxA(:,3:4)=bboxA(:,1:2)+repmat(l,[1 2]); -end - - diff --git a/util/plot_learning_curves.m b/util/plot_learning_curves.m deleted file mode 100644 index c0f24a344..000000000 --- a/util/plot_learning_curves.m +++ /dev/null @@ -1,300 +0,0 @@ -% Plots the lerning curves for the specified training runs from data in the -% file "lfw_result.txt" stored in the log directory for the respective -% model. - -% MIT License -% -% Copyright (c) 2016 David Sandberg -% -% Permission is hereby granted, free of charge, to any person obtaining a copy -% of this software and associated documentation files (the "Software"), to deal -% in the Software without restriction, including without limitation the rights -% to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -% copies of the Software, and to permit persons to whom the Software is -% furnished to do so, subject to the following conditions: -% -% The above copyright notice and this permission notice shall be included in all -% copies or substantial portions of the Software. -% -% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -% AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -% LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -% OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -% SOFTWARE. - -%% -addpath('/home/david/git/facenet/util/'); -log_dirs = { '/home/david/logs/facenet' }; -%% -res = { ... -{ '20180402-114759', 'vggface2, wd=5e-4, center crop, fixed image standardization' }, ... -}; - -%% -res = { ... -{ '20180408-102900', 'casia, wd=5e-4, pnlf=5e-4, fixed image standardization' }, ... 
-}; - -%% - -colors = {'b', 'g', 'r', 'c', 'm', 'y', 'k'}; -markers = {'.', 'o', 'x', '+', '*', 's', 'd' }; -lines = {'-', '-.', '--', ':' }; -fontSize = 6; -lineWidth = 2; -lineStyles = combineStyles(colors, markers); -lineStyles2 = combineStyles(colors, {''}, lines); -legends = cell(length(res),1); -legends_accuracy = cell(length(res),1); -legends_valrate = cell(length(res),1); -var = cell(length(res),1); -for i=1:length(res), - for k=1:length(log_dirs) - if exist(fullfile(log_dirs{k}, res{i}{1}), 'dir') - ld = log_dirs{k}; - end - end - filename = fullfile(ld, res{i}{1}, 'stat.h5'); - - var{i} = readlogs(filename,{'loss', 'reg_loss', 'xent_loss', 'lfw_accuracy', ... - 'lfw_valrate', 'val_loss', 'val_xent_loss', 'val_accuracy', ... - 'accuracy', 'prelogits_norm', 'learning_rate', 'center_loss', ... - 'prelogits_hist', 'accuracy'}); - var{i}.steps = 1:length(var{i}.loss); - epoch = find(var{i}.lfw_accuracy,1,'last'); - var{i}.epochs = 1:epoch; - legends{i} = sprintf('%s: %s', res{i}{1}, res{i}{2}); - start_epoch = max(1,epoch-10); - legends_accuracy{i} = sprintf('%s: %s (%.2f%%)', res{i}{1}, res{i}{2}, mean(var{i}.lfw_accuracy(start_epoch:epoch))*100 ); - legends_valrate{i} = sprintf('%s: %s (%.2f%%)', res{i}{1}, res{i}{2}, mean(var{i}.lfw_valrate(start_epoch:epoch))*100 ); - - arguments_filename = fullfile(ld, res{i}{1}, 'arguments.txt'); - if exist(arguments_filename) - str = fileread(arguments_filename); - var{i}.wd = getParameter(str, 'weight_decay', '0.0'); - var{i}.cl = getParameter(str, 'center_loss_factor', '0.0'); - var{i}.fixed_std = getParameter(str, 'use_fixed_image_standardization', '0'); - var{i}.data_dir = getParameter(str, 'data_dir', ''); - var{i}.lr = getParameter(str, 'learning_rate', '0.1'); - var{i}.epoch_size = str2double(getParameter(str, 'epoch_size', '1000')); - var{i}.batch_size = str2double(getParameter(str, 'batch_size', '90')); - var{i}.examples_per_epoch = var{i}.epoch_size*var{i}.batch_size; - var{i}.mnipc = getParameter(str, 'filter_min_nrof_images_per_class', '-1'); - var{i}.val_step = str2num(getParameter(str, 'validate_every_n_epochs', '10')); - var{i}.pnlf = getParameter(str, 'prelogits_norm_loss_factor', '-1'); - var{i}.emb_size = getParameter(str, 'embedding_size', '-1'); - - fprintf('%s: wd=%s lr=%s, pnlf=%s, data_dir=%s, emb_size=%s\n', ... 
- res{i}{1}, var{i}.wd, var{i}.lr, var{i}.pnlf, var{i}.data_dir, var{i}.emb_size); - end -end; - -timestr = datestr(now,'yyyymmdd_HHMMSS'); - -h = 1; figure(h); close(h); figure(h); hold on; setsize(1.5); -title('LFW accuracy'); -xlabel('Steps'); -ylabel('Accuracy'); -grid on; -N = 1; flt = ones(1,N)/N; -for i=1:length(var), - plot(var{i}.epochs*1000, filter(flt, 1, var{i}.lfw_accuracy(var{i}.epochs)), lineStyles2{i}, 'LineWidth', lineWidth); -end; -legend(legends_accuracy,'Location','SouthEast','FontSize',fontSize); -v=axis; -v(3:4) = [ 0.95 1.0 ]; -axis(v); -accuracy_file_name = sprintf('lfw_accuracy_%s',timestr); -%print(accuracy_file_name,'-dpng') - - -if 0 - %% - %h = 2; figure(h); close(h); figure(h); hold on; setsize(1.5); - h = 1; figure(h); hold on; - title('LFW validation rate'); - xlabel('Step'); - ylabel('VAL @ FAR = 10^{-3}'); - grid on; - for i=1:length(var), - plot(var{i}.epochs*1000, var{i}.lfw_valrate(var{i}.epochs), lineStyles{i}, 'LineWidth', lineWidth); - end; - legend(legends_valrate,'Location','SouthEast','FontSize',fontSize); - v=axis; - v(3:4) = [ 0.5 1.0 ]; - axis(v); - valrate_file_name = sprintf('lfw_valrate_%s',timestr); -% print(valrate_file_name,'-dpng') -end - -if 0 - %% Plot cross-entropy loss - h = 3; figure(h); close(h); figure(h); hold on; setsize(1.5); - title('Training/validation set cross-entropy loss'); - xlabel('Step'); - title('Training/validation set cross-entropy loss'); - grid on; - N = 500; flt = ones(1,N)/N; - for i=1:length(var), - var{i}.xent_loss(var{i}.xent_loss==0) = NaN; - plot(var{i}.steps, filter(flt, 1, var{i}.xent_loss), lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'NorthEast','FontSize',fontSize); - - % Plot cross-entropy loss on validation set - N = 1; flt = ones(1,N)/N; - for i=1:length(var), - v = var{i}.val_xent_loss; - val_steps = (1:length(v))*var{i}.val_step*1000; - v(v==0) = NaN; - plot(val_steps, filter(flt, 1, v), [ lineStyles2{i} '.' ], 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'NorthEast','FontSize',fontSize); - hold off - xent_file_name = sprintf('xent_%s',timestr); - %print(xent_file_name,'-dpng') -end - -if 0 - %% Plot accuracy on training set - h = 32; figure(h); clf; hold on; - title('Training/validation set accuracy'); - xlabel('Step'); - ylabel('Training/validation set accuracy'); - grid on; - N = 500; flt = ones(1,N)/N; - for i=1:length(var), - var{i}.accuracy(var{i}.accuracy==0) = NaN; - plot(var{i}.steps*1000, filter(flt, 1, var{i}.accuracy), lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'SouthEast','FontSize',fontSize); - - grid on; - N = 1; flt = ones(1,N)/N; - for i=1:length(var), - v = var{i}.val_accuracy; - val_steps = (1:length(v))*var{i}.val_step*1000; - v(v==0) = NaN; - plot(val_steps*1000, filter(flt, 1, v), [ lineStyles2{i} '.' 
], 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'SouthEast','FontSize',fontSize); - hold off - acc_file_name = sprintf('accuracy_%s',timestr); - %print(acc_file_name,'-dpng') -end - -if 0 - %% Plot prelogits CDF - h = 35; figure(h); clf; hold on; - title('Prelogits histogram'); - xlabel('Epoch'); - ylabel('Prelogits histogram'); - grid on; - N = 1; flt = ones(1,N)/N; - for i=1:length(var), - epoch = var{i}.epochs(end); - q = cumsum(var{i}.prelogits_hist(:,epoch)); - q2 = q / q(end); - plot(linspace(0,10,1000), q2, lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'SouthEast','FontSize',fontSize); - hold off -end - -if 0 - %% Plot prelogits norm - h = 32; figure(h); clf; hold on; - title('Prelogits norm'); - xlabel('Step'); - ylabel('Prelogits norm'); - grid on; - N = 1; flt = ones(1,N)/N; - for i=1:length(var), - plot(var{i}.steps, filter(flt, 1, var{i}.prelogits_norm), lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'NorthEast','FontSize',fontSize); - hold off -end - -if 0 - %% Plot learning rate - h = 42; figure(h); clf; hold on; - title('Learning rate'); - xlabel('Step'); - ylabel('Learning rate'); - grid on; - N = 1; flt = ones(1,N)/N; - for i=1:length(var), - semilogy(var{i}.epochs, filter(flt, 1, var{i}.learning_rate(var{i}.epochs)), lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'NorthEast','FontSize',fontSize); - hold off -end - -if 0 - %% Plot center loss - h = 9; figure(h); close(h); figure(h); hold on; setsize(1.5); - title('Center loss'); - xlabel('Epochs'); - ylabel('Center loss'); - grid on; - N = 500; flt = ones(1,N)/N; - for i=1:length(var), - if isempty(var{i}.center_loss) - var{i}.center_loss = ones(size(var{i}.steps))*NaN; - end; - var{i}.center_loss(var{i}.center_loss==0) = NaN; - plot(var{i}.steps/var{i}.epoch_size, filter(flt, 1, var{i}.center_loss), lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'NorthEast','FontSize',fontSize); -end - -if 0 - %% Plot center loss with factor - h = 9; figure(h); close(h); figure(h); hold on; setsize(1.5); - title('Center loss with factor'); - xlabel('Epochs'); - ylabel('Center loss * center loss factor'); - grid on; - N = 500; flt = ones(1,N)/N; - for i=1:length(var), - if isempty(var{i}.center_loss) - var{i}.center_loss = ones(size(var{i}.steps))*NaN; - end; - var{i}.center_loss(var{i}.center_loss==0) = NaN; - plot(var{i}.steps/var{i}.epoch_size, filter(flt, 1, var{i}.center_loss*str2num(var{i}.cl)), lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'NorthEast','FontSize',fontSize); -end - -if 0 - %% Plot total loss - h = 4; figure(h); close(h); figure(h); hold on; setsize(1.5); - title('Total loss'); - xlabel('Epochs'); - ylabel('Total loss'); - grid on; - N = 500; flt = ones(1,N)/N; - for i=1:length(var), - var{i}.loss(var{i}.loss==0) = NaN; - plot(var{i}.steps/var{i}.epoch_size, filter(flt, 1, var{i}.loss), lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 'NorthEast','FontSize',fontSize); -end - -if 0 - %% Plot regularization loss - h = 5; figure(h); close(h); figure(h); hold on; setsize(1.5); - title('Regularization loss'); - xlabel('Epochs'); - ylabel('Regularization loss'); - grid on; - N = 500; flt = ones(1,N)/N; - for i=1:length(var), - var{i}.reg_loss(var{i}.reg_loss==0) = NaN; - plot(var{i}.steps/var{i}.epoch_size, filter(flt, 1, var{i}.reg_loss), lineStyles2{i}, 'LineWidth', lineWidth); - end; - legend(legends, 'Location', 
'NorthEast','FontSize',fontSize); -end From 106e19459a7daefc4365b2d4f3cf613fffa58a8c Mon Sep 17 00:00:00 2001 From: Sarattha-BAG Date: Tue, 23 Mar 2021 15:22:21 +0700 Subject: [PATCH 2/7] Commit gitignore --- .gitignore | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/.gitignore b/.gitignore index 99c0f9693..4587ac454 100644 --- a/.gitignore +++ b/.gitignore @@ -92,3 +92,12 @@ ENV/ # PyCharm project setting .idea + +datasets +datasets/* +logs +logs/* +models +models/* + +.vscode From 623d7b4c9993a07f869b7c5bbcf3b7b0a55af82d Mon Sep 17 00:00:00 2001 From: Sarattha-BAG Date: Tue, 23 Mar 2021 15:43:11 +0700 Subject: [PATCH 3/7] Add scripts for tensorflow v2 --- .gitignore | 3 --- src_v2/facenet.py | 0 src_v2/models/inception_resnet_v1.py | 26 ++++++++++++++++++++++++++ src_v2/train_softmax.py | 0 4 files changed, 26 insertions(+), 3 deletions(-) create mode 100644 src_v2/facenet.py create mode 100644 src_v2/models/inception_resnet_v1.py create mode 100644 src_v2/train_softmax.py diff --git a/.gitignore b/.gitignore index 4587ac454..151e868c7 100644 --- a/.gitignore +++ b/.gitignore @@ -97,7 +97,4 @@ datasets datasets/* logs logs/* -models -models/* - .vscode diff --git a/src_v2/facenet.py b/src_v2/facenet.py new file mode 100644 index 000000000..e69de29bb diff --git a/src_v2/models/inception_resnet_v1.py b/src_v2/models/inception_resnet_v1.py new file mode 100644 index 000000000..2d72a2dff --- /dev/null +++ b/src_v2/models/inception_resnet_v1.py @@ -0,0 +1,26 @@ +# Copyright 2016 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== + +"""Contains the definition of the Inception Resnet V1 architecture. +As described in http://arxiv.org/abs/1602.07261. + Inception-v4, Inception-ResNet and the Impact of Residual Connections + on Learning + Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi +""" +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import tensorflow as tf diff --git a/src_v2/train_softmax.py b/src_v2/train_softmax.py new file mode 100644 index 000000000..e69de29bb From 986167c5a85dfef5218fdb18a6e3d0e8ad283c48 Mon Sep 17 00:00:00 2001 From: Sarattha-BAG Date: Wed, 24 Mar 2021 01:10:21 +0700 Subject: [PATCH 4/7] Add new function --- src_v2/facenet.py | 463 +++++++++++++++++++++++++++ src_v2/models/inception_resnet_v1.py | 37 +++ src_v2/train_softmax.py | 404 +++++++++++++++++++++++ 3 files changed, 904 insertions(+) diff --git a/src_v2/facenet.py b/src_v2/facenet.py index e69de29bb..5407ce8f9 100644 --- a/src_v2/facenet.py +++ b/src_v2/facenet.py @@ -0,0 +1,463 @@ +"""Functions for building the face recognition network. 
+""" +# MIT License +# +# Copyright (c) 2016 David Sandberg +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. + +# pylint: disable=missing-docstring +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import os +from subprocess import Popen, PIPE +import tensorflow as tf +import numpy as np +from scipy import misc +from sklearn.model_selection import KFold +from scipy import interpolate +from tensorflow.python.training import training +import random +import re +from tensorflow.python.platform import gfile +import math +from six import iteritems + +def get_image_paths_and_labels(dataset): + image_paths_flat = [] + labels_flat = [] + for i in range(len(dataset)): + image_paths_flat += dataset[i].image_paths + labels_flat += [i] * len(dataset[i].image_paths) + return image_paths_flat, labels_flat + +def shuffle_examples(image_paths, labels): + shuffle_list = list(zip(image_paths, labels)) + random.shuffle(shuffle_list) + image_paths_shuff, labels_shuff = zip(*shuffle_list) + return image_paths_shuff, labels_shuff + +def random_rotate_image(image): + angle = np.random.uniform(low=-10.0, high=10.0) + return misc.imrotate(image, angle, 'bicubic') + +# 1: Random rotate 2: Random crop 4: Random flip 8: Fixed image standardization 16: Flip +RANDOM_ROTATE = 1 +RANDOM_CROP = 2 +RANDOM_FLIP = 4 +FIXED_STANDARDIZATION = 8 +FLIP = 16 +def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder): + images_and_labels_list = [] + for _ in range(nrof_preprocess_threads): + filenames, label, control = input_queue.dequeue() + images = [] + for filename in tf.unstack(filenames): + file_contents = tf.io.read_file(filename) + image = tf.image.decode_image(file_contents, 3) + image = tf.cond(pred=get_control_flag(control[0], RANDOM_ROTATE), + true_fn=lambda:tf.compat.v1.py_func(random_rotate_image, [image], tf.uint8), + false_fn=lambda:tf.identity(image)) + image = tf.cond(pred=get_control_flag(control[0], RANDOM_CROP), + true_fn=lambda:tf.image.random_crop(image, image_size + (3,)), + false_fn=lambda:tf.image.resize_with_crop_or_pad(image, image_size[0], image_size[1])) + image = tf.cond(pred=get_control_flag(control[0], RANDOM_FLIP), + true_fn=lambda:tf.image.random_flip_left_right(image), + false_fn=lambda:tf.identity(image)) + image = tf.cast(image, tf.float32) + image = tf.cond(pred=get_control_flag(control[0], FIXED_STANDARDIZATION), + true_fn=lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, + 
false_fn=lambda:tf.image.per_image_standardization(image)) + image = tf.cond(pred=get_control_flag(control[0], FLIP), + true_fn=lambda:tf.image.flip_left_right(image), + false_fn=lambda:tf.identity(image)) + #pylint: disable=no-member + image.set_shape(image_size + (3,)) + images.append(image) + images_and_labels_list.append([images, label]) + + image_batch, label_batch = tf.compat.v1.train.batch_join( + images_and_labels_list, batch_size=batch_size_placeholder, + shapes=[image_size + (3,), ()], enqueue_many=True, + capacity=4 * nrof_preprocess_threads * 100, + allow_smaller_final_batch=True) + + return image_batch, label_batch + +def get_control_flag(control, field): + return tf.equal(tf.math.mod(tf.math.floordiv(control, field), 2), 1) + +def prewhiten(x): + mean = np.mean(x) + std = np.std(x) + std_adj = np.maximum(std, 1.0/np.sqrt(x.size)) + y = np.multiply(np.subtract(x, mean), 1/std_adj) + return y + +def crop(image, random_crop, image_size): + if image.shape[1]>image_size: + sz1 = int(image.shape[1]//2) + sz2 = int(image_size//2) + if random_crop: + diff = sz1-sz2 + (h, v) = (np.random.randint(-diff, diff+1), np.random.randint(-diff, diff+1)) + else: + (h, v) = (0,0) + image = image[(sz1-sz2+v):(sz1+sz2+v),(sz1-sz2+h):(sz1+sz2+h),:] + return image + +def flip(image, random_flip): + if random_flip and np.random.choice([True, False]): + image = np.fliplr(image) + return image + +def to_rgb(img): + w, h = img.shape + ret = np.empty((w, h, 3), dtype=np.uint8) + ret[:, :, 0] = ret[:, :, 1] = ret[:, :, 2] = img + return ret + +def load_data(image_paths, do_random_crop, do_random_flip, image_size, do_prewhiten=True): + nrof_samples = len(image_paths) + images = np.zeros((nrof_samples, image_size, image_size, 3)) + for i in range(nrof_samples): + img = misc.imread(image_paths[i]) + if img.ndim == 2: + img = to_rgb(img) + if do_prewhiten: + img = prewhiten(img) + img = crop(img, do_random_crop, image_size) + img = flip(img, do_random_flip) + images[i,:,:,:] = img + return images + +def get_label_batch(label_data, batch_size, batch_index): + nrof_examples = np.size(label_data, 0) + j = batch_index*batch_size % nrof_examples + if j+batch_size<=nrof_examples: + batch = label_data[j:j+batch_size] + else: + x1 = label_data[j:nrof_examples] + x2 = label_data[0:nrof_examples-j] + batch = np.vstack([x1,x2]) + batch_int = batch.astype(np.int64) + return batch_int + +def get_batch(image_data, batch_size, batch_index): + nrof_examples = np.size(image_data, 0) + j = batch_index*batch_size % nrof_examples + if j+batch_size<=nrof_examples: + batch = image_data[j:j+batch_size,:,:,:] + else: + x1 = image_data[j:nrof_examples,:,:,:] + x2 = image_data[0:nrof_examples-j,:,:,:] + batch = np.vstack([x1,x2]) + batch_float = batch.astype(np.float32) + return batch_float + +def get_triplet_batch(triplets, batch_index, batch_size): + ax, px, nx = triplets + a = get_batch(ax, int(batch_size/3), batch_index) + p = get_batch(px, int(batch_size/3), batch_index) + n = get_batch(nx, int(batch_size/3), batch_index) + batch = np.vstack([a, p, n]) + return batch + +def get_learning_rate_from_file(filename, epoch): + with open(filename, 'r') as f: + for line in f.readlines(): + line = line.split('#', 1)[0] + if line: + par = line.strip().split(':') + e = int(par[0]) + if par[1]=='-': + lr = -1 + else: + lr = float(par[1]) + if e <= epoch: + learning_rate = lr + else: + return learning_rate + +class ImageClass(): + "Stores the paths to images for a given class" + def __init__(self, name, image_paths): + self.name = name + 
self.image_paths = image_paths + + def __str__(self): + return self.name + ', ' + str(len(self.image_paths)) + ' images' + + def __len__(self): + return len(self.image_paths) + +def get_dataset(path, has_class_directories=True): + dataset = [] + path_exp = os.path.expanduser(path) + classes = [path for path in os.listdir(path_exp) \ + if os.path.isdir(os.path.join(path_exp, path))] + classes.sort() + nrof_classes = len(classes) + for i in range(nrof_classes): + class_name = classes[i] + facedir = os.path.join(path_exp, class_name) + image_paths = get_image_paths(facedir) + dataset.append(ImageClass(class_name, image_paths)) + + return dataset + +def get_image_paths(facedir): + image_paths = [] + if os.path.isdir(facedir): + images = os.listdir(facedir) + image_paths = [os.path.join(facedir,img) for img in images] + return image_paths + +def split_dataset(dataset, split_ratio, min_nrof_images_per_class, mode): + if mode=='SPLIT_CLASSES': + nrof_classes = len(dataset) + class_indices = np.arange(nrof_classes) + np.random.shuffle(class_indices) + split = int(round(nrof_classes*(1-split_ratio))) + train_set = [dataset[i] for i in class_indices[0:split]] + test_set = [dataset[i] for i in class_indices[split:-1]] + elif mode=='SPLIT_IMAGES': + train_set = [] + test_set = [] + for cls in dataset: + paths = cls.image_paths + np.random.shuffle(paths) + nrof_images_in_class = len(paths) + split = int(math.floor(nrof_images_in_class*(1-split_ratio))) + if split==nrof_images_in_class: + split = nrof_images_in_class-1 + if split>=min_nrof_images_per_class and nrof_images_in_class-split>=1: + train_set.append(ImageClass(cls.name, paths[:split])) + test_set.append(ImageClass(cls.name, paths[split:])) + else: + raise ValueError('Invalid train/test split mode "%s"' % mode) + return train_set, test_set + +def load_model(model, input_map=None): + # Check if the model is a model directory (containing a metagraph and a checkpoint file) + # or if it is a protobuf file with a frozen graph + model_exp = os.path.expanduser(model) + if (os.path.isfile(model_exp)): + print('Model filename: %s' % model_exp) + with gfile.FastGFile(model_exp,'rb') as f: + graph_def = tf.compat.v1.GraphDef() + graph_def.ParseFromString(f.read()) + tf.import_graph_def(graph_def, input_map=input_map, name='') + else: + print('Model directory: %s' % model_exp) + meta_file, ckpt_file = get_model_filenames(model_exp) + + print('Metagraph file: %s' % meta_file) + print('Checkpoint file: %s' % ckpt_file) + + saver = tf.compat.v1.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map) + saver.restore(tf.compat.v1.get_default_session(), os.path.join(model_exp, ckpt_file)) + +def get_model_filenames(model_dir): + files = os.listdir(model_dir) + meta_files = [s for s in files if s.endswith('.meta')] + if len(meta_files)==0: + raise ValueError('No meta file found in the model directory (%s)' % model_dir) + elif len(meta_files)>1: + raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir) + meta_file = meta_files[0] + ckpt = tf.train.get_checkpoint_state(model_dir) + if ckpt and ckpt.model_checkpoint_path: + ckpt_file = os.path.basename(ckpt.model_checkpoint_path) + return meta_file, ckpt_file + + meta_files = [s for s in files if '.ckpt' in s] + max_step = -1 + for f in files: + step_str = re.match(r'(^model-[\w\- ]+.ckpt-(\d+))', f) + if step_str is not None and len(step_str.groups())>=2: + step = int(step_str.groups()[1]) + if step > max_step: + max_step = step + ckpt_file = 
step_str.groups()[0] + return meta_file, ckpt_file + +def distance(embeddings1, embeddings2, distance_metric=0): + if distance_metric==0: + # Euclidian distance + diff = np.subtract(embeddings1, embeddings2) + dist = np.sum(np.square(diff),1) + elif distance_metric==1: + # Distance based on cosine similarity + dot = np.sum(np.multiply(embeddings1, embeddings2), axis=1) + norm = np.linalg.norm(embeddings1, axis=1) * np.linalg.norm(embeddings2, axis=1) + similarity = dot / norm + dist = np.arccos(similarity) / math.pi + else: + raise 'Undefined distance metric %d' % distance_metric + + return dist + +def calculate_roc(thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10, distance_metric=0, subtract_mean=False): + assert(embeddings1.shape[0] == embeddings2.shape[0]) + assert(embeddings1.shape[1] == embeddings2.shape[1]) + nrof_pairs = min(len(actual_issame), embeddings1.shape[0]) + nrof_thresholds = len(thresholds) + k_fold = KFold(n_splits=nrof_folds, shuffle=False) + + tprs = np.zeros((nrof_folds,nrof_thresholds)) + fprs = np.zeros((nrof_folds,nrof_thresholds)) + accuracy = np.zeros((nrof_folds)) + + indices = np.arange(nrof_pairs) + + for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)): + if subtract_mean: + mean = np.mean(np.concatenate([embeddings1[train_set], embeddings2[train_set]]), axis=0) + else: + mean = 0.0 + dist = distance(embeddings1-mean, embeddings2-mean, distance_metric) + + # Find the best threshold for the fold + acc_train = np.zeros((nrof_thresholds)) + for threshold_idx, threshold in enumerate(thresholds): + _, _, acc_train[threshold_idx] = calculate_accuracy(threshold, dist[train_set], actual_issame[train_set]) + best_threshold_index = np.argmax(acc_train) + for threshold_idx, threshold in enumerate(thresholds): + tprs[fold_idx,threshold_idx], fprs[fold_idx,threshold_idx], _ = calculate_accuracy(threshold, dist[test_set], actual_issame[test_set]) + _, _, accuracy[fold_idx] = calculate_accuracy(thresholds[best_threshold_index], dist[test_set], actual_issame[test_set]) + + tpr = np.mean(tprs,0) + fpr = np.mean(fprs,0) + return tpr, fpr, accuracy + +def calculate_accuracy(threshold, dist, actual_issame): + predict_issame = np.less(dist, threshold) + tp = np.sum(np.logical_and(predict_issame, actual_issame)) + fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) + tn = np.sum(np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame))) + fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame)) + + tpr = 0 if (tp+fn==0) else float(tp) / float(tp+fn) + fpr = 0 if (fp+tn==0) else float(fp) / float(fp+tn) + acc = float(tp+tn)/dist.size + return tpr, fpr, acc + +def calculate_val(thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10, distance_metric=0, subtract_mean=False): + assert(embeddings1.shape[0] == embeddings2.shape[0]) + assert(embeddings1.shape[1] == embeddings2.shape[1]) + nrof_pairs = min(len(actual_issame), embeddings1.shape[0]) + nrof_thresholds = len(thresholds) + k_fold = KFold(n_splits=nrof_folds, shuffle=False) + + val = np.zeros(nrof_folds) + far = np.zeros(nrof_folds) + + indices = np.arange(nrof_pairs) + + for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)): + if subtract_mean: + mean = np.mean(np.concatenate([embeddings1[train_set], embeddings2[train_set]]), axis=0) + else: + mean = 0.0 + dist = distance(embeddings1-mean, embeddings2-mean, distance_metric) + + # Find the threshold that gives FAR = far_target + far_train = 
np.zeros(nrof_thresholds) + for threshold_idx, threshold in enumerate(thresholds): + _, far_train[threshold_idx] = calculate_val_far(threshold, dist[train_set], actual_issame[train_set]) + if np.max(far_train)>=far_target: + f = interpolate.interp1d(far_train, thresholds, kind='slinear') + threshold = f(far_target) + else: + threshold = 0.0 + + val[fold_idx], far[fold_idx] = calculate_val_far(threshold, dist[test_set], actual_issame[test_set]) + + val_mean = np.mean(val) + far_mean = np.mean(far) + val_std = np.std(val) + return val_mean, val_std, far_mean + +def calculate_val_far(threshold, dist, actual_issame): + predict_issame = np.less(dist, threshold) + true_accept = np.sum(np.logical_and(predict_issame, actual_issame)) + false_accept = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) + n_same = np.sum(actual_issame) + n_diff = np.sum(np.logical_not(actual_issame)) + val = float(true_accept) / float(n_same) + far = float(false_accept) / float(n_diff) + return val, far + +def store_revision_info(src_path, output_dir, arg_string): + try: + # Get git hash + cmd = ['git', 'rev-parse', 'HEAD'] + gitproc = Popen(cmd, stdout = PIPE, cwd=src_path) + (stdout, _) = gitproc.communicate() + git_hash = stdout.strip() + except OSError as e: + git_hash = ' '.join(cmd) + ': ' + e.strerror + + try: + # Get local changes + cmd = ['git', 'diff', 'HEAD'] + gitproc = Popen(cmd, stdout = PIPE, cwd=src_path) + (stdout, _) = gitproc.communicate() + git_diff = stdout.strip() + except OSError as e: + git_diff = ' '.join(cmd) + ': ' + e.strerror + + # Store a text file in the log directory + rev_info_filename = os.path.join(output_dir, 'revision_info.txt') + with open(rev_info_filename, "w") as text_file: + text_file.write('arguments: %s\n--------------------\n' % arg_string) + text_file.write('tensorflow version: %s\n--------------------\n' % tf.__version__) # @UndefinedVariable + text_file.write('git hash: %s\n--------------------\n' % git_hash) + text_file.write('%s' % git_diff) + +def list_variables(filename): + reader = training.NewCheckpointReader(filename) + variable_map = reader.get_variable_to_shape_map() + names = sorted(variable_map.keys()) + return names + +def put_images_on_grid(images, shape=(16,8)): + nrof_images = images.shape[0] + img_size = images.shape[1] + bw = 3 + img = np.zeros((shape[1]*(img_size+bw)+bw, shape[0]*(img_size+bw)+bw, 3), np.float32) + for i in range(shape[1]): + x_start = i*(img_size+bw)+bw + for j in range(shape[0]): + img_index = i*shape[0]+j + if img_index>=nrof_images: + break + y_start = j*(img_size+bw)+bw + img[x_start:x_start+img_size, y_start:y_start+img_size, :] = images[img_index, :, :, :] + if img_index>=nrof_images: + break + return img + +def write_arguments_to_file(args, filename): + with open(filename, 'w') as f: + for key, value in iteritems(vars(args)): + f.write('%s: %s\n' % (key, str(value))) \ No newline at end of file diff --git a/src_v2/models/inception_resnet_v1.py b/src_v2/models/inception_resnet_v1.py index 2d72a2dff..26fe33315 100644 --- a/src_v2/models/inception_resnet_v1.py +++ b/src_v2/models/inception_resnet_v1.py @@ -24,3 +24,40 @@ from __future__ import print_function import tensorflow as tf + +class Block35(tf.keras.Model): + def __init__(self): + super(Block35, self).__init__(name='') + # Branch_0 + self.conv1 = tf.keras.layers.Conv2D(32, (1, 1),padding='same') + # Branch_1 + self.conv2a = tf.keras.layers.Conv2D(32, (1, 1),padding='same') + self.conv2b = tf.keras.layers.Conv2D(32, (3, 3),padding='same') + # Branch_2 + 
self.conv3a = tf.keras.layers.Conv2D(32, (1, 1),padding='same') + self.conv3b = tf.keras.layers.Conv2D(32, (3, 3),padding='same') + self.conv3c = tf.keras.layers.Conv2D(32, (3, 3),padding='same') + # Up + self.convup = tf.keras.layers.Conv2D(32, (1, 1),padding='same') + + + def call(self, input_tensor, scale = 1.0, activation_fn=tf.nn.relu,): + # Branch_0 + x = self.conv1(input_tensor) + # Branch_1 + y_1 = self.conv2a(input_tensor) + y_2 = self.conv2b(y_1) + # Branch_2 + z_1 = self.conv3a(input_tensor) + z_2 = self.conv3b(z_1) + z_3 = self.conv3c(z_2) + + mixed = tf.concat([x, y_2, z_3], 3) + up = tf.keras.layers.Conv2D(input_tensor.get_shape()[3], (1,1))(mixed) + + input_tensor += scale * up + if activation_fn: + input_tensor = activation_fn(input_tensor) + + + return input_tensor \ No newline at end of file diff --git a/src_v2/train_softmax.py b/src_v2/train_softmax.py index e69de29bb..49b3e6806 100644 --- a/src_v2/train_softmax.py +++ b/src_v2/train_softmax.py @@ -0,0 +1,404 @@ +"""Training a face recognizer with TensorFlow using softmax cross entropy loss +""" +# MIT License +# +# Copyright (c) 2016 David Sandberg +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
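
For readers following the new Keras code, the `Block35` class added above in `src_v2/models/inception_resnet_v1.py` re-implements the 35x35 Inception-ResNet-A unit: three convolutional branches are concatenated, projected back to the input depth with a 1x1 convolution, scaled, and added to the input. A minimal sketch of exercising it on a dummy tensor follows; it is an illustration only (not part of the patch) and assumes TF 2.4 with `src_v2/` importable from the working directory.

```python
# Illustration only, not part of the patch: shape check for the Block35 residual unit.
import tensorflow as tf
from src_v2.models.inception_resnet_v1 import Block35  # module added by this patch

block = Block35()
x = tf.random.normal((1, 35, 35, 256))   # dummy 35x35 feature map with 256 channels
y = block(x, scale=0.17)                 # 0.17 matches the block35 scale used in src/models
print(y.shape)                           # residual add preserves the shape: (1, 35, 35, 256)
```
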
+ +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +from datetime import datetime +import os.path +import time +import sys +import random +import tensorflow as tf +import numpy as np +import importlib +import argparse +import facenet +import lfw +import h5py +import math +from tensorflow.python.ops import data_flow_ops +from tensorflow.python.framework import ops +from tensorflow.python.ops import array_ops + +def main(args): + network = importlib.import_module(args.model_def) + image_size = (args.image_size, args.image_size) + + subdir = datetime.strftime(datetime.now(), '%Y%m%d-%H%M%S') + log_dir = os.path.join(os.path.expanduser(args.logs_base_dir), subdir) + if not os.path.isdir(log_dir): # Create the log directory if it doesn't exist + os.makedirs(log_dir) + model_dir = os.path.join(os.path.expanduser(args.models_base_dir), subdir) + if not os.path.isdir(model_dir): # Create the model directory if it doesn't exist + os.makedirs(model_dir) + + stat_file_name = os.path.join(log_dir, 'stat.h5') + + # Write arguments to a text file + facenet.write_arguments_to_file(args, os.path.join(log_dir, 'arguments.txt')) + + # Store some git revision info in a text file in the log directory + src_path,_ = os.path.split(os.path.realpath(__file__)) + facenet.store_revision_info(src_path, log_dir, ' '.join(sys.argv)) + + np.random.seed(seed=args.seed) + random.seed(args.seed) + dataset = facenet.get_dataset(args.data_dir) + if args.filter_filename: + dataset = filter_dataset(dataset, os.path.expanduser(args.filter_filename), + args.filter_percentile, args.filter_min_nrof_images_per_class) + + if args.validation_set_split_ratio>0.0: + train_set, val_set = facenet.split_dataset(dataset, args.validation_set_split_ratio, args.min_nrof_val_images_per_class, 'SPLIT_IMAGES') + else: + train_set, val_set = dataset, [] + + nrof_classes = len(train_set) + + print('Model directory: %s' % model_dir) + print('Log directory: %s' % log_dir) + + pretrained_model = None + if args.pretrained_model: + pretrained_model = os.path.expanduser(args.pretrained_model) + print('Pre-trained model: %s' % pretrained_model) + + +def find_threshold(var, percentile): + hist, bin_edges = np.histogram(var, 100) + cdf = np.float32(np.cumsum(hist)) / np.sum(hist) + bin_centers = (bin_edges[:-1]+bin_edges[1:])/2 + #plt.plot(bin_centers, cdf) + threshold = np.interp(percentile*0.01, cdf, bin_centers) + return threshold + +def filter_dataset(dataset, data_filename, percentile, min_nrof_images_per_class): + with h5py.File(data_filename,'r') as f: + distance_to_center = np.array(f.get('distance_to_center')) + label_list = np.array(f.get('label_list')) + image_list = np.array(f.get('image_list')) + distance_to_center_threshold = find_threshold(distance_to_center, percentile) + indices = np.where(distance_to_center>=distance_to_center_threshold)[0] + filtered_dataset = dataset + removelist = [] + for i in indices: + label = label_list[i] + image = image_list[i] + if image in filtered_dataset[label].image_paths: + filtered_dataset[label].image_paths.remove(image) + if len(filtered_dataset[label].image_paths)0.0: + lr = args.learning_rate + else: + lr = facenet.get_learning_rate_from_file(learning_rate_schedule_file, epoch) + + if lr<=0: + return False + + index_epoch = sess.run(index_dequeue_op) + label_epoch = np.array(label_list)[index_epoch] + image_epoch = np.array(image_list)[index_epoch] + + # Enqueue one epoch of image paths and labels + labels_array = 
np.expand_dims(np.array(label_epoch),1) + image_paths_array = np.expand_dims(np.array(image_epoch),1) + control_value = facenet.RANDOM_ROTATE * random_rotate + facenet.RANDOM_CROP * random_crop + facenet.RANDOM_FLIP * random_flip + facenet.FIXED_STANDARDIZATION * use_fixed_image_standardization + control_array = np.ones_like(labels_array) * control_value + sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array}) + + # Training loop + train_time = 0 + while batch_number < args.epoch_size: + start_time = time.time() + feed_dict = {learning_rate_placeholder: lr, phase_train_placeholder:True, batch_size_placeholder:args.batch_size} + tensor_list = [loss, train_op, step, reg_losses, prelogits, cross_entropy_mean, learning_rate, prelogits_norm, accuracy, prelogits_center_loss] + if batch_number % 100 == 0: + loss_, _, step_, reg_losses_, prelogits_, cross_entropy_mean_, lr_, prelogits_norm_, accuracy_, center_loss_, summary_str = sess.run(tensor_list + [summary_op], feed_dict=feed_dict) + summary_writer.add_summary(summary_str, global_step=step_) + else: + loss_, _, step_, reg_losses_, prelogits_, cross_entropy_mean_, lr_, prelogits_norm_, accuracy_, center_loss_ = sess.run(tensor_list, feed_dict=feed_dict) + + duration = time.time() - start_time + stat['loss'][step_-1] = loss_ + stat['center_loss'][step_-1] = center_loss_ + stat['reg_loss'][step_-1] = np.sum(reg_losses_) + stat['xent_loss'][step_-1] = cross_entropy_mean_ + stat['prelogits_norm'][step_-1] = prelogits_norm_ + stat['learning_rate'][epoch-1] = lr_ + stat['accuracy'][step_-1] = accuracy_ + stat['prelogits_hist'][epoch-1,:] += np.histogram(np.minimum(np.abs(prelogits_), prelogits_hist_max), bins=1000, range=(0.0, prelogits_hist_max))[0] + + duration = time.time() - start_time + print('Epoch: [%d][%d/%d]\tTime %.3f\tLoss %2.3f\tXent %2.3f\tRegLoss %2.3f\tAccuracy %2.3f\tLr %2.5f\tCl %2.3f' % + (epoch, batch_number+1, args.epoch_size, duration, loss_, cross_entropy_mean_, np.sum(reg_losses_), accuracy_, lr_, center_loss_)) + batch_number += 1 + train_time += duration + # Add validation loss and accuracy to summary + summary = tf.summary() + #pylint: disable=maybe-no-member + summary.value.add(tag='time/total', simple_value=train_time) + summary_writer.add_summary(summary, global_step=step_) + return True + +def validate(args, sess, epoch, image_list, label_list, enqueue_op, image_paths_placeholder, labels_placeholder, control_placeholder, + phase_train_placeholder, batch_size_placeholder, + stat, loss, regularization_losses, cross_entropy_mean, accuracy, validate_every_n_epochs, use_fixed_image_standardization): + + print('Running forward pass on validation set') + + nrof_batches = len(label_list) // args.lfw_batch_size + nrof_images = nrof_batches * args.lfw_batch_size + + # Enqueue one epoch of image paths and labels + labels_array = np.expand_dims(np.array(label_list[:nrof_images]),1) + image_paths_array = np.expand_dims(np.array(image_list[:nrof_images]),1) + control_array = np.ones_like(labels_array, np.int32)*facenet.FIXED_STANDARDIZATION * use_fixed_image_standardization + sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array}) + + loss_array = np.zeros((nrof_batches,), np.float32) + xent_array = np.zeros((nrof_batches,), np.float32) + accuracy_array = np.zeros((nrof_batches,), np.float32) + + # Training loop + start_time = time.time() + for i in range(nrof_batches): + 
feed_dict = {phase_train_placeholder:False, batch_size_placeholder:args.lfw_batch_size} + loss_, cross_entropy_mean_, accuracy_ = sess.run([loss, cross_entropy_mean, accuracy], feed_dict=feed_dict) + loss_array[i], xent_array[i], accuracy_array[i] = (loss_, cross_entropy_mean_, accuracy_) + if i % 10 == 9: + print('.', end='') + sys.stdout.flush() + print('') + + duration = time.time() - start_time + + val_index = (epoch-1)//validate_every_n_epochs + stat['val_loss'][val_index] = np.mean(loss_array) + stat['val_xent_loss'][val_index] = np.mean(xent_array) + stat['val_accuracy'][val_index] = np.mean(accuracy_array) + + print('Validation Epoch: %d\tTime %.3f\tLoss %2.3f\tXent %2.3f\tAccuracy %2.3f' % + (epoch, duration, np.mean(loss_array), np.mean(xent_array), np.mean(accuracy_array))) + +def evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, + embeddings, labels, image_paths, actual_issame, batch_size, nrof_folds, log_dir, step, summary_writer, stat, epoch, distance_metric, subtract_mean, use_flipped_images, use_fixed_image_standardization): + start_time = time.time() + # Run forward pass to calculate embeddings + print('Runnning forward pass on LFW images') + + # Enqueue one epoch of image paths and labels + nrof_embeddings = len(actual_issame)*2 # nrof_pairs * nrof_images_per_pair + nrof_flips = 2 if use_flipped_images else 1 + nrof_images = nrof_embeddings * nrof_flips + labels_array = np.expand_dims(np.arange(0,nrof_images),1) + image_paths_array = np.expand_dims(np.repeat(np.array(image_paths),nrof_flips),1) + control_array = np.zeros_like(labels_array, np.int32) + if use_fixed_image_standardization: + control_array += np.ones_like(labels_array)*facenet.FIXED_STANDARDIZATION + if use_flipped_images: + # Flip every second image + control_array += (labels_array % 2)*facenet.FLIP + sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array}) + + embedding_size = int(embeddings.get_shape()[1]) + assert nrof_images % batch_size == 0, 'The number of LFW images must be an integer multiple of the LFW batch size' + nrof_batches = nrof_images // batch_size + emb_array = np.zeros((nrof_images, embedding_size)) + lab_array = np.zeros((nrof_images,)) + for i in range(nrof_batches): + feed_dict = {phase_train_placeholder:False, batch_size_placeholder:batch_size} + emb, lab = sess.run([embeddings, labels], feed_dict=feed_dict) + lab_array[lab] = lab + emb_array[lab, :] = emb + if i % 10 == 9: + print('.', end='') + sys.stdout.flush() + print('') + embeddings = np.zeros((nrof_embeddings, embedding_size*nrof_flips)) + if use_flipped_images: + # Concatenate embeddings for flipped and non flipped version of the images + embeddings[:,:embedding_size] = emb_array[0::2,:] + embeddings[:,embedding_size:] = emb_array[1::2,:] + else: + embeddings = emb_array + + assert np.array_equal(lab_array, np.arange(nrof_images))==True, 'Wrong labels used for evaluation, possibly caused by training examples left in the input pipeline' + _, _, accuracy, val, val_std, far = lfw.evaluate(embeddings, actual_issame, nrof_folds=nrof_folds, distance_metric=distance_metric, subtract_mean=subtract_mean) + + print('Accuracy: %2.5f+-%2.5f' % (np.mean(accuracy), np.std(accuracy))) + print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far)) + lfw_time = time.time() - start_time + # Add validation loss and accuracy to summary + summary = 
tf.compat.v1.Summary() + #pylint: disable=maybe-no-member + summary.value.add(tag='lfw/accuracy', simple_value=np.mean(accuracy)) + summary.value.add(tag='lfw/val_rate', simple_value=val) + summary.value.add(tag='time/lfw', simple_value=lfw_time) + summary_writer.add_summary(summary, step) + with open(os.path.join(log_dir,'lfw_result.txt'),'at') as f: + f.write('%d\t%.5f\t%.5f\n' % (step, np.mean(accuracy), val)) + stat['lfw_accuracy'][epoch-1] = np.mean(accuracy) + stat['lfw_valrate'][epoch-1] = val + +def save_variables_and_metagraph(sess, saver, summary_writer, model_dir, model_name, step): + # Save the model checkpoint + print('Saving variables') + start_time = time.time() + checkpoint_path = os.path.join(model_dir, 'model-%s.ckpt' % model_name) + saver.save(sess, checkpoint_path, global_step=step, write_meta_graph=False) + save_time_variables = time.time() - start_time + print('Variables saved in %.2f seconds' % save_time_variables) + metagraph_filename = os.path.join(model_dir, 'model-%s.meta' % model_name) + save_time_metagraph = 0 + if not os.path.exists(metagraph_filename): + print('Saving metagraph') + start_time = time.time() + saver.export_meta_graph(metagraph_filename) + save_time_metagraph = time.time() - start_time + print('Metagraph saved in %.2f seconds' % save_time_metagraph) + summary = tf.compat.v1.Summary() + #pylint: disable=maybe-no-member + summary.value.add(tag='time/save_variables', simple_value=save_time_variables) + summary.value.add(tag='time/save_metagraph', simple_value=save_time_metagraph) + summary_writer.add_summary(summary, step) + + +def parse_arguments(argv): + parser = argparse.ArgumentParser() + + parser.add_argument('--logs_base_dir', type=str, + help='Directory where to write event logs.', default='~/logs/facenet') + parser.add_argument('--models_base_dir', type=str, + help='Directory where to write trained models and checkpoints.', default='~/models/facenet') + parser.add_argument('--gpu_memory_fraction', type=float, + help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0) + parser.add_argument('--pretrained_model', type=str, + help='Load a pretrained model before training starts.') + parser.add_argument('--data_dir', type=str, + help='Path to the data directory containing aligned face patches.', + default='~/datasets/casia/casia_maxpy_mtcnnalign_182_160') + parser.add_argument('--model_def', type=str, + help='Model definition. Points to a module containing the definition of the inference graph.', default='models.inception_resnet_v1') + parser.add_argument('--max_nrof_epochs', type=int, + help='Number of epochs to run.', default=500) + parser.add_argument('--batch_size', type=int, + help='Number of images to process in a batch.', default=90) + parser.add_argument('--image_size', type=int, + help='Image size (height, width) in pixels.', default=160) + parser.add_argument('--epoch_size', type=int, + help='Number of batches per epoch.', default=1000) + parser.add_argument('--embedding_size', type=int, + help='Dimensionality of the embedding.', default=128) + parser.add_argument('--random_crop', + help='Performs random cropping of training images. If false, the center image_size pixels from the training images are used. 
' + + 'If the size of the images in the data directory is equal to image_size no cropping is performed', action='store_true') + parser.add_argument('--random_flip', + help='Performs random horizontal flipping of training images.', action='store_true') + parser.add_argument('--random_rotate', + help='Performs random rotations of training images.', action='store_true') + parser.add_argument('--use_fixed_image_standardization', + help='Performs fixed standardization of images.', action='store_true') + parser.add_argument('--keep_probability', type=float, + help='Keep probability of dropout for the fully connected layer(s).', default=1.0) + parser.add_argument('--weight_decay', type=float, + help='L2 weight regularization.', default=0.0) + parser.add_argument('--center_loss_factor', type=float, + help='Center loss factor.', default=0.0) + parser.add_argument('--center_loss_alfa', type=float, + help='Center update rate for center loss.', default=0.95) + parser.add_argument('--prelogits_norm_loss_factor', type=float, + help='Loss based on the norm of the activations in the prelogits layer.', default=0.0) + parser.add_argument('--prelogits_norm_p', type=float, + help='Norm to use for prelogits norm loss.', default=1.0) + parser.add_argument('--prelogits_hist_max', type=float, + help='The max value for the prelogits histogram.', default=10.0) + parser.add_argument('--optimizer', type=str, choices=['ADAGRAD', 'ADADELTA', 'ADAM', 'RMSPROP', 'MOM'], + help='The optimization algorithm to use', default='ADAGRAD') + parser.add_argument('--learning_rate', type=float, + help='Initial learning rate. If set to a negative value a learning rate ' + + 'schedule can be specified in the file "learning_rate_schedule.txt"', default=0.1) + parser.add_argument('--learning_rate_decay_epochs', type=int, + help='Number of epochs between learning rate decay.', default=100) + parser.add_argument('--learning_rate_decay_factor', type=float, + help='Learning rate decay factor.', default=1.0) + parser.add_argument('--moving_average_decay', type=float, + help='Exponential decay for tracking of training parameters.', default=0.9999) + parser.add_argument('--seed', type=int, + help='Random seed.', default=666) + parser.add_argument('--nrof_preprocess_threads', type=int, + help='Number of preprocessing (data loading and augmentation) threads.', default=4) + parser.add_argument('--log_histograms', + help='Enables logging of weight/bias histograms in tensorboard.', action='store_true') + parser.add_argument('--learning_rate_schedule_file', type=str, + help='File containing the learning rate schedule that is used when learning_rate is set to to -1.', default='data/learning_rate_schedule.txt') + parser.add_argument('--filter_filename', type=str, + help='File containing image data used for dataset filtering', default='') + parser.add_argument('--filter_percentile', type=float, + help='Keep only the percentile images closed to its class center', default=100.0) + parser.add_argument('--filter_min_nrof_images_per_class', type=int, + help='Keep only the classes with this number of examples or more', default=0) + parser.add_argument('--validate_every_n_epochs', type=int, + help='Number of epoch between validation', default=5) + parser.add_argument('--validation_set_split_ratio', type=float, + help='The ratio of the total dataset to use for validation', default=0.0) + parser.add_argument('--min_nrof_val_images_per_class', type=float, + help='Classes with fewer images will be removed from the validation set', default=0) + + # Parameters for 
validation on LFW + parser.add_argument('--lfw_pairs', type=str, + help='The file containing the pairs to use for validation.', default='data/pairs.txt') + parser.add_argument('--lfw_dir', type=str, + help='Path to the data directory containing aligned face patches.', default='') + parser.add_argument('--lfw_batch_size', type=int, + help='Number of images to process in a batch in the LFW test set.', default=100) + parser.add_argument('--lfw_nrof_folds', type=int, + help='Number of folds to use for cross validation. Mainly used for testing.', default=10) + parser.add_argument('--lfw_distance_metric', type=int, + help='Type of distance metric to use. 0: Euclidian, 1:Cosine similarity distance.', default=0) + parser.add_argument('--lfw_use_flipped_images', + help='Concatenates embeddings for the image and its horizontally flipped counterpart.', action='store_true') + parser.add_argument('--lfw_subtract_mean', + help='Subtract feature mean before calculating distance.', action='store_true') + return parser.parse_args(argv) + + +if __name__ == '__main__': + main(parse_arguments(sys.argv[1:])) From 805d6465596eb57a8a1c588b68bddf2a60b95a12 Mon Sep 17 00:00:00 2001 From: Sarattha K-Main PC Date: Thu, 29 Apr 2021 22:24:53 +0700 Subject: [PATCH 5/7] Update to use TF2.4 for both train and evaluation --- .gitignore | 11 +- README.md | 56 +++- requirements.txt | 54 +++- src/align/align_dataset_mtcnn.py | 14 +- src/align/detect_face.py | 34 +- src/calculate_filtering_metrics.py | 8 +- src/compare.py | 12 +- src/facenet.py | 58 ++-- src/freeze_graph.py | 12 +- src/models/inception_resnet_v1.py | 46 +-- src/train_softmax.py | 12 +- src/validate_on_lfw.py | 16 +- src_v2/facenet.py | 463 --------------------------- src_v2/models/inception_resnet_v1.py | 63 ---- src_v2/train_softmax.py | 404 ----------------------- tmp/align_dataset.m | 178 ++++++++++ tmp/detect_face_v1.m | 253 +++++++++++++++ tmp/detect_face_v2.m | 288 +++++++++++++++++ util/plot_learning_curves.m | 300 +++++++++++++++++ 19 files changed, 1233 insertions(+), 1049 deletions(-) delete mode 100644 src_v2/facenet.py delete mode 100644 src_v2/models/inception_resnet_v1.py delete mode 100644 src_v2/train_softmax.py create mode 100644 tmp/align_dataset.m create mode 100644 tmp/detect_face_v1.m create mode 100644 tmp/detect_face_v2.m create mode 100644 util/plot_learning_curves.m diff --git a/.gitignore b/.gitignore index 151e868c7..c229d37d0 100644 --- a/.gitignore +++ b/.gitignore @@ -93,8 +93,11 @@ ENV/ # PyCharm project setting .idea -datasets -datasets/* -logs -logs/* +# log +logs/ + +# model +models/ + +# vscode .vscode diff --git a/README.md b/README.md index 8e38aab05..9220d20e4 100644 --- a/README.md +++ b/README.md @@ -1 +1,55 @@ -# Face Recognition using Tensorflow v2 [![Build Status][travis-image]][travis] \ No newline at end of file +# Face Recognition using Tensorflow [![Build Status][travis-image]][travis] + +[travis-image]: http://travis-ci.org/davidsandberg/facenet.svg?branch=master +[travis]: http://travis-ci.org/davidsandberg/facenet + +This is a TensorFlow implementation of the face recognizer described in the paper +["FaceNet: A Unified Embedding for Face Recognition and Clustering"](http://arxiv.org/abs/1503.03832). The project also uses ideas from the paper ["Deep Face Recognition"](http://www.robots.ox.ac.uk/~vgg/publications/2015/Parkhi15/parkhi15.pdf) from the [Visual Geometry Group](http://www.robots.ox.ac.uk/~vgg/) at Oxford. 
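
As a quick orientation for readers of this patch series, here is a rough sketch of how a frozen pre-trained model (see the table below) can be used to compare two aligned 160x160 face crops, along the lines of `src/compare.py`. The tensor names (`input:0`, `embeddings:0`, `phase_train:0`) follow the conventions used elsewhere in this repository; the model and image paths are placeholders, and the snippet is an illustration rather than part of the patch.

```python
# Rough sketch (illustration only): squared L2 distance between two aligned face crops.
import numpy as np
import tensorflow as tf
from PIL import Image
import facenet  # src/facenet.py from this repository (src/ assumed to be on PYTHONPATH)

def load_crop(path, size=160):
    img = np.array(Image.open(path).convert('RGB').resize((size, size)), dtype=np.float32)
    return (img - 127.5) / 128.0  # fixed image standardization, as recommended for these models

with tf.Graph().as_default(), tf.compat.v1.Session() as sess:
    facenet.load_model('20180402-114759.pb')  # placeholder path to a frozen model file
    images = np.stack([load_crop('face_a.png'), load_crop('face_b.png')])
    emb = sess.run('embeddings:0', feed_dict={'input:0': images, 'phase_train:0': False})
    print('squared L2 distance:', np.sum(np.square(emb[0] - emb[1])))
```
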
+ +## Compatibility +The code is tested using Tensorflow r1.7 under Ubuntu 14.04 with Python 2.7 and Python 3.5. The test cases can be found [here](https://github.com/davidsandberg/facenet/tree/master/test) and the results can be found [here](http://travis-ci.org/davidsandberg/facenet). + +## News +| Date | Update | +|----------|--------| +| 2018-04-10 | Added new models trained on Casia-WebFace and VGGFace2 (see below). Note that the models uses fixed image standardization (see [wiki](https://github.com/davidsandberg/facenet/wiki/Training-using-the-VGGFace2-dataset)). | +| 2018-03-31 | Added a new, more flexible input pipeline as well as a bunch of minor updates. | +| 2017-05-13 | Removed a bunch of older non-slim models. Moved the last bottleneck layer into the respective models. Corrected normalization of Center Loss. | +| 2017-05-06 | Added code to [train a classifier on your own images](https://github.com/davidsandberg/facenet/wiki/Train-a-classifier-on-own-images). Renamed facenet_train.py to train_tripletloss.py and facenet_train_classifier.py to train_softmax.py. | +| 2017-03-02 | Added pretrained models that generate 128-dimensional embeddings.| +| 2017-02-22 | Updated to Tensorflow r1.0. Added Continuous Integration using Travis-CI.| +| 2017-02-03 | Added models where only trainable variables has been stored in the checkpoint. These are therefore significantly smaller. | +| 2017-01-27 | Added a model trained on a subset of the MS-Celeb-1M dataset. The LFW accuracy of this model is around 0.994. | +| 2017‑01‑02 | Updated to run with Tensorflow r0.12. Not sure if it runs with older versions of Tensorflow though. | + +## Pre-trained models +| Model name | LFW accuracy | Training dataset | Architecture | +|-----------------|--------------|------------------|-------------| +| [20180408-102900](https://drive.google.com/open?id=1R77HmFADxe87GmoLwzfgMu_HY0IhcyBz) | 0.9905 | CASIA-WebFace | [Inception ResNet v1](https://github.com/davidsandberg/facenet/blob/master/src/models/inception_resnet_v1.py) | +| [20180402-114759](https://drive.google.com/open?id=1EXPBSXwTaqrSC0OhUdXNmKSh9qJUQ55-) | 0.9965 | VGGFace2 | [Inception ResNet v1](https://github.com/davidsandberg/facenet/blob/master/src/models/inception_resnet_v1.py) | + +NOTE: If you use any of the models, please do not forget to give proper credit to those providing the training dataset as well. + +## Inspiration +The code is heavily inspired by the [OpenFace](https://github.com/cmusatyalab/openface) implementation. + +## Training data +The [CASIA-WebFace](http://www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html) dataset has been used for training. This training set consists of total of 453 453 images over 10 575 identities after face detection. Some performance improvement has been seen if the dataset has been filtered before training. Some more information about how this was done will come later. +The best performing model has been trained on the [VGGFace2](https://www.robots.ox.ac.uk/~vgg/data/vgg_face2/) dataset consisting of ~3.3M faces and ~9000 classes. + +## Pre-processing + +### Face alignment using MTCNN +One problem with the above approach seems to be that the Dlib face detector misses some of the hard examples (partial occlusion, silhouettes, etc). This makes the training set too "easy" which causes the model to perform worse on other benchmarks. +To solve this, other face landmark detectors has been tested. 
One face landmark detector that has proven to work very well in this setting is the +[Multi-task CNN](https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html). A Matlab/Caffe implementation can be found [here](https://github.com/kpzhang93/MTCNN_face_detection_alignment) and this has been used for face alignment with very good results. A Python/Tensorflow implementation of MTCNN can be found [here](https://github.com/davidsandberg/facenet/tree/master/src/align). This implementation does not give identical results to the Matlab/Caffe implementation but the performance is very similar. + +## Running training +Currently, the best results are achieved by training the model using softmax loss. Details on how to train a model using softmax loss on the CASIA-WebFace dataset can be found on the page [Classifier training of Inception-ResNet-v1](https://github.com/davidsandberg/facenet/wiki/Classifier-training-of-inception-resnet-v1) and . + +## Pre-trained models +### Inception-ResNet-v1 model +A couple of pretrained models are provided. They are trained using softmax loss with the Inception-Resnet-v1 model. The datasets has been aligned using [MTCNN](https://github.com/davidsandberg/facenet/tree/master/src/align). + +## Performance +The accuracy on LFW for the model [20180402-114759](https://drive.google.com/open?id=1EXPBSXwTaqrSC0OhUdXNmKSh9qJUQ55-) is 0.99650+-0.00252. A description of how to run the test can be found on the page [Validate on LFW](https://github.com/davidsandberg/facenet/wiki/Validate-on-lfw). Note that the input images to the model need to be standardized using fixed image standardization (use the option `--use_fixed_image_standardization` when running e.g. `validate_on_lfw.py`). diff --git a/requirements.txt b/requirements.txt index b7418c9ac..f177a2371 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,9 +1,45 @@ -tensorflow==1.7 -scipy -scikit-learn -opencv-python -h5py -matplotlib -Pillow -requests -psutil +absl-py==0.11.0 +astunparse==1.6.3 +cachetools==4.2.1 +certifi==2020.12.5 +chardet==4.0.0 +cycler==0.10.0 +flatbuffers==1.12 +gast==0.3.3 +google-auth==1.24.0 +google-auth-oauthlib==0.4.2 +google-pasta==0.2.0 +grpcio==1.32.0 +h5py==2.10.0 +idna==2.10 +joblib==1.0.0 +Keras-Preprocessing==1.1.2 +kiwisolver==1.3.1 +Markdown==3.3.3 +matplotlib==3.3.4 +numpy==1.19.5 +oauthlib==3.1.0 +opencv-python==4.5.1.48 +opt-einsum==3.3.0 +Pillow==8.1.0 +protobuf==3.14.0 +pyasn1==0.4.8 +pyasn1-modules==0.2.8 +pyparsing==2.4.7 +python-dateutil==2.8.1 +requests==2.25.1 +requests-oauthlib==1.3.0 +rsa==4.7 +scikit-learn==0.24.1 +scipy==1.6.0 +six==1.15.0 +tensorboard==2.4.1 +tensorboard-plugin-wit==1.8.0 +tensorflow==2.4.1 +tensorflow-estimator==2.4.0 +termcolor==1.1.0 +threadpoolctl==2.1.0 +typing-extensions==3.7.4.3 +urllib3==1.26.3 +Werkzeug==1.0.1 +wrapt==1.12.1 diff --git a/src/align/align_dataset_mtcnn.py b/src/align/align_dataset_mtcnn.py index 7d5e735e6..0307e9f74 100644 --- a/src/align/align_dataset_mtcnn.py +++ b/src/align/align_dataset_mtcnn.py @@ -35,6 +35,7 @@ import align.detect_face import random from time import sleep +from PIL import Image def main(args): sleep(random.random()) @@ -49,8 +50,8 @@ def main(args): print('Creating networks and loading parameters') with tf.Graph().as_default(): - gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) - sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) + gpu_options = 
tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) + sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) with sess.as_default(): pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None) @@ -80,7 +81,7 @@ def main(args): print(image_path) if not os.path.exists(output_filename): try: - img = misc.imread(image_path) + img = np.array(Image.open(image_path)) except (IOError, ValueError, IndexError) as e: errorMessage = '{}: {}'.format(image_path, e) print(errorMessage) @@ -121,14 +122,15 @@ def main(args): bb[2] = np.minimum(det[2]+args.margin/2, img_size[1]) bb[3] = np.minimum(det[3]+args.margin/2, img_size[0]) cropped = img[bb[1]:bb[3],bb[0]:bb[2],:] - scaled = misc.imresize(cropped, (args.image_size, args.image_size), interp='bilinear') + cropped = Image.fromarray(np.uint8(cropped)) + scaled = cropped.resize((args.image_size, args.image_size), Image.ANTIALIAS) nrof_successfully_aligned += 1 filename_base, file_extension = os.path.splitext(output_filename) if args.detect_multiple_faces: output_filename_n = "{}_{}{}".format(filename_base, i, file_extension) else: output_filename_n = "{}{}".format(filename_base, file_extension) - misc.imsave(output_filename_n, scaled) + scaled.save(output_filename_n) text_file.write('%s %d %d %d %d\n' % (output_filename_n, bb[0], bb[1], bb[2], bb[3])) else: print('Unable to align "%s"' % image_path) @@ -150,7 +152,7 @@ def parse_arguments(argv): parser.add_argument('--random_order', help='Shuffles the order of images to enable alignment using multiple processes.', action='store_true') parser.add_argument('--gpu_memory_fraction', type=float, - help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0) + help='Upper bound on the amount of GPU memory that will be used by the process.', default=0.8) parser.add_argument('--detect_multiple_faces', type=bool, help='Detect and align multiple faces per image.', default=False) return parser.parse_args(argv) diff --git a/src/align/detect_face.py b/src/align/detect_face.py index 7f98ca7fb..df1d58e51 100644 --- a/src/align/detect_face.py +++ b/src/align/detect_face.py @@ -82,13 +82,13 @@ def load(self, data_path, session, ignore_missing=False): session: The current TensorFlow session ignore_missing: If true, serialized weights for missing layers are ignored. 
""" - data_dict = np.load(data_path, encoding='latin1').item() #pylint: disable=no-member + data_dict = np.load(data_path, encoding='latin1', allow_pickle=True).item() #pylint: disable=no-member for op_name in data_dict: - with tf.variable_scope(op_name, reuse=True): + with tf.compat.v1.variable_scope(op_name, reuse=True): for param_name, data in iteritems(data_dict[op_name]): try: - var = tf.get_variable(param_name) + var = tf.compat.v1.get_variable(param_name) session.run(var.assign(data)) except ValueError: if not ignore_missing: @@ -122,7 +122,7 @@ def get_unique_name(self, prefix): def make_var(self, name, shape): """Creates a new TensorFlow variable.""" - return tf.get_variable(name, shape, trainable=self.trainable) + return tf.compat.v1.get_variable(name, shape, trainable=self.trainable) def validate_padding(self, padding): """Verifies that the padding is one of the supported ones.""" @@ -150,7 +150,7 @@ def conv(self, assert c_o % group == 0 # Convolution for a given input and kernel convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding) - with tf.variable_scope(name) as scope: + with tf.compat.v1.variable_scope(name) as scope: kernel = self.make_var('weights', shape=[k_h, k_w, c_i // group, c_o]) # This is the common-case. Convolve the input without any further complications. output = convolve(inp, kernel) @@ -165,7 +165,7 @@ def conv(self, @layer def prelu(self, inp, name): - with tf.variable_scope(name): + with tf.compat.v1.variable_scope(name): i = int(inp.get_shape()[-1]) alpha = self.make_var('alpha', shape=(i,)) output = tf.nn.relu(inp) + tf.multiply(alpha, -tf.nn.relu(-inp)) @@ -182,7 +182,7 @@ def max_pool(self, inp, k_h, k_w, s_h, s_w, name, padding='SAME'): @layer def fc(self, inp, num_out, name, relu=True): - with tf.variable_scope(name): + with tf.compat.v1.variable_scope(name): input_shape = inp.get_shape() if input_shape.ndims == 4: # The input is spatial. Vectorize it first. 
@@ -191,10 +191,10 @@ def fc(self, inp, num_out, name, relu=True): dim *= int(d) feed_in = tf.reshape(inp, [-1, dim]) else: - feed_in, dim = (inp, input_shape[-1].value) + feed_in, dim = (inp, input_shape.as_list()[-1]) weights = self.make_var('weights', shape=[dim, num_out]) biases = self.make_var('biases', [num_out]) - op = tf.nn.relu_layer if relu else tf.nn.xw_plus_b + op = tf.nn.relu_layer if relu else tf.compat.v1.nn.xw_plus_b fc = op(feed_in, weights, biases, name=name) return fc @@ -210,7 +210,7 @@ def softmax(self, target, axis, name=None): max_axis = tf.reduce_max(target, axis, keepdims=True) target_exp = tf.exp(target-max_axis) normalize = tf.reduce_sum(target_exp, axis, keepdims=True) - softmax = tf.div(target_exp, normalize, name) + softmax = tf.compat.v1.div(target_exp, normalize, name) return softmax class PNet(Network): @@ -277,16 +277,16 @@ def create_mtcnn(sess, model_path): if not model_path: model_path,_ = os.path.split(os.path.realpath(__file__)) - with tf.variable_scope('pnet'): - data = tf.placeholder(tf.float32, (None,None,None,3), 'input') + with tf.compat.v1.variable_scope('pnet'): + data = tf.compat.v1.placeholder(tf.float32, (None,None,None,3), 'input') pnet = PNet({'data':data}) pnet.load(os.path.join(model_path, 'det1.npy'), sess) - with tf.variable_scope('rnet'): - data = tf.placeholder(tf.float32, (None,24,24,3), 'input') + with tf.compat.v1.variable_scope('rnet'): + data = tf.compat.v1.placeholder(tf.float32, (None,24,24,3), 'input') rnet = RNet({'data':data}) rnet.load(os.path.join(model_path, 'det2.npy'), sess) - with tf.variable_scope('onet'): - data = tf.placeholder(tf.float32, (None,48,48,3), 'input') + with tf.compat.v1.variable_scope('onet'): + data = tf.compat.v1.placeholder(tf.float32, (None,48,48,3), 'input') onet = ONet({'data':data}) onet.load(os.path.join(model_path, 'det3.npy'), sess) @@ -708,7 +708,7 @@ def nms(boxes, threshold, method): w = np.maximum(0.0, xx2-xx1+1) h = np.maximum(0.0, yy2-yy1+1) inter = w * h - if method is 'Min': + if method == 'Min': o = inter / np.minimum(area[i], area[idx]) else: o = inter / (area[i] + area[idx] - inter) diff --git a/src/calculate_filtering_metrics.py b/src/calculate_filtering_metrics.py index 9916fa726..f60b9ae4d 100644 --- a/src/calculate_filtering_metrics.py +++ b/src/calculate_filtering_metrics.py @@ -54,15 +54,15 @@ def main(args): model_exp = os.path.expanduser(args.model_file) with gfile.FastGFile(model_exp,'rb') as f: - graph_def = tf.compat.v1.GraphDef() + graph_def = tf.GraphDef() graph_def.ParseFromString(f.read()) input_map={'input':image_batch, 'phase_train':False} tf.import_graph_def(graph_def, input_map=input_map, name='net') - embeddings = tf.compat.v1.get_default_graph().get_tensor_by_name("net/embeddings:0") + embeddings = tf.get_default_graph().get_tensor_by_name("net/embeddings:0") - with tf.compat.v1.Session() as sess: - tf.compat.v1.train.start_queue_runners(sess=sess) + with tf.Session() as sess: + tf.train.start_queue_runners(sess=sess) embedding_size = int(embeddings.get_shape()[1]) nrof_batches = int(math.ceil(nrof_images / args.batch_size)) diff --git a/src/compare.py b/src/compare.py index 6b10f4e5d..bc53cc421 100644 --- a/src/compare.py +++ b/src/compare.py @@ -41,15 +41,15 @@ def main(args): images = load_and_align_data(args.image_files, args.image_size, args.margin, args.gpu_memory_fraction) with tf.Graph().as_default(): - with tf.compat.v1.Session() as sess: + with tf.Session() as sess: # Load the model facenet.load_model(args.model) # Get input and output tensors - 
images_placeholder = tf.compat.v1.get_default_graph().get_tensor_by_name("input:0") - embeddings = tf.compat.v1.get_default_graph().get_tensor_by_name("embeddings:0") - phase_train_placeholder = tf.compat.v1.get_default_graph().get_tensor_by_name("phase_train:0") + images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0") + embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") + phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0") # Run forward pass to calculate embeddings feed_dict = { images_placeholder: images, phase_train_placeholder:False } @@ -84,8 +84,8 @@ def load_and_align_data(image_paths, image_size, margin, gpu_memory_fraction): print('Creating networks and loading parameters') with tf.Graph().as_default(): - gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction) - sess = tf.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) + gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction) + sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) with sess.as_default(): pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None) diff --git a/src/facenet.py b/src/facenet.py index d36ac4b55..4019d8977 100644 --- a/src/facenet.py +++ b/src/facenet.py @@ -40,6 +40,7 @@ from tensorflow.python.platform import gfile import math from six import iteritems +from PIL import Image def triplet_loss(anchor, positive, negative, alpha): """Calculate the triplet loss according to the FaceNet paper @@ -53,11 +54,11 @@ def triplet_loss(anchor, positive, negative, alpha): the triplet loss according to the FaceNet paper as a float tensor. """ with tf.compat.v1.variable_scope('triplet_loss'): - ppos_dist = tf.reduce_sum(input_tensor=tf.square(tf.subtract(anchor, positive)), axis=1) - pos_dist = tf.reduce_sum(input_tensor=tf.square(tf.subtract(anchor, positive)), axis=1) + pos_dist = tf.compat.v1.reduce_sum(tf.square(tf.subtract(anchor, positive)), 1) + neg_dist = tf.compat.v1.reduce_sum(tf.square(tf.subtract(anchor, negative)), 1) basic_loss = tf.add(tf.subtract(pos_dist,neg_dist), alpha) - loss = tf.reduce_mean(input_tensor=tf.maximum(basic_loss, 0.0), axis=0) + loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), 0) return loss @@ -66,14 +67,14 @@ def center_loss(features, label, alfa, nrof_classes): (http://ydwen.github.io/papers/WenECCV16.pdf) """ nrof_features = features.get_shape()[1] - centers = tf.compat.v1.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32, use_resource=False, - initializer=tf.compat.v1.constant_initializer(0), trainable=False) + centers = tf.compat.v1.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32, + initializer=tf.constant_initializer(0), trainable=False) label = tf.reshape(label, [-1]) centers_batch = tf.gather(centers, label) diff = (1 - alfa) * (centers_batch - features) centers = tf.compat.v1.scatter_sub(centers, label, diff) with tf.control_dependencies([centers]): - loss = tf.reduce_mean(input_tensor=tf.square(features - centers_batch)) + loss = tf.reduce_mean(tf.square(features - centers_batch)) return loss, centers def get_image_paths_and_labels(dataset): @@ -108,22 +109,21 @@ def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batc for filename in tf.unstack(filenames): file_contents = tf.io.read_file(filename) image = tf.image.decode_image(file_contents, 3) - image = 
tf.cond(pred=get_control_flag(control[0], RANDOM_ROTATE), - true_fn=lambda:tf.compat.v1.py_func(random_rotate_image, [image], tf.uint8), - false_fn=lambda:tf.identity(image)) - image = tf.cond(pred=get_control_flag(control[0], RANDOM_CROP), - true_fn=lambda:tf.image.random_crop(image, image_size + (3,)), - false_fn=lambda:tf.image.resize_with_crop_or_pad(image, image_size[0], image_size[1])) - image = tf.cond(pred=get_control_flag(control[0], RANDOM_FLIP), - true_fn=lambda:tf.image.random_flip_left_right(image), - false_fn=lambda:tf.identity(image)) - image = tf.cast(image, tf.float32) - image = tf.cond(pred=get_control_flag(control[0], FIXED_STANDARDIZATION), - true_fn=lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, - false_fn=lambda:tf.image.per_image_standardization(image)) - image = tf.cond(pred=get_control_flag(control[0], FLIP), - true_fn=lambda:tf.image.flip_left_right(image), - false_fn=lambda:tf.identity(image)) + image = tf.cond(get_control_flag(control[0], RANDOM_ROTATE), + lambda:tf.py_function(random_rotate_image, [image], tf.uint8), + lambda:tf.identity(image)) + image = tf.cond(get_control_flag(control[0], RANDOM_CROP), + lambda:tf.image.random_crop(image, image_size + (3,)), + lambda:tf.image.resize_with_crop_or_pad(image, image_size[0], image_size[1])) + image = tf.cond(get_control_flag(control[0], RANDOM_FLIP), + lambda:tf.image.random_flip_left_right(image), + lambda:tf.identity(image)) + image = tf.cond(get_control_flag(control[0], FIXED_STANDARDIZATION), + lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, + lambda:tf.cast(tf.image.per_image_standardization(image), tf.float32)) + image = tf.cond(get_control_flag(control[0], FLIP), + lambda:tf.image.flip_left_right(image), + lambda:tf.identity(image)) #pylint: disable=no-member image.set_shape(image_size + (3,)) images.append(image) @@ -138,7 +138,7 @@ def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batc return image_batch, label_batch def get_control_flag(control, field): - return tf.equal(tf.math.mod(tf.math.floordiv(control, field), 2), 1) + return tf.equal(tf.math.floormod(tf.math.floordiv(control, field), 2), 1) def _add_loss_summaries(total_loss): """Add summaries for losses. @@ -161,8 +161,8 @@ def _add_loss_summaries(total_loss): for l in losses + [total_loss]: # Name each loss as '(raw)' and name the moving average version of the loss # as the original loss name. - tf.compat.v1.summary.scalar(l.op.name +' (raw)', l) - tf.compat.v1.summary.scalar(l.op.name, loss_averages.average(l)) + tf.summary.scalar(l.op.name +' (raw)', l) + tf.summary.scalar(l.op.name, loss_averages.average(l)) return loss_averages_op @@ -193,13 +193,13 @@ def train(total_loss, global_step, optimizer, learning_rate, moving_average_deca # Add histograms for trainable variables. if log_histograms: for var in tf.compat.v1.trainable_variables(): - tf.compat.v1.summary.histogram(var.op.name, var) + tf.summary.histogram(var.op.name, var) # Add histograms for gradients. if log_histograms: for grad, var in grads: if grad is not None: - tf.compat.v1.summary.histogram(var.op.name + '/gradients', grad) + tf.summary.histogram(var.op.name + '/gradients', grad) # Track the moving averages of all trainable variables. 
variable_averages = tf.train.ExponentialMovingAverage( @@ -245,7 +245,7 @@ def load_data(image_paths, do_random_crop, do_random_flip, image_size, do_prewhi nrof_samples = len(image_paths) images = np.zeros((nrof_samples, image_size, image_size, 3)) for i in range(nrof_samples): - img = misc.imread(image_paths[i]) + img = np.array(Image.open(image_paths[i])) if img.ndim == 2: img = to_rgb(img) if do_prewhiten: @@ -379,7 +379,7 @@ def load_model(model, input_map=None): print('Metagraph file: %s' % meta_file) print('Checkpoint file: %s' % ckpt_file) - saver = tf.compat.v1.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map) + saver = tf.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map) saver.restore(tf.compat.v1.get_default_session(), os.path.join(model_exp, ckpt_file)) def get_model_filenames(model_dir): diff --git a/src/freeze_graph.py b/src/freeze_graph.py index c59517df1..3584c186e 100644 --- a/src/freeze_graph.py +++ b/src/freeze_graph.py @@ -37,7 +37,7 @@ def main(args): with tf.Graph().as_default(): - with tf.compat.v1.Session() as sess: + with tf.Session() as sess: # Load the model metagraph and checkpoint print('Model directory: %s' % args.model_dir) meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.model_dir)) @@ -46,10 +46,10 @@ def main(args): print('Checkpoint file: %s' % ckpt_file) model_dir_exp = os.path.expanduser(args.model_dir) - saver = tf.compat.v1.train.import_meta_graph(os.path.join(model_dir_exp, meta_file), clear_devices=True) - tf.compat.v1.get_default_session().run(tf.compat.v1.global_variables_initializer()) - tf.compat.v1.get_default_session().run(tf.compat.v1.local_variables_initializer()) - saver.restore(tf.compat.v1.get_default_session(), os.path.join(model_dir_exp, ckpt_file)) + saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file), clear_devices=True) + tf.get_default_session().run(tf.global_variables_initializer()) + tf.get_default_session().run(tf.local_variables_initializer()) + saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file)) # Retrieve the protobuf graph definition and fix the batch norm nodes input_graph_def = sess.graph.as_graph_def() @@ -58,7 +58,7 @@ def main(args): output_graph_def = freeze_graph_def(sess, input_graph_def, 'embeddings,label_batch') # Serialize and dump the output graph to the filesystem - with tf.io.gfile.GFile(args.output_file, 'wb') as f: + with tf.gfile.GFile(args.output_file, 'wb') as f: f.write(output_graph_def.SerializeToString()) print("%d ops in the final graph: %s" % (len(output_graph_def.node), args.output_file)) diff --git a/src/models/inception_resnet_v1.py b/src/models/inception_resnet_v1.py index 475e81bb4..b1df50436 100644 --- a/src/models/inception_resnet_v1.py +++ b/src/models/inception_resnet_v1.py @@ -24,18 +24,18 @@ from __future__ import print_function import tensorflow as tf -import tensorflow.contrib.slim as slim +import tf_slim as slim # Inception-Resnet-A def block35(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): """Builds the 35x35 resnet block.""" - with tf.variable_scope(scope, 'Block35', [net], reuse=reuse): - with tf.variable_scope('Branch_0'): + with tf.compat.v1.variable_scope(scope, 'Block35', [net], reuse=reuse): + with tf.compat.v1.variable_scope('Branch_0'): tower_conv = slim.conv2d(net, 32, 1, scope='Conv2d_1x1') - with tf.variable_scope('Branch_1'): + with tf.compat.v1.variable_scope('Branch_1'): tower_conv1_0 = slim.conv2d(net, 32, 1, 
scope='Conv2d_0a_1x1') tower_conv1_1 = slim.conv2d(tower_conv1_0, 32, 3, scope='Conv2d_0b_3x3') - with tf.variable_scope('Branch_2'): + with tf.compat.v1.variable_scope('Branch_2'): tower_conv2_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1') tower_conv2_1 = slim.conv2d(tower_conv2_0, 32, 3, scope='Conv2d_0b_3x3') tower_conv2_2 = slim.conv2d(tower_conv2_1, 32, 3, scope='Conv2d_0c_3x3') @@ -50,10 +50,10 @@ def block35(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): # Inception-Resnet-B def block17(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): """Builds the 17x17 resnet block.""" - with tf.variable_scope(scope, 'Block17', [net], reuse=reuse): - with tf.variable_scope('Branch_0'): + with tf.compat.v1.variable_scope(scope, 'Block17', [net], reuse=reuse): + with tf.compat.v1.variable_scope('Branch_0'): tower_conv = slim.conv2d(net, 128, 1, scope='Conv2d_1x1') - with tf.variable_scope('Branch_1'): + with tf.compat.v1.variable_scope('Branch_1'): tower_conv1_0 = slim.conv2d(net, 128, 1, scope='Conv2d_0a_1x1') tower_conv1_1 = slim.conv2d(tower_conv1_0, 128, [1, 7], scope='Conv2d_0b_1x7') @@ -71,10 +71,10 @@ def block17(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): # Inception-Resnet-C def block8(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): """Builds the 8x8 resnet block.""" - with tf.variable_scope(scope, 'Block8', [net], reuse=reuse): - with tf.variable_scope('Branch_0'): + with tf.compat.v1.variable_scope(scope, 'Block8', [net], reuse=reuse): + with tf.compat.v1.variable_scope('Branch_0'): tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1') - with tf.variable_scope('Branch_1'): + with tf.compat.v1.variable_scope('Branch_1'): tower_conv1_0 = slim.conv2d(net, 192, 1, scope='Conv2d_0a_1x1') tower_conv1_1 = slim.conv2d(tower_conv1_0, 192, [1, 3], scope='Conv2d_0b_1x3') @@ -89,38 +89,38 @@ def block8(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None): return net def reduction_a(net, k, l, m, n): - with tf.variable_scope('Branch_0'): + with tf.compat.v1.variable_scope('Branch_0'): tower_conv = slim.conv2d(net, n, 3, stride=2, padding='VALID', scope='Conv2d_1a_3x3') - with tf.variable_scope('Branch_1'): + with tf.compat.v1.variable_scope('Branch_1'): tower_conv1_0 = slim.conv2d(net, k, 1, scope='Conv2d_0a_1x1') tower_conv1_1 = slim.conv2d(tower_conv1_0, l, 3, scope='Conv2d_0b_3x3') tower_conv1_2 = slim.conv2d(tower_conv1_1, m, 3, stride=2, padding='VALID', scope='Conv2d_1a_3x3') - with tf.variable_scope('Branch_2'): + with tf.compat.v1.variable_scope('Branch_2'): tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID', scope='MaxPool_1a_3x3') net = tf.concat([tower_conv, tower_conv1_2, tower_pool], 3) return net def reduction_b(net): - with tf.variable_scope('Branch_0'): + with tf.compat.v1.variable_scope('Branch_0'): tower_conv = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1') tower_conv_1 = slim.conv2d(tower_conv, 384, 3, stride=2, padding='VALID', scope='Conv2d_1a_3x3') - with tf.variable_scope('Branch_1'): + with tf.compat.v1.variable_scope('Branch_1'): tower_conv1 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1') tower_conv1_1 = slim.conv2d(tower_conv1, 256, 3, stride=2, padding='VALID', scope='Conv2d_1a_3x3') - with tf.variable_scope('Branch_2'): + with tf.compat.v1.variable_scope('Branch_2'): tower_conv2 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1') tower_conv2_1 = slim.conv2d(tower_conv2, 256, 3, scope='Conv2d_0b_3x3') tower_conv2_2 = slim.conv2d(tower_conv2_1, 256, 3, stride=2, 
padding='VALID', scope='Conv2d_1a_3x3') - with tf.variable_scope('Branch_3'): + with tf.compat.v1.variable_scope('Branch_3'): tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID', scope='MaxPool_1a_3x3') net = tf.concat([tower_conv_1, tower_conv1_1, @@ -137,7 +137,7 @@ def inference(images, keep_probability, phase_train=True, # force in-place updates of mean and variance estimates 'updates_collections': None, # Moving averages ends up in the trainable variables collection - 'variables_collections': [ tf.GraphKeys.TRAINABLE_VARIABLES ], + 'variables_collections': [ tf.compat.v1.GraphKeys.TRAINABLE_VARIABLES ], } with slim.arg_scope([slim.conv2d, slim.fully_connected], @@ -169,7 +169,7 @@ def inception_resnet_v1(inputs, is_training=True, """ end_points = {} - with tf.variable_scope(scope, 'InceptionResnetV1', [inputs], reuse=reuse): + with tf.compat.v1.variable_scope(scope, 'InceptionResnetV1', [inputs], reuse=reuse): with slim.arg_scope([slim.batch_norm, slim.dropout], is_training=is_training): with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], @@ -208,7 +208,7 @@ def inception_resnet_v1(inputs, is_training=True, end_points['Mixed_5a'] = net # Reduction-A - with tf.variable_scope('Mixed_6a'): + with tf.compat.v1.variable_scope('Mixed_6a'): net = reduction_a(net, 192, 192, 256, 384) end_points['Mixed_6a'] = net @@ -217,7 +217,7 @@ def inception_resnet_v1(inputs, is_training=True, end_points['Mixed_6b'] = net # Reduction-B - with tf.variable_scope('Mixed_7a'): + with tf.compat.v1.variable_scope('Mixed_7a'): net = reduction_b(net) end_points['Mixed_7a'] = net @@ -228,7 +228,7 @@ def inception_resnet_v1(inputs, is_training=True, net = block8(net, activation_fn=None) end_points['Mixed_8b'] = net - with tf.variable_scope('Logits'): + with tf.compat.v1.variable_scope('Logits'): end_points['PrePool'] = net #pylint: disable=no-member net = slim.avg_pool2d(net, net.get_shape()[1:3], padding='VALID', diff --git a/src/train_softmax.py b/src/train_softmax.py index 2e4a7d928..bf712d6f2 100644 --- a/src/train_softmax.py +++ b/src/train_softmax.py @@ -39,7 +39,7 @@ import lfw import h5py import math -# import tensorflow.contrib.slim as slim +import tf_slim as slim from tensorflow.python.ops import data_flow_ops from tensorflow.python.framework import ops from tensorflow.python.ops import array_ops @@ -143,9 +143,9 @@ def main(args): prelogits, _ = network.inference(image_batch, args.keep_probability, phase_train=phase_train_placeholder, bottleneck_layer_size=args.embedding_size, weight_decay=args.weight_decay) - logits = tf.keras.layers.Dense(prelogits, len(train_set), activation_fn=None, - weights_initializer=tf.initializers.glorot_uniform(), - weights_regularizer=tf.keras.regularizers.l2(0.5 * (args.weight_decay)), + logits = slim.fully_connected(prelogits, len(train_set), activation_fn=None, + weights_initializer=slim.initializers.xavier_initializer(), + weights_regularizer=slim.l2_regularizer(args.weight_decay), scope='Logits', reuse=False) embeddings = tf.nn.l2_normalize(prelogits, 1, 1e-10, name='embeddings') @@ -188,7 +188,7 @@ def main(args): # Start running operations on the Graph. 
gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction) - sess = tf.compat.v1.Session(config=tf.compat.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) + sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options, log_device_placement=False)) sess.run(tf.compat.v1.global_variables_initializer()) sess.run(tf.compat.v1.local_variables_initializer()) summary_writer = tf.compat.v1.summary.FileWriter(log_dir, sess.graph) @@ -257,7 +257,7 @@ def main(args): print('Saving statistics') with h5py.File(stat_file_name, 'w') as f: - for key, value in stat.iteritems(): + for key, value in stat.items(): f.create_dataset(key, data=value) return model_dir diff --git a/src/validate_on_lfw.py b/src/validate_on_lfw.py index f3eea4427..ac456c5f6 100644 --- a/src/validate_on_lfw.py +++ b/src/validate_on_lfw.py @@ -45,7 +45,7 @@ def main(args): with tf.Graph().as_default(): - with tf.compat.v1.Session() as sess: + with tf.Session() as sess: # Read the file containing the pairs used for testing pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs)) @@ -53,11 +53,11 @@ def main(args): # Get the paths for the corresponding images paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs) - image_paths_placeholder = tf.compat.v1.placeholder(tf.string, shape=(None,1), name='image_paths') - labels_placeholder = tf.compat.v1.placeholder(tf.int32, shape=(None,1), name='labels') - batch_size_placeholder = tf.compat.v1.placeholder(tf.int32, name='batch_size') - control_placeholder = tf.compat.v1.placeholder(tf.int32, shape=(None,1), name='control') - phase_train_placeholder = tf.compat.v1.placeholder(tf.bool, name='phase_train') + image_paths_placeholder = tf.placeholder(tf.string, shape=(None,1), name='image_paths') + labels_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='labels') + batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size') + control_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='control') + phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train') nrof_preprocess_threads = 4 image_size = (args.image_size, args.image_size) @@ -73,10 +73,10 @@ def main(args): facenet.load_model(args.model, input_map=input_map) # Get output tensor - embeddings = tf.compat.v1.get_default_graph().get_tensor_by_name("embeddings:0") + embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0") # coord = tf.train.Coordinator() - tf.compat.v1.train.start_queue_runners(coord=coord, sess=sess) + tf.train.start_queue_runners(coord=coord, sess=sess) evaluate(sess, eval_enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, embeddings, label_batch, paths, actual_issame, args.lfw_batch_size, args.lfw_nrof_folds, args.distance_metric, args.subtract_mean, diff --git a/src_v2/facenet.py b/src_v2/facenet.py deleted file mode 100644 index 5407ce8f9..000000000 --- a/src_v2/facenet.py +++ /dev/null @@ -1,463 +0,0 @@ -"""Functions for building the face recognition network. 
-""" -# MIT License -# -# Copyright (c) 2016 David Sandberg -# -# Permission is hereby granted, free of charge, to any person obtaining a copy -# of this software and associated documentation files (the "Software"), to deal -# in the Software without restriction, including without limitation the rights -# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -# copies of the Software, and to permit persons to whom the Software is -# furnished to do so, subject to the following conditions: -# -# The above copyright notice and this permission notice shall be included in all -# copies or substantial portions of the Software. -# -# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -# SOFTWARE. - -# pylint: disable=missing-docstring -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os -from subprocess import Popen, PIPE -import tensorflow as tf -import numpy as np -from scipy import misc -from sklearn.model_selection import KFold -from scipy import interpolate -from tensorflow.python.training import training -import random -import re -from tensorflow.python.platform import gfile -import math -from six import iteritems - -def get_image_paths_and_labels(dataset): - image_paths_flat = [] - labels_flat = [] - for i in range(len(dataset)): - image_paths_flat += dataset[i].image_paths - labels_flat += [i] * len(dataset[i].image_paths) - return image_paths_flat, labels_flat - -def shuffle_examples(image_paths, labels): - shuffle_list = list(zip(image_paths, labels)) - random.shuffle(shuffle_list) - image_paths_shuff, labels_shuff = zip(*shuffle_list) - return image_paths_shuff, labels_shuff - -def random_rotate_image(image): - angle = np.random.uniform(low=-10.0, high=10.0) - return misc.imrotate(image, angle, 'bicubic') - -# 1: Random rotate 2: Random crop 4: Random flip 8: Fixed image standardization 16: Flip -RANDOM_ROTATE = 1 -RANDOM_CROP = 2 -RANDOM_FLIP = 4 -FIXED_STANDARDIZATION = 8 -FLIP = 16 -def create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder): - images_and_labels_list = [] - for _ in range(nrof_preprocess_threads): - filenames, label, control = input_queue.dequeue() - images = [] - for filename in tf.unstack(filenames): - file_contents = tf.io.read_file(filename) - image = tf.image.decode_image(file_contents, 3) - image = tf.cond(pred=get_control_flag(control[0], RANDOM_ROTATE), - true_fn=lambda:tf.compat.v1.py_func(random_rotate_image, [image], tf.uint8), - false_fn=lambda:tf.identity(image)) - image = tf.cond(pred=get_control_flag(control[0], RANDOM_CROP), - true_fn=lambda:tf.image.random_crop(image, image_size + (3,)), - false_fn=lambda:tf.image.resize_with_crop_or_pad(image, image_size[0], image_size[1])) - image = tf.cond(pred=get_control_flag(control[0], RANDOM_FLIP), - true_fn=lambda:tf.image.random_flip_left_right(image), - false_fn=lambda:tf.identity(image)) - image = tf.cast(image, tf.float32) - image = tf.cond(pred=get_control_flag(control[0], FIXED_STANDARDIZATION), - true_fn=lambda:(tf.cast(image, tf.float32) - 127.5)/128.0, - 
false_fn=lambda:tf.image.per_image_standardization(image)) - image = tf.cond(pred=get_control_flag(control[0], FLIP), - true_fn=lambda:tf.image.flip_left_right(image), - false_fn=lambda:tf.identity(image)) - #pylint: disable=no-member - image.set_shape(image_size + (3,)) - images.append(image) - images_and_labels_list.append([images, label]) - - image_batch, label_batch = tf.compat.v1.train.batch_join( - images_and_labels_list, batch_size=batch_size_placeholder, - shapes=[image_size + (3,), ()], enqueue_many=True, - capacity=4 * nrof_preprocess_threads * 100, - allow_smaller_final_batch=True) - - return image_batch, label_batch - -def get_control_flag(control, field): - return tf.equal(tf.math.mod(tf.math.floordiv(control, field), 2), 1) - -def prewhiten(x): - mean = np.mean(x) - std = np.std(x) - std_adj = np.maximum(std, 1.0/np.sqrt(x.size)) - y = np.multiply(np.subtract(x, mean), 1/std_adj) - return y - -def crop(image, random_crop, image_size): - if image.shape[1]>image_size: - sz1 = int(image.shape[1]//2) - sz2 = int(image_size//2) - if random_crop: - diff = sz1-sz2 - (h, v) = (np.random.randint(-diff, diff+1), np.random.randint(-diff, diff+1)) - else: - (h, v) = (0,0) - image = image[(sz1-sz2+v):(sz1+sz2+v),(sz1-sz2+h):(sz1+sz2+h),:] - return image - -def flip(image, random_flip): - if random_flip and np.random.choice([True, False]): - image = np.fliplr(image) - return image - -def to_rgb(img): - w, h = img.shape - ret = np.empty((w, h, 3), dtype=np.uint8) - ret[:, :, 0] = ret[:, :, 1] = ret[:, :, 2] = img - return ret - -def load_data(image_paths, do_random_crop, do_random_flip, image_size, do_prewhiten=True): - nrof_samples = len(image_paths) - images = np.zeros((nrof_samples, image_size, image_size, 3)) - for i in range(nrof_samples): - img = misc.imread(image_paths[i]) - if img.ndim == 2: - img = to_rgb(img) - if do_prewhiten: - img = prewhiten(img) - img = crop(img, do_random_crop, image_size) - img = flip(img, do_random_flip) - images[i,:,:,:] = img - return images - -def get_label_batch(label_data, batch_size, batch_index): - nrof_examples = np.size(label_data, 0) - j = batch_index*batch_size % nrof_examples - if j+batch_size<=nrof_examples: - batch = label_data[j:j+batch_size] - else: - x1 = label_data[j:nrof_examples] - x2 = label_data[0:nrof_examples-j] - batch = np.vstack([x1,x2]) - batch_int = batch.astype(np.int64) - return batch_int - -def get_batch(image_data, batch_size, batch_index): - nrof_examples = np.size(image_data, 0) - j = batch_index*batch_size % nrof_examples - if j+batch_size<=nrof_examples: - batch = image_data[j:j+batch_size,:,:,:] - else: - x1 = image_data[j:nrof_examples,:,:,:] - x2 = image_data[0:nrof_examples-j,:,:,:] - batch = np.vstack([x1,x2]) - batch_float = batch.astype(np.float32) - return batch_float - -def get_triplet_batch(triplets, batch_index, batch_size): - ax, px, nx = triplets - a = get_batch(ax, int(batch_size/3), batch_index) - p = get_batch(px, int(batch_size/3), batch_index) - n = get_batch(nx, int(batch_size/3), batch_index) - batch = np.vstack([a, p, n]) - return batch - -def get_learning_rate_from_file(filename, epoch): - with open(filename, 'r') as f: - for line in f.readlines(): - line = line.split('#', 1)[0] - if line: - par = line.strip().split(':') - e = int(par[0]) - if par[1]=='-': - lr = -1 - else: - lr = float(par[1]) - if e <= epoch: - learning_rate = lr - else: - return learning_rate - -class ImageClass(): - "Stores the paths to images for a given class" - def __init__(self, name, image_paths): - self.name = name - 
self.image_paths = image_paths - - def __str__(self): - return self.name + ', ' + str(len(self.image_paths)) + ' images' - - def __len__(self): - return len(self.image_paths) - -def get_dataset(path, has_class_directories=True): - dataset = [] - path_exp = os.path.expanduser(path) - classes = [path for path in os.listdir(path_exp) \ - if os.path.isdir(os.path.join(path_exp, path))] - classes.sort() - nrof_classes = len(classes) - for i in range(nrof_classes): - class_name = classes[i] - facedir = os.path.join(path_exp, class_name) - image_paths = get_image_paths(facedir) - dataset.append(ImageClass(class_name, image_paths)) - - return dataset - -def get_image_paths(facedir): - image_paths = [] - if os.path.isdir(facedir): - images = os.listdir(facedir) - image_paths = [os.path.join(facedir,img) for img in images] - return image_paths - -def split_dataset(dataset, split_ratio, min_nrof_images_per_class, mode): - if mode=='SPLIT_CLASSES': - nrof_classes = len(dataset) - class_indices = np.arange(nrof_classes) - np.random.shuffle(class_indices) - split = int(round(nrof_classes*(1-split_ratio))) - train_set = [dataset[i] for i in class_indices[0:split]] - test_set = [dataset[i] for i in class_indices[split:-1]] - elif mode=='SPLIT_IMAGES': - train_set = [] - test_set = [] - for cls in dataset: - paths = cls.image_paths - np.random.shuffle(paths) - nrof_images_in_class = len(paths) - split = int(math.floor(nrof_images_in_class*(1-split_ratio))) - if split==nrof_images_in_class: - split = nrof_images_in_class-1 - if split>=min_nrof_images_per_class and nrof_images_in_class-split>=1: - train_set.append(ImageClass(cls.name, paths[:split])) - test_set.append(ImageClass(cls.name, paths[split:])) - else: - raise ValueError('Invalid train/test split mode "%s"' % mode) - return train_set, test_set - -def load_model(model, input_map=None): - # Check if the model is a model directory (containing a metagraph and a checkpoint file) - # or if it is a protobuf file with a frozen graph - model_exp = os.path.expanduser(model) - if (os.path.isfile(model_exp)): - print('Model filename: %s' % model_exp) - with gfile.FastGFile(model_exp,'rb') as f: - graph_def = tf.compat.v1.GraphDef() - graph_def.ParseFromString(f.read()) - tf.import_graph_def(graph_def, input_map=input_map, name='') - else: - print('Model directory: %s' % model_exp) - meta_file, ckpt_file = get_model_filenames(model_exp) - - print('Metagraph file: %s' % meta_file) - print('Checkpoint file: %s' % ckpt_file) - - saver = tf.compat.v1.train.import_meta_graph(os.path.join(model_exp, meta_file), input_map=input_map) - saver.restore(tf.compat.v1.get_default_session(), os.path.join(model_exp, ckpt_file)) - -def get_model_filenames(model_dir): - files = os.listdir(model_dir) - meta_files = [s for s in files if s.endswith('.meta')] - if len(meta_files)==0: - raise ValueError('No meta file found in the model directory (%s)' % model_dir) - elif len(meta_files)>1: - raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir) - meta_file = meta_files[0] - ckpt = tf.train.get_checkpoint_state(model_dir) - if ckpt and ckpt.model_checkpoint_path: - ckpt_file = os.path.basename(ckpt.model_checkpoint_path) - return meta_file, ckpt_file - - meta_files = [s for s in files if '.ckpt' in s] - max_step = -1 - for f in files: - step_str = re.match(r'(^model-[\w\- ]+.ckpt-(\d+))', f) - if step_str is not None and len(step_str.groups())>=2: - step = int(step_str.groups()[1]) - if step > max_step: - max_step = step - ckpt_file = 
step_str.groups()[0] - return meta_file, ckpt_file - -def distance(embeddings1, embeddings2, distance_metric=0): - if distance_metric==0: - # Euclidian distance - diff = np.subtract(embeddings1, embeddings2) - dist = np.sum(np.square(diff),1) - elif distance_metric==1: - # Distance based on cosine similarity - dot = np.sum(np.multiply(embeddings1, embeddings2), axis=1) - norm = np.linalg.norm(embeddings1, axis=1) * np.linalg.norm(embeddings2, axis=1) - similarity = dot / norm - dist = np.arccos(similarity) / math.pi - else: - raise 'Undefined distance metric %d' % distance_metric - - return dist - -def calculate_roc(thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10, distance_metric=0, subtract_mean=False): - assert(embeddings1.shape[0] == embeddings2.shape[0]) - assert(embeddings1.shape[1] == embeddings2.shape[1]) - nrof_pairs = min(len(actual_issame), embeddings1.shape[0]) - nrof_thresholds = len(thresholds) - k_fold = KFold(n_splits=nrof_folds, shuffle=False) - - tprs = np.zeros((nrof_folds,nrof_thresholds)) - fprs = np.zeros((nrof_folds,nrof_thresholds)) - accuracy = np.zeros((nrof_folds)) - - indices = np.arange(nrof_pairs) - - for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)): - if subtract_mean: - mean = np.mean(np.concatenate([embeddings1[train_set], embeddings2[train_set]]), axis=0) - else: - mean = 0.0 - dist = distance(embeddings1-mean, embeddings2-mean, distance_metric) - - # Find the best threshold for the fold - acc_train = np.zeros((nrof_thresholds)) - for threshold_idx, threshold in enumerate(thresholds): - _, _, acc_train[threshold_idx] = calculate_accuracy(threshold, dist[train_set], actual_issame[train_set]) - best_threshold_index = np.argmax(acc_train) - for threshold_idx, threshold in enumerate(thresholds): - tprs[fold_idx,threshold_idx], fprs[fold_idx,threshold_idx], _ = calculate_accuracy(threshold, dist[test_set], actual_issame[test_set]) - _, _, accuracy[fold_idx] = calculate_accuracy(thresholds[best_threshold_index], dist[test_set], actual_issame[test_set]) - - tpr = np.mean(tprs,0) - fpr = np.mean(fprs,0) - return tpr, fpr, accuracy - -def calculate_accuracy(threshold, dist, actual_issame): - predict_issame = np.less(dist, threshold) - tp = np.sum(np.logical_and(predict_issame, actual_issame)) - fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) - tn = np.sum(np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame))) - fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame)) - - tpr = 0 if (tp+fn==0) else float(tp) / float(tp+fn) - fpr = 0 if (fp+tn==0) else float(fp) / float(fp+tn) - acc = float(tp+tn)/dist.size - return tpr, fpr, acc - -def calculate_val(thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10, distance_metric=0, subtract_mean=False): - assert(embeddings1.shape[0] == embeddings2.shape[0]) - assert(embeddings1.shape[1] == embeddings2.shape[1]) - nrof_pairs = min(len(actual_issame), embeddings1.shape[0]) - nrof_thresholds = len(thresholds) - k_fold = KFold(n_splits=nrof_folds, shuffle=False) - - val = np.zeros(nrof_folds) - far = np.zeros(nrof_folds) - - indices = np.arange(nrof_pairs) - - for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)): - if subtract_mean: - mean = np.mean(np.concatenate([embeddings1[train_set], embeddings2[train_set]]), axis=0) - else: - mean = 0.0 - dist = distance(embeddings1-mean, embeddings2-mean, distance_metric) - - # Find the threshold that gives FAR = far_target - far_train = 
np.zeros(nrof_thresholds) - for threshold_idx, threshold in enumerate(thresholds): - _, far_train[threshold_idx] = calculate_val_far(threshold, dist[train_set], actual_issame[train_set]) - if np.max(far_train)>=far_target: - f = interpolate.interp1d(far_train, thresholds, kind='slinear') - threshold = f(far_target) - else: - threshold = 0.0 - - val[fold_idx], far[fold_idx] = calculate_val_far(threshold, dist[test_set], actual_issame[test_set]) - - val_mean = np.mean(val) - far_mean = np.mean(far) - val_std = np.std(val) - return val_mean, val_std, far_mean - -def calculate_val_far(threshold, dist, actual_issame): - predict_issame = np.less(dist, threshold) - true_accept = np.sum(np.logical_and(predict_issame, actual_issame)) - false_accept = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) - n_same = np.sum(actual_issame) - n_diff = np.sum(np.logical_not(actual_issame)) - val = float(true_accept) / float(n_same) - far = float(false_accept) / float(n_diff) - return val, far - -def store_revision_info(src_path, output_dir, arg_string): - try: - # Get git hash - cmd = ['git', 'rev-parse', 'HEAD'] - gitproc = Popen(cmd, stdout = PIPE, cwd=src_path) - (stdout, _) = gitproc.communicate() - git_hash = stdout.strip() - except OSError as e: - git_hash = ' '.join(cmd) + ': ' + e.strerror - - try: - # Get local changes - cmd = ['git', 'diff', 'HEAD'] - gitproc = Popen(cmd, stdout = PIPE, cwd=src_path) - (stdout, _) = gitproc.communicate() - git_diff = stdout.strip() - except OSError as e: - git_diff = ' '.join(cmd) + ': ' + e.strerror - - # Store a text file in the log directory - rev_info_filename = os.path.join(output_dir, 'revision_info.txt') - with open(rev_info_filename, "w") as text_file: - text_file.write('arguments: %s\n--------------------\n' % arg_string) - text_file.write('tensorflow version: %s\n--------------------\n' % tf.__version__) # @UndefinedVariable - text_file.write('git hash: %s\n--------------------\n' % git_hash) - text_file.write('%s' % git_diff) - -def list_variables(filename): - reader = training.NewCheckpointReader(filename) - variable_map = reader.get_variable_to_shape_map() - names = sorted(variable_map.keys()) - return names - -def put_images_on_grid(images, shape=(16,8)): - nrof_images = images.shape[0] - img_size = images.shape[1] - bw = 3 - img = np.zeros((shape[1]*(img_size+bw)+bw, shape[0]*(img_size+bw)+bw, 3), np.float32) - for i in range(shape[1]): - x_start = i*(img_size+bw)+bw - for j in range(shape[0]): - img_index = i*shape[0]+j - if img_index>=nrof_images: - break - y_start = j*(img_size+bw)+bw - img[x_start:x_start+img_size, y_start:y_start+img_size, :] = images[img_index, :, :, :] - if img_index>=nrof_images: - break - return img - -def write_arguments_to_file(args, filename): - with open(filename, 'w') as f: - for key, value in iteritems(vars(args)): - f.write('%s: %s\n' % (key, str(value))) \ No newline at end of file diff --git a/src_v2/models/inception_resnet_v1.py b/src_v2/models/inception_resnet_v1.py deleted file mode 100644 index 26fe33315..000000000 --- a/src_v2/models/inception_resnet_v1.py +++ /dev/null @@ -1,63 +0,0 @@ -# Copyright 2016 The TensorFlow Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================== - -"""Contains the definition of the Inception Resnet V1 architecture. -As described in http://arxiv.org/abs/1602.07261. - Inception-v4, Inception-ResNet and the Impact of Residual Connections - on Learning - Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi -""" -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import tensorflow as tf - -class Block35(tf.keras.Model): - def __init__(self): - super(Block35, self).__init__(name='') - # Branch_0 - self.conv1 = tf.keras.layers.Conv2D(32, (1, 1),padding='same') - # Branch_1 - self.conv2a = tf.keras.layers.Conv2D(32, (1, 1),padding='same') - self.conv2b = tf.keras.layers.Conv2D(32, (3, 3),padding='same') - # Branch_2 - self.conv3a = tf.keras.layers.Conv2D(32, (1, 1),padding='same') - self.conv3b = tf.keras.layers.Conv2D(32, (3, 3),padding='same') - self.conv3c = tf.keras.layers.Conv2D(32, (3, 3),padding='same') - # Up - self.convup = tf.keras.layers.Conv2D(32, (1, 1),padding='same') - - - def call(self, input_tensor, scale = 1.0, activation_fn=tf.nn.relu,): - # Branch_0 - x = self.conv1(input_tensor) - # Branch_1 - y_1 = self.conv2a(input_tensor) - y_2 = self.conv2b(y_1) - # Branch_2 - z_1 = self.conv3a(input_tensor) - z_2 = self.conv3b(z_1) - z_3 = self.conv3c(z_2) - - mixed = tf.concat([x, y_2, z_3], 3) - up = tf.keras.layers.Conv2D(input_tensor.get_shape()[3], (1,1))(mixed) - - input_tensor += scale * up - if activation_fn: - input_tensor = activation_fn(input_tensor) - - - return input_tensor \ No newline at end of file diff --git a/src_v2/train_softmax.py b/src_v2/train_softmax.py deleted file mode 100644 index 49b3e6806..000000000 --- a/src_v2/train_softmax.py +++ /dev/null @@ -1,404 +0,0 @@ -"""Training a face recognizer with TensorFlow using softmax cross entropy loss -""" -# MIT License -# -# Copyright (c) 2016 David Sandberg -# -# Permission is hereby granted, free of charge, to any person obtaining a copy -# of this software and associated documentation files (the "Software"), to deal -# in the Software without restriction, including without limitation the rights -# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -# copies of the Software, and to permit persons to whom the Software is -# furnished to do so, subject to the following conditions: -# -# The above copyright notice and this permission notice shall be included in all -# copies or substantial portions of the Software. -# -# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -# SOFTWARE. 
- -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -from datetime import datetime -import os.path -import time -import sys -import random -import tensorflow as tf -import numpy as np -import importlib -import argparse -import facenet -import lfw -import h5py -import math -from tensorflow.python.ops import data_flow_ops -from tensorflow.python.framework import ops -from tensorflow.python.ops import array_ops - -def main(args): - network = importlib.import_module(args.model_def) - image_size = (args.image_size, args.image_size) - - subdir = datetime.strftime(datetime.now(), '%Y%m%d-%H%M%S') - log_dir = os.path.join(os.path.expanduser(args.logs_base_dir), subdir) - if not os.path.isdir(log_dir): # Create the log directory if it doesn't exist - os.makedirs(log_dir) - model_dir = os.path.join(os.path.expanduser(args.models_base_dir), subdir) - if not os.path.isdir(model_dir): # Create the model directory if it doesn't exist - os.makedirs(model_dir) - - stat_file_name = os.path.join(log_dir, 'stat.h5') - - # Write arguments to a text file - facenet.write_arguments_to_file(args, os.path.join(log_dir, 'arguments.txt')) - - # Store some git revision info in a text file in the log directory - src_path,_ = os.path.split(os.path.realpath(__file__)) - facenet.store_revision_info(src_path, log_dir, ' '.join(sys.argv)) - - np.random.seed(seed=args.seed) - random.seed(args.seed) - dataset = facenet.get_dataset(args.data_dir) - if args.filter_filename: - dataset = filter_dataset(dataset, os.path.expanduser(args.filter_filename), - args.filter_percentile, args.filter_min_nrof_images_per_class) - - if args.validation_set_split_ratio>0.0: - train_set, val_set = facenet.split_dataset(dataset, args.validation_set_split_ratio, args.min_nrof_val_images_per_class, 'SPLIT_IMAGES') - else: - train_set, val_set = dataset, [] - - nrof_classes = len(train_set) - - print('Model directory: %s' % model_dir) - print('Log directory: %s' % log_dir) - - pretrained_model = None - if args.pretrained_model: - pretrained_model = os.path.expanduser(args.pretrained_model) - print('Pre-trained model: %s' % pretrained_model) - - -def find_threshold(var, percentile): - hist, bin_edges = np.histogram(var, 100) - cdf = np.float32(np.cumsum(hist)) / np.sum(hist) - bin_centers = (bin_edges[:-1]+bin_edges[1:])/2 - #plt.plot(bin_centers, cdf) - threshold = np.interp(percentile*0.01, cdf, bin_centers) - return threshold - -def filter_dataset(dataset, data_filename, percentile, min_nrof_images_per_class): - with h5py.File(data_filename,'r') as f: - distance_to_center = np.array(f.get('distance_to_center')) - label_list = np.array(f.get('label_list')) - image_list = np.array(f.get('image_list')) - distance_to_center_threshold = find_threshold(distance_to_center, percentile) - indices = np.where(distance_to_center>=distance_to_center_threshold)[0] - filtered_dataset = dataset - removelist = [] - for i in indices: - label = label_list[i] - image = image_list[i] - if image in filtered_dataset[label].image_paths: - filtered_dataset[label].image_paths.remove(image) - if len(filtered_dataset[label].image_paths)0.0: - lr = args.learning_rate - else: - lr = facenet.get_learning_rate_from_file(learning_rate_schedule_file, epoch) - - if lr<=0: - return False - - index_epoch = sess.run(index_dequeue_op) - label_epoch = np.array(label_list)[index_epoch] - image_epoch = np.array(image_list)[index_epoch] - - # Enqueue one epoch of image paths and labels - labels_array = 
np.expand_dims(np.array(label_epoch),1) - image_paths_array = np.expand_dims(np.array(image_epoch),1) - control_value = facenet.RANDOM_ROTATE * random_rotate + facenet.RANDOM_CROP * random_crop + facenet.RANDOM_FLIP * random_flip + facenet.FIXED_STANDARDIZATION * use_fixed_image_standardization - control_array = np.ones_like(labels_array) * control_value - sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array}) - - # Training loop - train_time = 0 - while batch_number < args.epoch_size: - start_time = time.time() - feed_dict = {learning_rate_placeholder: lr, phase_train_placeholder:True, batch_size_placeholder:args.batch_size} - tensor_list = [loss, train_op, step, reg_losses, prelogits, cross_entropy_mean, learning_rate, prelogits_norm, accuracy, prelogits_center_loss] - if batch_number % 100 == 0: - loss_, _, step_, reg_losses_, prelogits_, cross_entropy_mean_, lr_, prelogits_norm_, accuracy_, center_loss_, summary_str = sess.run(tensor_list + [summary_op], feed_dict=feed_dict) - summary_writer.add_summary(summary_str, global_step=step_) - else: - loss_, _, step_, reg_losses_, prelogits_, cross_entropy_mean_, lr_, prelogits_norm_, accuracy_, center_loss_ = sess.run(tensor_list, feed_dict=feed_dict) - - duration = time.time() - start_time - stat['loss'][step_-1] = loss_ - stat['center_loss'][step_-1] = center_loss_ - stat['reg_loss'][step_-1] = np.sum(reg_losses_) - stat['xent_loss'][step_-1] = cross_entropy_mean_ - stat['prelogits_norm'][step_-1] = prelogits_norm_ - stat['learning_rate'][epoch-1] = lr_ - stat['accuracy'][step_-1] = accuracy_ - stat['prelogits_hist'][epoch-1,:] += np.histogram(np.minimum(np.abs(prelogits_), prelogits_hist_max), bins=1000, range=(0.0, prelogits_hist_max))[0] - - duration = time.time() - start_time - print('Epoch: [%d][%d/%d]\tTime %.3f\tLoss %2.3f\tXent %2.3f\tRegLoss %2.3f\tAccuracy %2.3f\tLr %2.5f\tCl %2.3f' % - (epoch, batch_number+1, args.epoch_size, duration, loss_, cross_entropy_mean_, np.sum(reg_losses_), accuracy_, lr_, center_loss_)) - batch_number += 1 - train_time += duration - # Add validation loss and accuracy to summary - summary = tf.summary() - #pylint: disable=maybe-no-member - summary.value.add(tag='time/total', simple_value=train_time) - summary_writer.add_summary(summary, global_step=step_) - return True - -def validate(args, sess, epoch, image_list, label_list, enqueue_op, image_paths_placeholder, labels_placeholder, control_placeholder, - phase_train_placeholder, batch_size_placeholder, - stat, loss, regularization_losses, cross_entropy_mean, accuracy, validate_every_n_epochs, use_fixed_image_standardization): - - print('Running forward pass on validation set') - - nrof_batches = len(label_list) // args.lfw_batch_size - nrof_images = nrof_batches * args.lfw_batch_size - - # Enqueue one epoch of image paths and labels - labels_array = np.expand_dims(np.array(label_list[:nrof_images]),1) - image_paths_array = np.expand_dims(np.array(image_list[:nrof_images]),1) - control_array = np.ones_like(labels_array, np.int32)*facenet.FIXED_STANDARDIZATION * use_fixed_image_standardization - sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array}) - - loss_array = np.zeros((nrof_batches,), np.float32) - xent_array = np.zeros((nrof_batches,), np.float32) - accuracy_array = np.zeros((nrof_batches,), np.float32) - - # Training loop - start_time = time.time() - for i in range(nrof_batches): - 
feed_dict = {phase_train_placeholder:False, batch_size_placeholder:args.lfw_batch_size} - loss_, cross_entropy_mean_, accuracy_ = sess.run([loss, cross_entropy_mean, accuracy], feed_dict=feed_dict) - loss_array[i], xent_array[i], accuracy_array[i] = (loss_, cross_entropy_mean_, accuracy_) - if i % 10 == 9: - print('.', end='') - sys.stdout.flush() - print('') - - duration = time.time() - start_time - - val_index = (epoch-1)//validate_every_n_epochs - stat['val_loss'][val_index] = np.mean(loss_array) - stat['val_xent_loss'][val_index] = np.mean(xent_array) - stat['val_accuracy'][val_index] = np.mean(accuracy_array) - - print('Validation Epoch: %d\tTime %.3f\tLoss %2.3f\tXent %2.3f\tAccuracy %2.3f' % - (epoch, duration, np.mean(loss_array), np.mean(xent_array), np.mean(accuracy_array))) - -def evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, - embeddings, labels, image_paths, actual_issame, batch_size, nrof_folds, log_dir, step, summary_writer, stat, epoch, distance_metric, subtract_mean, use_flipped_images, use_fixed_image_standardization): - start_time = time.time() - # Run forward pass to calculate embeddings - print('Runnning forward pass on LFW images') - - # Enqueue one epoch of image paths and labels - nrof_embeddings = len(actual_issame)*2 # nrof_pairs * nrof_images_per_pair - nrof_flips = 2 if use_flipped_images else 1 - nrof_images = nrof_embeddings * nrof_flips - labels_array = np.expand_dims(np.arange(0,nrof_images),1) - image_paths_array = np.expand_dims(np.repeat(np.array(image_paths),nrof_flips),1) - control_array = np.zeros_like(labels_array, np.int32) - if use_fixed_image_standardization: - control_array += np.ones_like(labels_array)*facenet.FIXED_STANDARDIZATION - if use_flipped_images: - # Flip every second image - control_array += (labels_array % 2)*facenet.FLIP - sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array}) - - embedding_size = int(embeddings.get_shape()[1]) - assert nrof_images % batch_size == 0, 'The number of LFW images must be an integer multiple of the LFW batch size' - nrof_batches = nrof_images // batch_size - emb_array = np.zeros((nrof_images, embedding_size)) - lab_array = np.zeros((nrof_images,)) - for i in range(nrof_batches): - feed_dict = {phase_train_placeholder:False, batch_size_placeholder:batch_size} - emb, lab = sess.run([embeddings, labels], feed_dict=feed_dict) - lab_array[lab] = lab - emb_array[lab, :] = emb - if i % 10 == 9: - print('.', end='') - sys.stdout.flush() - print('') - embeddings = np.zeros((nrof_embeddings, embedding_size*nrof_flips)) - if use_flipped_images: - # Concatenate embeddings for flipped and non flipped version of the images - embeddings[:,:embedding_size] = emb_array[0::2,:] - embeddings[:,embedding_size:] = emb_array[1::2,:] - else: - embeddings = emb_array - - assert np.array_equal(lab_array, np.arange(nrof_images))==True, 'Wrong labels used for evaluation, possibly caused by training examples left in the input pipeline' - _, _, accuracy, val, val_std, far = lfw.evaluate(embeddings, actual_issame, nrof_folds=nrof_folds, distance_metric=distance_metric, subtract_mean=subtract_mean) - - print('Accuracy: %2.5f+-%2.5f' % (np.mean(accuracy), np.std(accuracy))) - print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far)) - lfw_time = time.time() - start_time - # Add validation loss and accuracy to summary - summary = 
-    #pylint: disable=maybe-no-member
-    summary.value.add(tag='lfw/accuracy', simple_value=np.mean(accuracy))
-    summary.value.add(tag='lfw/val_rate', simple_value=val)
-    summary.value.add(tag='time/lfw', simple_value=lfw_time)
-    summary_writer.add_summary(summary, step)
-    with open(os.path.join(log_dir,'lfw_result.txt'),'at') as f:
-        f.write('%d\t%.5f\t%.5f\n' % (step, np.mean(accuracy), val))
-    stat['lfw_accuracy'][epoch-1] = np.mean(accuracy)
-    stat['lfw_valrate'][epoch-1] = val
-
-def save_variables_and_metagraph(sess, saver, summary_writer, model_dir, model_name, step):
-    # Save the model checkpoint
-    print('Saving variables')
-    start_time = time.time()
-    checkpoint_path = os.path.join(model_dir, 'model-%s.ckpt' % model_name)
-    saver.save(sess, checkpoint_path, global_step=step, write_meta_graph=False)
-    save_time_variables = time.time() - start_time
-    print('Variables saved in %.2f seconds' % save_time_variables)
-    metagraph_filename = os.path.join(model_dir, 'model-%s.meta' % model_name)
-    save_time_metagraph = 0
-    if not os.path.exists(metagraph_filename):
-        print('Saving metagraph')
-        start_time = time.time()
-        saver.export_meta_graph(metagraph_filename)
-        save_time_metagraph = time.time() - start_time
-        print('Metagraph saved in %.2f seconds' % save_time_metagraph)
-    summary = tf.compat.v1.Summary()
-    #pylint: disable=maybe-no-member
-    summary.value.add(tag='time/save_variables', simple_value=save_time_variables)
-    summary.value.add(tag='time/save_metagraph', simple_value=save_time_metagraph)
-    summary_writer.add_summary(summary, step)
-
-
-def parse_arguments(argv):
-    parser = argparse.ArgumentParser()
-
-    parser.add_argument('--logs_base_dir', type=str,
-        help='Directory where to write event logs.', default='~/logs/facenet')
-    parser.add_argument('--models_base_dir', type=str,
-        help='Directory where to write trained models and checkpoints.', default='~/models/facenet')
-    parser.add_argument('--gpu_memory_fraction', type=float,
-        help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0)
-    parser.add_argument('--pretrained_model', type=str,
-        help='Load a pretrained model before training starts.')
-    parser.add_argument('--data_dir', type=str,
-        help='Path to the data directory containing aligned face patches.',
-        default='~/datasets/casia/casia_maxpy_mtcnnalign_182_160')
-    parser.add_argument('--model_def', type=str,
-        help='Model definition. Points to a module containing the definition of the inference graph.', default='models.inception_resnet_v1')
-    parser.add_argument('--max_nrof_epochs', type=int,
-        help='Number of epochs to run.', default=500)
-    parser.add_argument('--batch_size', type=int,
-        help='Number of images to process in a batch.', default=90)
-    parser.add_argument('--image_size', type=int,
-        help='Image size (height, width) in pixels.', default=160)
-    parser.add_argument('--epoch_size', type=int,
-        help='Number of batches per epoch.', default=1000)
-    parser.add_argument('--embedding_size', type=int,
-        help='Dimensionality of the embedding.', default=128)
-    parser.add_argument('--random_crop',
-        help='Performs random cropping of training images. If false, the center image_size pixels from the training images are used. ' +
-        'If the size of the images in the data directory is equal to image_size no cropping is performed', action='store_true')
-    parser.add_argument('--random_flip',
-        help='Performs random horizontal flipping of training images.', action='store_true')
-    parser.add_argument('--random_rotate',
-        help='Performs random rotations of training images.', action='store_true')
-    parser.add_argument('--use_fixed_image_standardization',
-        help='Performs fixed standardization of images.', action='store_true')
-    parser.add_argument('--keep_probability', type=float,
-        help='Keep probability of dropout for the fully connected layer(s).', default=1.0)
-    parser.add_argument('--weight_decay', type=float,
-        help='L2 weight regularization.', default=0.0)
-    parser.add_argument('--center_loss_factor', type=float,
-        help='Center loss factor.', default=0.0)
-    parser.add_argument('--center_loss_alfa', type=float,
-        help='Center update rate for center loss.', default=0.95)
-    parser.add_argument('--prelogits_norm_loss_factor', type=float,
-        help='Loss based on the norm of the activations in the prelogits layer.', default=0.0)
-    parser.add_argument('--prelogits_norm_p', type=float,
-        help='Norm to use for prelogits norm loss.', default=1.0)
-    parser.add_argument('--prelogits_hist_max', type=float,
-        help='The max value for the prelogits histogram.', default=10.0)
-    parser.add_argument('--optimizer', type=str, choices=['ADAGRAD', 'ADADELTA', 'ADAM', 'RMSPROP', 'MOM'],
-        help='The optimization algorithm to use', default='ADAGRAD')
-    parser.add_argument('--learning_rate', type=float,
-        help='Initial learning rate. If set to a negative value a learning rate ' +
-        'schedule can be specified in the file "learning_rate_schedule.txt"', default=0.1)
-    parser.add_argument('--learning_rate_decay_epochs', type=int,
-        help='Number of epochs between learning rate decay.', default=100)
-    parser.add_argument('--learning_rate_decay_factor', type=float,
-        help='Learning rate decay factor.', default=1.0)
-    parser.add_argument('--moving_average_decay', type=float,
-        help='Exponential decay for tracking of training parameters.', default=0.9999)
-    parser.add_argument('--seed', type=int,
-        help='Random seed.', default=666)
-    parser.add_argument('--nrof_preprocess_threads', type=int,
-        help='Number of preprocessing (data loading and augmentation) threads.', default=4)
-    parser.add_argument('--log_histograms',
-        help='Enables logging of weight/bias histograms in tensorboard.', action='store_true')
-    parser.add_argument('--learning_rate_schedule_file', type=str,
-        help='File containing the learning rate schedule that is used when learning_rate is set to -1.', default='data/learning_rate_schedule.txt')
-    parser.add_argument('--filter_filename', type=str,
-        help='File containing image data used for dataset filtering', default='')
-    parser.add_argument('--filter_percentile', type=float,
-        help='Keep only the percentile of images closest to their class center', default=100.0)
-    parser.add_argument('--filter_min_nrof_images_per_class', type=int,
-        help='Keep only the classes with this number of examples or more', default=0)
-    parser.add_argument('--validate_every_n_epochs', type=int,
-        help='Number of epochs between validations', default=5)
-    parser.add_argument('--validation_set_split_ratio', type=float,
-        help='The ratio of the total dataset to use for validation', default=0.0)
-    parser.add_argument('--min_nrof_val_images_per_class', type=float,
-        help='Classes with fewer images will be removed from the validation set', default=0)
-
-    # Parameters for validation on LFW
-    parser.add_argument('--lfw_pairs', type=str,
-        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
-    parser.add_argument('--lfw_dir', type=str,
-        help='Path to the data directory containing aligned face patches.', default='')
-    parser.add_argument('--lfw_batch_size', type=int,
-        help='Number of images to process in a batch in the LFW test set.', default=100)
-    parser.add_argument('--lfw_nrof_folds', type=int,
-        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
-    parser.add_argument('--lfw_distance_metric', type=int,
-        help='Type of distance metric to use. 0: Euclidean distance, 1: Cosine similarity distance.', default=0)
-    parser.add_argument('--lfw_use_flipped_images',
-        help='Concatenates embeddings for the image and its horizontally flipped counterpart.', action='store_true')
-    parser.add_argument('--lfw_subtract_mean',
-        help='Subtract feature mean before calculating distance.', action='store_true')
-    return parser.parse_args(argv)
-
-
-if __name__ == '__main__':
-    main(parse_arguments(sys.argv[1:]))
diff --git a/tmp/align_dataset.m b/tmp/align_dataset.m
new file mode 100644
index 000000000..1e1fce089
--- /dev/null
+++ b/tmp/align_dataset.m
@@ -0,0 +1,178 @@
+# MIT License
+#
+# Copyright (c) 2016 David Sandberg
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+ +% LFW +% source_path = '/home/david/datasets/lfw/raw'; +% target_path = '/home/david/datasets/lfw/lfw_mtcnnalign_160'; +% image_size = 160 + 0; +% margin = round(image_size*0.2) + 0; + +% FaceScrub +% source_path = '/home/david/datasets/facescrub/facescrub/'; +% target_path = '/home/david/datasets/facescrub/facescrub_mtcnnalign_182_160'; +% failed_images_list = '/home/david/datasets/facescrub/facescrub_mtcnnalign_182_160/failed_images.txt'; +% image_size = 160 + 12; +% margin = round(image_size*0.2) + 12; + +source_path = '/home/david/datasets/casia/CASIA-maxpy-clean/'; +target_path = '/home/david/datasets/casia/casia_maxpy_mtcnnalign_182_160'; +failed_images_list = '/home/david/datasets/casia/casia_maxpy_mtcnnalign_182_160/failed_images.txt'; +image_size = 160 + 12; +margin = round(image_size*0.2) + 12; + +image_extension = 'png'; +minsize=20; %minimum size of face +use_new = 0; + +caffe_path='/home/david/repo2/caffe/matlab'; +pdollar_toolbox_path='/home/david/repo2/toolbox'; +if use_new + caffe_model_path='/home/david/repo2/MTCNN_face_detection_alignment/code/codes/MTCNNv2/model'; +else + caffe_model_path='/home/david/repo2/MTCNN_face_detection_alignment/code/codes/MTCNNv1/model'; +end; +addpath(genpath(caffe_path)); +addpath(genpath(pdollar_toolbox_path)); + +caffe.set_mode_gpu(); +caffe.set_device(0); + +%three steps's threshold +threshold=[0.6 0.7 0.7]; + +%scale factor +factor=0.709; + +%load caffe models +if use_new + prototxt_dir = strcat(caffe_model_path,'/det4.prototxt'); + model_dir = strcat(caffe_model_path,'/det4.caffemodel'); +end; +%faces=cell(0); + +k = 0; +classes = dir(source_path); +%classes = classes(randperm(length(classes))); +for i=1:length(classes), + if classes(i).name(1)~='.' + source_class_path = sprintf('%s/%s', source_path, classes(i).name); + target_class_path = sprintf('%s/%s', target_path, classes(i).name); + imgs = dir(source_class_path); + %imgs = imgs(randperm(length(imgs))); + if ~exist(target_class_path, 'dir'), + mkdir(target_class_path); + end; + for j=1:length(imgs), + if imgs(j).isdir==0 + [pathstr,name,ext] = fileparts(imgs(j).name); + target_img_path = sprintf('%s/%s.%s', target_class_path, name, image_extension); + if ~exist(target_img_path,'file') && any([ strcmpi(ext,'.jpg') strcmpi(ext,'.jpeg') strcmpi(ext,'.png') strcmpi(ext,'.gif') ]) + if mod(k,1000)==0 + fprintf('Resetting GPU\n'); + caffe.reset_all(); + caffe.set_mode_gpu(); + caffe.set_device(0); + prototxt_dir = strcat(caffe_model_path,'/det1.prototxt'); + model_dir = strcat(caffe_model_path,'/det1.caffemodel'); + PNet=caffe.Net(prototxt_dir,model_dir,'test'); + prototxt_dir = strcat(caffe_model_path,'/det2.prototxt'); + model_dir = strcat(caffe_model_path,'/det2.caffemodel'); + RNet=caffe.Net(prototxt_dir,model_dir,'test'); + prototxt_dir = strcat(caffe_model_path,'/det3.prototxt'); + model_dir = strcat(caffe_model_path,'/det3.caffemodel'); + ONet=caffe.Net(prototxt_dir,model_dir,'test'); + if use_new + prototxt_dir = strcat(caffe_model_path,'/det4.prototxt'); + model_dir = strcat(caffe_model_path,'/det4.caffemodel'); + LNet=caffe.Net(prototxt_dir,model_dir,'test'); + end; + end; + + source_img_path = sprintf('%s/%s', source_class_path, imgs(j).name); + % source_img_path = '/home/david/datasets/facescrub/facescrub//Billy_Zane/095f83fefdf1dc493c013edb1ef860001193e8d9.jpg' + try + img = imread(source_img_path); + catch exception + fprintf('Unexpected error (%s): %s\n', exception.identifier, exception.message); + continue; + end; + fprintf('%6d: %s\n', k, source_img_path); + if 
length(size(img))<3 + img = repmat(img,[1,1,3]); + end; + img_size = size(img); % [height, width, channels] + img_size = fliplr(img_size(1:2)); % [x,y] + if use_new + [boundingboxes, points]=detect_face_v2(img,minsize,PNet,RNet,ONet,LNet,threshold,false,factor); + else + [boundingboxes, points]=detect_face_v1(img,minsize,PNet,RNet,ONet,threshold,false,factor); + end; + nrof_faces = size(boundingboxes,1); + det = boundingboxes; + if nrof_faces>0 + if nrof_faces>1 + % select the faces with the largest bounding box + % closest to the image center + bounding_box_size = (det(:,3)-det(:,1)).*(det(:,4)-det(:,2)); + img_center = img_size / 2; + offsets = [ (det(:,1)+det(:,3))/2 (det(:,2)+det(:,4))/2 ] - ones(nrof_faces,1)*img_center; + offset_dist_squared = sum(offsets.^2,2); + [a, index] = max(bounding_box_size-offset_dist_squared*2.0); % some extra weight on the centering + det = det(index,:); + points = points(:,index); + end; +% if nrof_faces>0 +% figure(1); clf; +% imshow(img); +% hold on; +% plot(points(1:5,1),points(6:10,1),'g.','MarkerSize',10); +% bb = round(det(1:4)); +% rectangle('Position',[bb(1) bb(2) bb(3)-bb(1) bb(4)-bb(2)],'LineWidth',2,'LineStyle','-') +% xxx = 1; +% end; + det(1) = max(det(1)-margin/2, 1); + det(2) = max(det(2)-margin/2, 1); + det(3) = min(det(3)+margin/2, img_size(1)); + det(4) = min(det(4)+margin/2, img_size(2)); + det(1:4) = round(det(1:4)); + + img = img(det(2):det(4),det(1):det(3),:); + img = imresize(img, [image_size, image_size]); + + imwrite(img, target_img_path); + k = k + 1; + else + fprintf('Detection failed: %s\n', source_img_path); + fid = fopen(failed_images_list,'at'); + if fid>=0 + fprintf(fid, '%s\n', source_img_path); + fclose(fid); + end; + end; + if mod(k,100)==0 + xxx = 1; + end; + end; + end; + end; + end; +end; diff --git a/tmp/detect_face_v1.m b/tmp/detect_face_v1.m new file mode 100644 index 000000000..4aeb66239 --- /dev/null +++ b/tmp/detect_face_v1.m @@ -0,0 +1,253 @@ +% MIT License +% +% Copyright (c) 2016 Kaipeng Zhang +% +% Permission is hereby granted, free of charge, to any person obtaining a copy +% of this software and associated documentation files (the "Software"), to deal +% in the Software without restriction, including without limitation the rights +% to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +% copies of the Software, and to permit persons to whom the Software is +% furnished to do so, subject to the following conditions: +% +% The above copyright notice and this permission notice shall be included in all +% copies or substantial portions of the Software. +% +% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +% AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +% LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +% OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +% SOFTWARE. 
+ +function [total_boxes, points] = detect_face_v1(img,minsize,PNet,RNet,ONet,threshold,fastresize,factor) + %im: input image + %minsize: minimum of faces' size + %pnet, rnet, onet: caffemodel + %threshold: threshold=[th1 th2 th3], th1-3 are three steps's threshold + %fastresize: resize img from last scale (using in high-resolution images) if fastresize==true + factor_count=0; + total_boxes=[]; + points=[]; + h=size(img,1); + w=size(img,2); + minl=min([w h]); + img=single(img); + if fastresize + im_data=(single(img)-127.5)*0.0078125; + end + m=12/minsize; + minl=minl*m; + %creat scale pyramid + scales=[]; + while (minl>=12) + scales=[scales m*factor^(factor_count)]; + minl=minl*factor; + factor_count=factor_count+1; + end + %first stage + for j = 1:size(scales,2) + scale=scales(j); + hs=ceil(h*scale); + ws=ceil(w*scale); + if fastresize + im_data=imResample(im_data,[hs ws],'bilinear'); + else + im_data=(imResample(img,[hs ws],'bilinear')-127.5)*0.0078125; + end + PNet.blobs('data').reshape([hs ws 3 1]); + out=PNet.forward({im_data}); + boxes=generateBoundingBox(out{2}(:,:,2),out{1},scale,threshold(1)); + %inter-scale nms + pick=nms(boxes,0.5,'Union'); + boxes=boxes(pick,:); + if ~isempty(boxes) + total_boxes=[total_boxes;boxes]; + end + end + numbox=size(total_boxes,1); + if ~isempty(total_boxes) + pick=nms(total_boxes,0.7,'Union'); + total_boxes=total_boxes(pick,:); + regw=total_boxes(:,3)-total_boxes(:,1); + regh=total_boxes(:,4)-total_boxes(:,2); + total_boxes=[total_boxes(:,1)+total_boxes(:,6).*regw total_boxes(:,2)+total_boxes(:,7).*regh total_boxes(:,3)+total_boxes(:,8).*regw total_boxes(:,4)+total_boxes(:,9).*regh total_boxes(:,5)]; + total_boxes=rerec(total_boxes); + total_boxes(:,1:4)=fix(total_boxes(:,1:4)); + [dy edy dx edx y ey x ex tmpw tmph]=pad(total_boxes,w,h); + end + numbox=size(total_boxes,1); + if numbox>0 + %second stage + tempimg=zeros(24,24,3,numbox); + for k=1:numbox + tmp=zeros(tmph(k),tmpw(k),3); + tmp(dy(k):edy(k),dx(k):edx(k),:)=img(y(k):ey(k),x(k):ex(k),:); + if size(tmp,1)>0 && size(tmp,2)>0 || size(tmp,1)==0 && size(tmp,2)==0 + tempimg(:,:,:,k)=imResample(tmp,[24 24],'bilinear'); + else + total_boxes = []; + return; + end; + end + tempimg=(tempimg-127.5)*0.0078125; + RNet.blobs('data').reshape([24 24 3 numbox]); + out=RNet.forward({tempimg}); + score=squeeze(out{2}(2,:)); + pass=find(score>threshold(2)); + total_boxes=[total_boxes(pass,1:4) score(pass)']; + mv=out{1}(:,pass); + if size(total_boxes,1)>0 + pick=nms(total_boxes,0.7,'Union'); + total_boxes=total_boxes(pick,:); + total_boxes=bbreg(total_boxes,mv(:,pick)'); + total_boxes=rerec(total_boxes); + end + numbox=size(total_boxes,1); + if numbox>0 + %third stage + total_boxes=fix(total_boxes); + [dy edy dx edx y ey x ex tmpw tmph]=pad(total_boxes,w,h); + tempimg=zeros(48,48,3,numbox); + for k=1:numbox + tmp=zeros(tmph(k),tmpw(k),3); + tmp(dy(k):edy(k),dx(k):edx(k),:)=img(y(k):ey(k),x(k):ex(k),:); + if size(tmp,1)>0 && size(tmp,2)>0 || size(tmp,1)==0 && size(tmp,2)==0 + tempimg(:,:,:,k)=imResample(tmp,[48 48],'bilinear'); + else + total_boxes = []; + return; + end; + end + tempimg=(tempimg-127.5)*0.0078125; + ONet.blobs('data').reshape([48 48 3 numbox]); + out=ONet.forward({tempimg}); + score=squeeze(out{3}(2,:)); + points=out{2}; + pass=find(score>threshold(3)); + points=points(:,pass); + total_boxes=[total_boxes(pass,1:4) score(pass)']; + mv=out{1}(:,pass); + w=total_boxes(:,3)-total_boxes(:,1)+1; + h=total_boxes(:,4)-total_boxes(:,2)+1; + points(1:5,:)=repmat(w',[5 
1]).*points(1:5,:)+repmat(total_boxes(:,1)',[5 1])-1; + points(6:10,:)=repmat(h',[5 1]).*points(6:10,:)+repmat(total_boxes(:,2)',[5 1])-1; + if size(total_boxes,1)>0 + total_boxes=bbreg(total_boxes,mv(:,:)'); + pick=nms(total_boxes,0.7,'Min'); + total_boxes=total_boxes(pick,:); + points=points(:,pick); + end + end + end +end + +function [boundingbox] = bbreg(boundingbox,reg) + %calibrate bouding boxes + if size(reg,2)==1 + reg=reshape(reg,[size(reg,3) size(reg,4)])'; + end + w=[boundingbox(:,3)-boundingbox(:,1)]+1; + h=[boundingbox(:,4)-boundingbox(:,2)]+1; + boundingbox(:,1:4)=[boundingbox(:,1)+reg(:,1).*w boundingbox(:,2)+reg(:,2).*h boundingbox(:,3)+reg(:,3).*w boundingbox(:,4)+reg(:,4).*h]; +end + +function [boundingbox reg] = generateBoundingBox(map,reg,scale,t) + %use heatmap to generate bounding boxes + stride=2; + cellsize=12; + boundingbox=[]; + map=map'; + dx1=reg(:,:,1)'; + dy1=reg(:,:,2)'; + dx2=reg(:,:,3)'; + dy2=reg(:,:,4)'; + [y x]=find(map>=t); + a=find(map>=t); + if size(y,1)==1 + y=y';x=x';score=map(a)';dx1=dx1';dy1=dy1';dx2=dx2';dy2=dy2'; + else + score=map(a); + end + reg=[dx1(a) dy1(a) dx2(a) dy2(a)]; + if isempty(reg) + reg=reshape([],[0 3]); + end + boundingbox=[y x]; + boundingbox=[fix((stride*(boundingbox-1)+1)/scale) fix((stride*(boundingbox-1)+cellsize-1+1)/scale) score reg]; +end + +function pick = nms(boxes,threshold,type) + %NMS + if isempty(boxes) + pick = []; + return; + end + x1 = boxes(:,1); + y1 = boxes(:,2); + x2 = boxes(:,3); + y2 = boxes(:,4); + s = boxes(:,5); + area = (x2-x1+1) .* (y2-y1+1); + [vals, I] = sort(s); + pick = s*0; + counter = 1; + while ~isempty(I) + last = length(I); + i = I(last); + pick(counter) = i; + counter = counter + 1; + xx1 = max(x1(i), x1(I(1:last-1))); + yy1 = max(y1(i), y1(I(1:last-1))); + xx2 = min(x2(i), x2(I(1:last-1))); + yy2 = min(y2(i), y2(I(1:last-1))); + w = max(0.0, xx2-xx1+1); + h = max(0.0, yy2-yy1+1); + inter = w.*h; + if strcmp(type,'Min') + o = inter ./ min(area(i),area(I(1:last-1))); + else + o = inter ./ (area(i) + area(I(1:last-1)) - inter); + end + I = I(find(o<=threshold)); + end + pick = pick(1:(counter-1)); +end + +function [dy edy dx edx y ey x ex tmpw tmph] = pad(total_boxes,w,h) + %compute the padding coordinates (pad the bounding boxes to square) + tmpw=total_boxes(:,3)-total_boxes(:,1)+1; + tmph=total_boxes(:,4)-total_boxes(:,2)+1; + numbox=size(total_boxes,1); + + dx=ones(numbox,1);dy=ones(numbox,1); + edx=tmpw;edy=tmph; + + x=total_boxes(:,1);y=total_boxes(:,2); + ex=total_boxes(:,3);ey=total_boxes(:,4); + + tmp=find(ex>w); + edx(tmp)=-ex(tmp)+w+tmpw(tmp);ex(tmp)=w; + + tmp=find(ey>h); + edy(tmp)=-ey(tmp)+h+tmph(tmp);ey(tmp)=h; + + tmp=find(x<1); + dx(tmp)=2-x(tmp);x(tmp)=1; + + tmp=find(y<1); + dy(tmp)=2-y(tmp);y(tmp)=1; +end + +function [bboxA] = rerec(bboxA) + %convert bboxA to square + bboxB=bboxA(:,1:4); + h=bboxA(:,4)-bboxA(:,2); + w=bboxA(:,3)-bboxA(:,1); + l=max([w h]')'; + bboxA(:,1)=bboxA(:,1)+w.*0.5-l.*0.5; + bboxA(:,2)=bboxA(:,2)+h.*0.5-l.*0.5; + bboxA(:,3:4)=bboxA(:,1:2)+repmat(l,[1 2]); +end + + diff --git a/tmp/detect_face_v2.m b/tmp/detect_face_v2.m new file mode 100644 index 000000000..3ed07b17e --- /dev/null +++ b/tmp/detect_face_v2.m @@ -0,0 +1,288 @@ +% MIT License +% +% Copyright (c) 2016 Kaipeng Zhang +% +% Permission is hereby granted, free of charge, to any person obtaining a copy +% of this software and associated documentation files (the "Software"), to deal +% in the Software without restriction, including without limitation the rights +% to use, copy, modify, merge, 
publish, distribute, sublicense, and/or sell +% copies of the Software, and to permit persons to whom the Software is +% furnished to do so, subject to the following conditions: +% +% The above copyright notice and this permission notice shall be included in all +% copies or substantial portions of the Software. +% +% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +% AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +% LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +% OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +% SOFTWARE. + +function [total_boxes, points] = detect_face_v2(img,minsize,PNet,RNet,ONet,LNet,threshold,fastresize,factor) + %im: input image + %minsize: minimum of faces' size + %pnet, rnet, onet: caffemodel + %threshold: threshold=[th1 th2 th3], th1-3 are three steps's threshold + %fastresize: resize img from last scale (using in high-resolution images) if fastresize==true + factor_count=0; + total_boxes=[]; + points=[]; + h=size(img,1); + w=size(img,2); + minl=min([w h]); + img=single(img); + if fastresize + im_data=(single(img)-127.5)*0.0078125; + end + m=12/minsize; + minl=minl*m; + %creat scale pyramid + scales=[]; + while (minl>=12) + scales=[scales m*factor^(factor_count)]; + minl=minl*factor; + factor_count=factor_count+1; + end + %first stage + for j = 1:size(scales,2) + scale=scales(j); + hs=ceil(h*scale); + ws=ceil(w*scale); + if fastresize + im_data=imResample(im_data,[hs ws],'bilinear'); + else + im_data=(imResample(img,[hs ws],'bilinear')-127.5)*0.0078125; + end + PNet.blobs('data').reshape([hs ws 3 1]); + out=PNet.forward({im_data}); + boxes=generateBoundingBox(out{2}(:,:,2),out{1},scale,threshold(1)); + %inter-scale nms + pick=nms(boxes,0.5,'Union'); + boxes=boxes(pick,:); + if ~isempty(boxes) + total_boxes=[total_boxes;boxes]; + end + end + numbox=size(total_boxes,1); + if ~isempty(total_boxes) + pick=nms(total_boxes,0.7,'Union'); + total_boxes=total_boxes(pick,:); + bbw=total_boxes(:,3)-total_boxes(:,1); + bbh=total_boxes(:,4)-total_boxes(:,2); + total_boxes=[total_boxes(:,1)+total_boxes(:,6).*bbw total_boxes(:,2)+total_boxes(:,7).*bbh total_boxes(:,3)+total_boxes(:,8).*bbw total_boxes(:,4)+total_boxes(:,9).*bbh total_boxes(:,5)]; + total_boxes=rerec(total_boxes); + total_boxes(:,1:4)=fix(total_boxes(:,1:4)); + [dy edy dx edx y ey x ex tmpw tmph]=pad(total_boxes,w,h); + end + numbox=size(total_boxes,1); + if numbox>0 + %second stage + tempimg=zeros(24,24,3,numbox); + for k=1:numbox + tmp=zeros(tmph(k),tmpw(k),3); + tmp(dy(k):edy(k),dx(k):edx(k),:)=img(y(k):ey(k),x(k):ex(k),:); + tempimg(:,:,:,k)=imResample(tmp,[24 24],'bilinear'); + end + tempimg=(tempimg-127.5)*0.0078125; + RNet.blobs('data').reshape([24 24 3 numbox]); + out=RNet.forward({tempimg}); + score=squeeze(out{2}(2,:)); + pass=find(score>threshold(2)); + total_boxes=[total_boxes(pass,1:4) score(pass)']; + mv=out{1}(:,pass); + if size(total_boxes,1)>0 + pick=nms(total_boxes,0.7,'Union'); + total_boxes=total_boxes(pick,:); + total_boxes=bbreg(total_boxes,mv(:,pick)'); + total_boxes=rerec(total_boxes); + end + numbox=size(total_boxes,1); + if numbox>0 + %third stage + total_boxes=fix(total_boxes); + [dy edy dx edx y ey x ex tmpw tmph]=pad(total_boxes,w,h); + tempimg=zeros(48,48,3,numbox); + for k=1:numbox + tmp=zeros(tmph(k),tmpw(k),3); + 
tmp(dy(k):edy(k),dx(k):edx(k),:)=img(y(k):ey(k),x(k):ex(k),:); + tempimg(:,:,:,k)=imResample(tmp,[48 48],'bilinear'); + end + tempimg=(tempimg-127.5)*0.0078125; + ONet.blobs('data').reshape([48 48 3 numbox]); + out=ONet.forward({tempimg}); + score=squeeze(out{3}(2,:)); + points=out{2}; + pass=find(score>threshold(3)); + points=points(:,pass); + total_boxes=[total_boxes(pass,1:4) score(pass)']; + mv=out{1}(:,pass); + bbw=total_boxes(:,3)-total_boxes(:,1)+1; + bbh=total_boxes(:,4)-total_boxes(:,2)+1; + points(1:5,:)=repmat(bbw',[5 1]).*points(1:5,:)+repmat(total_boxes(:,1)',[5 1])-1; + points(6:10,:)=repmat(bbh',[5 1]).*points(6:10,:)+repmat(total_boxes(:,2)',[5 1])-1; + if size(total_boxes,1)>0 + total_boxes=bbreg(total_boxes,mv(:,:)'); + pick=nms(total_boxes,0.7,'Min'); + total_boxes=total_boxes(pick,:); + points=points(:,pick); + end + end + numbox=size(total_boxes,1); + %extended stage + if numbox>0 + tempimg=zeros(24,24,15,numbox); + patchw=max([total_boxes(:,3)-total_boxes(:,1)+1 total_boxes(:,4)-total_boxes(:,2)+1]'); + patchw=fix(0.25*patchw); + tmp=find(mod(patchw,2)==1); + patchw(tmp)=patchw(tmp)+1; + pointx=ones(numbox,5); + pointy=ones(numbox,5); + for k=1:5 + tmp=[points(k,:);points(k+5,:)]; + x=fix(tmp(1,:)-0.5*patchw); + y=fix(tmp(2,:)-0.5*patchw); + [dy edy dx edx y ey x ex tmpw tmph]=pad([x' y' x'+patchw' y'+patchw'],w,h); + for j=1:numbox + tmpim=zeros(tmpw(j),tmpw(j),3); + tmpim(dy(j):edy(j),dx(j):edx(j),:)=img(y(j):ey(j),x(j):ex(j),:); + tempimg(:,:,(k-1)*3+1:(k-1)*3+3,j)=imResample(tmpim,[24 24],'bilinear'); + end + end + LNet.blobs('data').reshape([24 24 15 numbox]); + tempimg=(tempimg-127.5)*0.0078125; + out=LNet.forward({tempimg}); + score=squeeze(out{3}(2,:)); + for k=1:5 + tmp=[points(k,:);points(k+5,:)]; + %do not make a large movement + temp=find(abs(out{k}(1,:)-0.5)>0.35); + if ~isempty(temp) + l=length(temp); + out{k}(:,temp)=ones(2,l)*0.5; + end + temp=find(abs(out{k}(2,:)-0.5)>0.35); + if ~isempty(temp) + l=length(temp); + out{k}(:,temp)=ones(2,l)*0.5; + end + pointx(:,k)=(tmp(1,:)-0.5*patchw+out{k}(1,:).*patchw)'; + pointy(:,k)=(tmp(2,:)-0.5*patchw+out{k}(2,:).*patchw)'; + end + for j=1:numbox + points(:,j)=[pointx(j,:)';pointy(j,:)']; + end + end + end +end + +function [boundingbox] = bbreg(boundingbox,reg) + %calibrate bouding boxes + if size(reg,2)==1 + reg=reshape(reg,[size(reg,3) size(reg,4)])'; + end + w=[boundingbox(:,3)-boundingbox(:,1)]+1; + h=[boundingbox(:,4)-boundingbox(:,2)]+1; + boundingbox(:,1:4)=[boundingbox(:,1)+reg(:,1).*w boundingbox(:,2)+reg(:,2).*h boundingbox(:,3)+reg(:,3).*w boundingbox(:,4)+reg(:,4).*h]; +end + +function [boundingbox reg] = generateBoundingBox(map,reg,scale,t) + %use heatmap to generate bounding boxes + stride=2; + cellsize=12; + boundingbox=[]; + map=map'; + dx1=reg(:,:,1)'; + dy1=reg(:,:,2)'; + dx2=reg(:,:,3)'; + dy2=reg(:,:,4)'; + [y x]=find(map>=t); + a=find(map>=t); + if size(y,1)==1 + y=y';x=x';score=map(a)';dx1=dx1';dy1=dy1';dx2=dx2';dy2=dy2'; + else + score=map(a); + end + reg=[dx1(a) dy1(a) dx2(a) dy2(a)]; + if isempty(reg) + reg=reshape([],[0 3]); + end + boundingbox=[y x]; + boundingbox=[fix((stride*(boundingbox-1)+1)/scale) fix((stride*(boundingbox-1)+cellsize-1+1)/scale) score reg]; +end + +function pick = nms(boxes,threshold,type) + %NMS + if isempty(boxes) + pick = []; + return; + end + x1 = boxes(:,1); + y1 = boxes(:,2); + x2 = boxes(:,3); + y2 = boxes(:,4); + s = boxes(:,5); + area = (x2-x1+1) .* (y2-y1+1); + [vals, I] = sort(s); + pick = s*0; + counter = 1; + while ~isempty(I) + last = length(I); + i 
= I(last); + pick(counter) = i; + counter = counter + 1; + xx1 = max(x1(i), x1(I(1:last-1))); + yy1 = max(y1(i), y1(I(1:last-1))); + xx2 = min(x2(i), x2(I(1:last-1))); + yy2 = min(y2(i), y2(I(1:last-1))); + w = max(0.0, xx2-xx1+1); + h = max(0.0, yy2-yy1+1); + inter = w.*h; + if strcmp(type,'Min') + o = inter ./ min(area(i),area(I(1:last-1))); + else + o = inter ./ (area(i) + area(I(1:last-1)) - inter); + end + I = I(find(o<=threshold)); + end + pick = pick(1:(counter-1)); +end + +function [dy edy dx edx y ey x ex tmpw tmph] = pad(total_boxes,w,h) + %compute the padding coordinates (pad the bounding boxes to square) + tmpw=total_boxes(:,3)-total_boxes(:,1)+1; + tmph=total_boxes(:,4)-total_boxes(:,2)+1; + numbox=size(total_boxes,1); + + dx=ones(numbox,1);dy=ones(numbox,1); + edx=tmpw;edy=tmph; + + x=total_boxes(:,1);y=total_boxes(:,2); + ex=total_boxes(:,3);ey=total_boxes(:,4); + + tmp=find(ex>w); + edx(tmp)=-ex(tmp)+w+tmpw(tmp);ex(tmp)=w; + + tmp=find(ey>h); + edy(tmp)=-ey(tmp)+h+tmph(tmp);ey(tmp)=h; + + tmp=find(x<1); + dx(tmp)=2-x(tmp);x(tmp)=1; + + tmp=find(y<1); + dy(tmp)=2-y(tmp);y(tmp)=1; +end + +function [bboxA] = rerec(bboxA) + %convert bboxA to square + bboxB=bboxA(:,1:4); + h=bboxA(:,4)-bboxA(:,2); + w=bboxA(:,3)-bboxA(:,1); + l=max([w h]')'; + bboxA(:,1)=bboxA(:,1)+w.*0.5-l.*0.5; + bboxA(:,2)=bboxA(:,2)+h.*0.5-l.*0.5; + bboxA(:,3:4)=bboxA(:,1:2)+repmat(l,[1 2]); +end + + diff --git a/util/plot_learning_curves.m b/util/plot_learning_curves.m new file mode 100644 index 000000000..c0f24a344 --- /dev/null +++ b/util/plot_learning_curves.m @@ -0,0 +1,300 @@ +% Plots the lerning curves for the specified training runs from data in the +% file "lfw_result.txt" stored in the log directory for the respective +% model. + +% MIT License +% +% Copyright (c) 2016 David Sandberg +% +% Permission is hereby granted, free of charge, to any person obtaining a copy +% of this software and associated documentation files (the "Software"), to deal +% in the Software without restriction, including without limitation the rights +% to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +% copies of the Software, and to permit persons to whom the Software is +% furnished to do so, subject to the following conditions: +% +% The above copyright notice and this permission notice shall be included in all +% copies or substantial portions of the Software. +% +% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +% AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +% LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +% OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +% SOFTWARE. + +%% +addpath('/home/david/git/facenet/util/'); +log_dirs = { '/home/david/logs/facenet' }; +%% +res = { ... +{ '20180402-114759', 'vggface2, wd=5e-4, center crop, fixed image standardization' }, ... +}; + +%% +res = { ... +{ '20180408-102900', 'casia, wd=5e-4, pnlf=5e-4, fixed image standardization' }, ... 
+}; + +%% + +colors = {'b', 'g', 'r', 'c', 'm', 'y', 'k'}; +markers = {'.', 'o', 'x', '+', '*', 's', 'd' }; +lines = {'-', '-.', '--', ':' }; +fontSize = 6; +lineWidth = 2; +lineStyles = combineStyles(colors, markers); +lineStyles2 = combineStyles(colors, {''}, lines); +legends = cell(length(res),1); +legends_accuracy = cell(length(res),1); +legends_valrate = cell(length(res),1); +var = cell(length(res),1); +for i=1:length(res), + for k=1:length(log_dirs) + if exist(fullfile(log_dirs{k}, res{i}{1}), 'dir') + ld = log_dirs{k}; + end + end + filename = fullfile(ld, res{i}{1}, 'stat.h5'); + + var{i} = readlogs(filename,{'loss', 'reg_loss', 'xent_loss', 'lfw_accuracy', ... + 'lfw_valrate', 'val_loss', 'val_xent_loss', 'val_accuracy', ... + 'accuracy', 'prelogits_norm', 'learning_rate', 'center_loss', ... + 'prelogits_hist', 'accuracy'}); + var{i}.steps = 1:length(var{i}.loss); + epoch = find(var{i}.lfw_accuracy,1,'last'); + var{i}.epochs = 1:epoch; + legends{i} = sprintf('%s: %s', res{i}{1}, res{i}{2}); + start_epoch = max(1,epoch-10); + legends_accuracy{i} = sprintf('%s: %s (%.2f%%)', res{i}{1}, res{i}{2}, mean(var{i}.lfw_accuracy(start_epoch:epoch))*100 ); + legends_valrate{i} = sprintf('%s: %s (%.2f%%)', res{i}{1}, res{i}{2}, mean(var{i}.lfw_valrate(start_epoch:epoch))*100 ); + + arguments_filename = fullfile(ld, res{i}{1}, 'arguments.txt'); + if exist(arguments_filename) + str = fileread(arguments_filename); + var{i}.wd = getParameter(str, 'weight_decay', '0.0'); + var{i}.cl = getParameter(str, 'center_loss_factor', '0.0'); + var{i}.fixed_std = getParameter(str, 'use_fixed_image_standardization', '0'); + var{i}.data_dir = getParameter(str, 'data_dir', ''); + var{i}.lr = getParameter(str, 'learning_rate', '0.1'); + var{i}.epoch_size = str2double(getParameter(str, 'epoch_size', '1000')); + var{i}.batch_size = str2double(getParameter(str, 'batch_size', '90')); + var{i}.examples_per_epoch = var{i}.epoch_size*var{i}.batch_size; + var{i}.mnipc = getParameter(str, 'filter_min_nrof_images_per_class', '-1'); + var{i}.val_step = str2num(getParameter(str, 'validate_every_n_epochs', '10')); + var{i}.pnlf = getParameter(str, 'prelogits_norm_loss_factor', '-1'); + var{i}.emb_size = getParameter(str, 'embedding_size', '-1'); + + fprintf('%s: wd=%s lr=%s, pnlf=%s, data_dir=%s, emb_size=%s\n', ... 
+ res{i}{1}, var{i}.wd, var{i}.lr, var{i}.pnlf, var{i}.data_dir, var{i}.emb_size); + end +end; + +timestr = datestr(now,'yyyymmdd_HHMMSS'); + +h = 1; figure(h); close(h); figure(h); hold on; setsize(1.5); +title('LFW accuracy'); +xlabel('Steps'); +ylabel('Accuracy'); +grid on; +N = 1; flt = ones(1,N)/N; +for i=1:length(var), + plot(var{i}.epochs*1000, filter(flt, 1, var{i}.lfw_accuracy(var{i}.epochs)), lineStyles2{i}, 'LineWidth', lineWidth); +end; +legend(legends_accuracy,'Location','SouthEast','FontSize',fontSize); +v=axis; +v(3:4) = [ 0.95 1.0 ]; +axis(v); +accuracy_file_name = sprintf('lfw_accuracy_%s',timestr); +%print(accuracy_file_name,'-dpng') + + +if 0 + %% + %h = 2; figure(h); close(h); figure(h); hold on; setsize(1.5); + h = 1; figure(h); hold on; + title('LFW validation rate'); + xlabel('Step'); + ylabel('VAL @ FAR = 10^{-3}'); + grid on; + for i=1:length(var), + plot(var{i}.epochs*1000, var{i}.lfw_valrate(var{i}.epochs), lineStyles{i}, 'LineWidth', lineWidth); + end; + legend(legends_valrate,'Location','SouthEast','FontSize',fontSize); + v=axis; + v(3:4) = [ 0.5 1.0 ]; + axis(v); + valrate_file_name = sprintf('lfw_valrate_%s',timestr); +% print(valrate_file_name,'-dpng') +end + +if 0 + %% Plot cross-entropy loss + h = 3; figure(h); close(h); figure(h); hold on; setsize(1.5); + title('Training/validation set cross-entropy loss'); + xlabel('Step'); + title('Training/validation set cross-entropy loss'); + grid on; + N = 500; flt = ones(1,N)/N; + for i=1:length(var), + var{i}.xent_loss(var{i}.xent_loss==0) = NaN; + plot(var{i}.steps, filter(flt, 1, var{i}.xent_loss), lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'NorthEast','FontSize',fontSize); + + % Plot cross-entropy loss on validation set + N = 1; flt = ones(1,N)/N; + for i=1:length(var), + v = var{i}.val_xent_loss; + val_steps = (1:length(v))*var{i}.val_step*1000; + v(v==0) = NaN; + plot(val_steps, filter(flt, 1, v), [ lineStyles2{i} '.' ], 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'NorthEast','FontSize',fontSize); + hold off + xent_file_name = sprintf('xent_%s',timestr); + %print(xent_file_name,'-dpng') +end + +if 0 + %% Plot accuracy on training set + h = 32; figure(h); clf; hold on; + title('Training/validation set accuracy'); + xlabel('Step'); + ylabel('Training/validation set accuracy'); + grid on; + N = 500; flt = ones(1,N)/N; + for i=1:length(var), + var{i}.accuracy(var{i}.accuracy==0) = NaN; + plot(var{i}.steps*1000, filter(flt, 1, var{i}.accuracy), lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'SouthEast','FontSize',fontSize); + + grid on; + N = 1; flt = ones(1,N)/N; + for i=1:length(var), + v = var{i}.val_accuracy; + val_steps = (1:length(v))*var{i}.val_step*1000; + v(v==0) = NaN; + plot(val_steps*1000, filter(flt, 1, v), [ lineStyles2{i} '.' 
], 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'SouthEast','FontSize',fontSize); + hold off + acc_file_name = sprintf('accuracy_%s',timestr); + %print(acc_file_name,'-dpng') +end + +if 0 + %% Plot prelogits CDF + h = 35; figure(h); clf; hold on; + title('Prelogits histogram'); + xlabel('Epoch'); + ylabel('Prelogits histogram'); + grid on; + N = 1; flt = ones(1,N)/N; + for i=1:length(var), + epoch = var{i}.epochs(end); + q = cumsum(var{i}.prelogits_hist(:,epoch)); + q2 = q / q(end); + plot(linspace(0,10,1000), q2, lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'SouthEast','FontSize',fontSize); + hold off +end + +if 0 + %% Plot prelogits norm + h = 32; figure(h); clf; hold on; + title('Prelogits norm'); + xlabel('Step'); + ylabel('Prelogits norm'); + grid on; + N = 1; flt = ones(1,N)/N; + for i=1:length(var), + plot(var{i}.steps, filter(flt, 1, var{i}.prelogits_norm), lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'NorthEast','FontSize',fontSize); + hold off +end + +if 0 + %% Plot learning rate + h = 42; figure(h); clf; hold on; + title('Learning rate'); + xlabel('Step'); + ylabel('Learning rate'); + grid on; + N = 1; flt = ones(1,N)/N; + for i=1:length(var), + semilogy(var{i}.epochs, filter(flt, 1, var{i}.learning_rate(var{i}.epochs)), lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'NorthEast','FontSize',fontSize); + hold off +end + +if 0 + %% Plot center loss + h = 9; figure(h); close(h); figure(h); hold on; setsize(1.5); + title('Center loss'); + xlabel('Epochs'); + ylabel('Center loss'); + grid on; + N = 500; flt = ones(1,N)/N; + for i=1:length(var), + if isempty(var{i}.center_loss) + var{i}.center_loss = ones(size(var{i}.steps))*NaN; + end; + var{i}.center_loss(var{i}.center_loss==0) = NaN; + plot(var{i}.steps/var{i}.epoch_size, filter(flt, 1, var{i}.center_loss), lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'NorthEast','FontSize',fontSize); +end + +if 0 + %% Plot center loss with factor + h = 9; figure(h); close(h); figure(h); hold on; setsize(1.5); + title('Center loss with factor'); + xlabel('Epochs'); + ylabel('Center loss * center loss factor'); + grid on; + N = 500; flt = ones(1,N)/N; + for i=1:length(var), + if isempty(var{i}.center_loss) + var{i}.center_loss = ones(size(var{i}.steps))*NaN; + end; + var{i}.center_loss(var{i}.center_loss==0) = NaN; + plot(var{i}.steps/var{i}.epoch_size, filter(flt, 1, var{i}.center_loss*str2num(var{i}.cl)), lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'NorthEast','FontSize',fontSize); +end + +if 0 + %% Plot total loss + h = 4; figure(h); close(h); figure(h); hold on; setsize(1.5); + title('Total loss'); + xlabel('Epochs'); + ylabel('Total loss'); + grid on; + N = 500; flt = ones(1,N)/N; + for i=1:length(var), + var{i}.loss(var{i}.loss==0) = NaN; + plot(var{i}.steps/var{i}.epoch_size, filter(flt, 1, var{i}.loss), lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 'NorthEast','FontSize',fontSize); +end + +if 0 + %% Plot regularization loss + h = 5; figure(h); close(h); figure(h); hold on; setsize(1.5); + title('Regularization loss'); + xlabel('Epochs'); + ylabel('Regularization loss'); + grid on; + N = 500; flt = ones(1,N)/N; + for i=1:length(var), + var{i}.reg_loss(var{i}.reg_loss==0) = NaN; + plot(var{i}.steps/var{i}.epoch_size, filter(flt, 1, var{i}.reg_loss), lineStyles2{i}, 'LineWidth', lineWidth); + end; + legend(legends, 'Location', 
'NorthEast','FontSize',fontSize); +end From d4861d57dc4a0a759d4c3ad51b701db102c74a4d Mon Sep 17 00:00:00 2001 From: Sarattha K-Main PC Date: Thu, 29 Apr 2021 22:30:11 +0700 Subject: [PATCH 6/7] Update requirement package --- requirements.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/requirements.txt b/requirements.txt index f177a2371..11e6eb3fe 100644 --- a/requirements.txt +++ b/requirements.txt @@ -38,6 +38,7 @@ tensorboard-plugin-wit==1.8.0 tensorflow==2.4.1 tensorflow-estimator==2.4.0 termcolor==1.1.0 +tf-slim==1.1.0 threadpoolctl==2.1.0 typing-extensions==3.7.4.3 urllib3==1.26.3 From 3567aa95d79503c6c9e7f559ec0e986b694af4c6 Mon Sep 17 00:00:00 2001 From: Sarattha K-Main PC Date: Thu, 29 Apr 2021 22:34:20 +0700 Subject: [PATCH 7/7] Update requirement.txt --- requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/requirements.txt b/requirements.txt index 11e6eb3fe..ed5641ad1 100644 --- a/requirements.txt +++ b/requirements.txt @@ -35,7 +35,7 @@ scipy==1.6.0 six==1.15.0 tensorboard==2.4.1 tensorboard-plugin-wit==1.8.0 -tensorflow==2.4.1 +tensorflow-gpu==2.4.1 tensorflow-estimator==2.4.0 termcolor==1.1.0 tf-slim==1.1.0
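
The two requirements changes above pin tf-slim 1.1.0 and swap the plain TensorFlow pin for tensorflow-gpu 2.4.1. A minimal post-install sanity check, offered here only as a hypothetical snippet and not part of the patch series, could confirm that both packages import and that a GPU is visible before re-running training:

    # Hypothetical sanity check after `pip install -r requirements.txt`.
    import tensorflow as tf
    import tf_slim as slim  # module provided by the tf-slim==1.1.0 pin added above

    print('TensorFlow version:', tf.__version__)                  # expected: 2.4.1
    print('GPUs visible:', tf.config.list_physical_devices('GPU'))
    print('tf-slim loaded from:', slim.__file__)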