Adding an autoencoder tensorflow example #119

Open: wants to merge 1 commit into master
6 changes: 6 additions & 0 deletions examples/tensorflow/README.md
@@ -61,11 +61,17 @@ This is a TensorFlow implementation of a Neural Turing Machine. This implementation
Example of a deep convolutional generative adversarial network. A network able to reproduce detailed images.
Training data can be supplied in a folder named input or passed with --input_dir; tweak the values if necessary. Otherwise it will use the MNIST dataset by default.


8. **yolo-tensorflow**

YOLO (You Only Look Once) is a state-of-the-are, real-time object detection system. This example trained a yolo model with pascal-voc 2007 data.
Review comment:
should that be State-of-the-Art ?

Contributor Author:
I made a pull request of 312 lines where ONE of these 312 is clearly a typographical error and, since I didn't disable "Allow edits from maintainers.", is easily fixed by anyone who reviews this pull request. If your only objective here is making a joke, I should say you are at the wrong place.



9. **autoencoder**

Example of an autoencoder in TensorFlow. Using MNIST by default, it encodes and decodes images, training the network to compress and rebuild them.
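To run it outside Visual Studio, a minimal invocation would be `python autoencoder.py --input_dir my_images --output_dir results` (the folder names here are placeholders); with no `--input_dir`, the script falls back to MNIST.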


# How to Run

1. Open the solution. (It will open with Visual Studio 2017 by default.)
2 changes: 2 additions & 0 deletions examples/tensorflow/TensorflowExamples.sln
@@ -17,6 +17,8 @@ Project("{888888A0-9F3D-457C-B088-3A5042F75D52}") = "NTM", "NTM\NTM.pyproj", "{0
EndProject
Project("{888888A0-9F3D-457C-B088-3A5042F75D52}") = "yolo_tensorflow", "yolo_tensorflow\yolo_tensorflow.pyproj", "{BF2EDCF7-997E-4770-BC53-CFDEB579874D}"
EndProject
Project("{888888A0-9F3D-457C-B088-3A5042F75D52}") = "autoencoder", "autoencoder\autoencoder.pyproj", "{5481A4BB-C273-47A9-B946-5DC841A6B66B}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
247 changes: 247 additions & 0 deletions examples/tensorflow/autoencoder/autoencoder.py
@@ -0,0 +1,247 @@
"""

Example of an autoencoder made in TensorFlow.
This example uses the layers API in TensorFlow, in order to keep things simple and easy, yet powerful.

How to use:
Pass a folder with training data (images) via --input_dir; if none is given, MNIST is used by default.
Tweak the values, if necessary.
PROFIT

"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import time
import os

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from PIL import Image


# Define some important stuff
img_size = 28 # 28 because MNIST is composed of 28 x 28 images
img_channel = 1 # 1 because MNIST is grayscale

batch_size = 128
train_steps = 20000 # This number is an overshoot

image_save_step = 1000

# Total number of pixel values per image.
img_total = img_size * img_size * img_channel

# Number of neurons in the first layer.
layer_size1 = round(img_total * 0.5)

# Number of neurons in the second layer.
layer_size2 = round(img_total * 0.3)

# Number of neurons in the middle (third) layer (the compressed state).
layer_size3 = round(img_total * 0.10)
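# For the MNIST defaults above, these work out to:
# img_total = 28 * 28 * 1 = 784, layer_size1 = round(784 * 0.5) = 392,
# layer_size2 = round(784 * 0.3) = 235 and layer_size3 = round(784 * 0.10) = 78,
# so the bottleneck keeps roughly 10% of the original values.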


# Parse args
parser = argparse.ArgumentParser()
parser.add_argument('--input_dir', type=str, default='', help='Directory with the input data.')
parser.add_argument('--output_dir', type=str, default='output', help='Directory where generated images will be saved.')

FLAGS, unparsed = parser.parse_known_args()
INPUT_DIR = FLAGS.input_dir
OUTPUT_DIR = FLAGS.output_dir


if not os.path.exists(OUTPUT_DIR):
    os.makedirs(OUTPUT_DIR)


# To get images
def get_images():

    def get_data():

        if INPUT_DIR:
            data = []

            for file in os.listdir(INPUT_DIR):
                img = Image.open(INPUT_DIR + "/" + file)
                img = img.resize((img_size, img_size), Image.ANTIALIAS)  # Resize to target size.
                img_data = np.array(img)
                data.append(img_data)

            return np.array(data)
        else:  # Should use MNIST
            print("No input data was given, script will use MNIST by default.")
            (x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
            return np.expand_dims(x_train, 3)

    # Since the values of the data are in [0, 255],
    # we rescale the values of the array
    # so that they are in [-1, 1].
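    # For example, pixel 0 maps to (0 - 127.5) / 127.5 = -1.0 and pixel 255 to +1.0.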
    return (get_data() - 127.5) / 127.5


# The encoder will receive an image
# and compress it into a small layer, returning it.
def get_encoder(x):

    with tf.name_scope("encoder"):

        flatten1 = tf.layers.flatten(x, name="flatten_layer")

        dense1 = tf.layers.dense(flatten1, layer_size1, name="first_encoder_layer")
        relu1 = tf.nn.relu(dense1)

        dense2 = tf.layers.dense(relu1, layer_size2, name="second_encoder_layer")
        relu2 = tf.nn.relu(dense2)

        dense3 = tf.layers.dense(relu2, layer_size3, name="compressed_encoder_layer")  # compressed state
        relu3 = tf.nn.relu(dense3)

        return relu3


# The decoder will receive the encoded state
# and decode into the original.
def get_decoder(x):

    with tf.name_scope("decoder"):

        dense1 = tf.layers.dense(x, layer_size2, name="first_decoder_layer")
        relu1 = tf.nn.relu(dense1)

        dense2 = tf.layers.dense(relu1, layer_size1, name="second_decoder_layer")
        relu2 = tf.nn.relu(dense2)

        dense3 = tf.layers.dense(relu2, img_size * img_size * img_channel, name="third_decoder_layer")

        reshape = tf.reshape(dense3, [-1, img_size, img_size, img_channel], name="image_reshape")
        tanh = tf.nn.tanh(reshape)  # Tanh so it's limited to [-1, +1], same as the input data.

        return tanh, reshape
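# With the MNIST defaults, the shapes flow as:
# encoder: [batch, 28, 28, 1] -> flatten -> [batch, 784] -> [batch, 392] -> [batch, 235] -> [batch, 78]
# decoder: [batch, 78] -> [batch, 235] -> [batch, 392] -> [batch, 784] -> reshape -> [batch, 28, 28, 1]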


# Define the placeholder
img_ph = tf.placeholder(tf.float32, shape=[None, img_size, img_size, img_channel], name="images_ph")

# Define the network that will be trained
# The encoder will be built with the img placeholder
encoder = get_encoder(img_ph)

# The decoder will receive the encoded data.
decoder_out, decoder_logits = get_decoder(encoder)

# Create a tensor that will be saved to TensorBoard
# Concat the input and output images, allowing us to see the difference.
image_save_op = tf.concat([img_ph, decoder_out], 2, name="image_saver_op")
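# Concatenating on axis 2 puts them side by side: two [batch, 28, 28, 1] tensors become one [batch, 28, 56, 1] tensor.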

# Now, we need to define the loss
with tf.name_scope("loss"):
    # A simple mean squared error
    # Since what we want is the network to rebuild the image, the target is img_ph, the input.
    # Using logits so that the backprop is more efficient.
    loss_op = tf.losses.mean_squared_error(img_ph, decoder_logits)
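    # One way to read the note above: regressing on the pre-tanh logits keeps the
    # MSE gradient from being scaled by the tanh derivative, which vanishes once
    # the activation saturates.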

# The optimizer used will be Adam, change the learn rate if needed.
with tf.name_scope("optimizer"):
opt_op = tf.train.AdamOptimizer(learning_rate=0.002).minimize(loss_op)

# We then load the data
x_train = get_images()

# Op to initialize all variables
init = tf.global_variables_initializer()

# Session time!
with tf.Session() as sess:

    # Run the init
    sess.run(init)

    # TensorBoard Stuff

    # FileWriter to save the log
    writer = tf.summary.FileWriter("logs/", graph=sess.graph)

    # A summary to save generated images
    image_saver_summary = tf.summary.image("Decoded Images", image_save_op, max_outputs=9)

    # A summary to save the loss
    merged_sum = tf.summary.merge([tf.summary.scalar("Loss", loss_op)])

    # End of TensorBoard Stuff

    # To memorize the start time
    train_start_time = time.time()

    # Defines the main train loop
    for step in range(train_steps):

        # Random number to select the images
        random_number = np.random.randint(0, len(x_train) - batch_size)

        # Slice the train data to create a batch
        img_batch = x_train[random_number: random_number + batch_size]

        # Train the network
        _, current_loss, summary = sess.run([opt_op, loss_op, merged_sum],
                                            feed_dict={img_ph: img_batch})

        # Saves TensorBoard stuff
        writer.add_summary(summary, step)

        # Print info about the step
        print("Step {0}, LOSS {1:.15}".format(step, current_loss))

        # Occasionally saves images to TensorBoard
        if step % image_save_step == 0 or step == train_steps - 1:
            print("SAVING IMAGE...")

            # Random number to select the images
            random_number = np.random.randint(0, len(x_train) - batch_size)

            # Slice the train data to create a batch
            img_batch = x_train[random_number: random_number + batch_size]

            # Run the summary op.
            images, save_sum = sess.run([image_save_op, image_saver_summary], feed_dict={img_ph: img_batch})

            # Saves sample images to TensorBoard
            writer.add_summary(save_sum, step)

            # Save with matplotlib
            # Rescale the images to [0, 1] (needed for matplotlib)
            images = (images * 0.5) + 0.5

            # Because grayscale
            images = np.squeeze(images)  # Comment this line if your dataset is not GRAYSCALE

            # Number of images to save
            r, c = 4, 4

            # Creates subplots
            fig, axs = plt.subplots(r, c)

            # Loop to save the images
            cnt = 0
            for i in range(r):
                for j in range(c):
                    axs[i, j].imshow(images[cnt], cmap="gray")  # Remove the 'cmap' if your dataset is not GRAYSCALE
                    axs[i, j].axis('off')
                    cnt += 1

            # Set title
            fig.suptitle("Sample after {0} training steps".format(step), fontsize=14)
            # Saves
            fig.savefig("{0}/generated_{1}.png".format(OUTPUT_DIR, step))
            # Close the plt
            plt.close()


    # Print
    spent_time = time.time() - train_start_time
    print("Training done in {0} seconds, an average of {1} steps/second".format(spent_time, train_steps / spent_time))
35 changes: 35 additions & 0 deletions examples/tensorflow/autoencoder/autoencoder.pyproj
@@ -0,0 +1,35 @@
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0">
<PropertyGroup>
<Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
<SchemaVersion>2.0</SchemaVersion>
<ProjectGuid>5481a4bb-c273-47a9-b946-5dc841a6b66b</ProjectGuid>
<ProjectHome>.</ProjectHome>
<StartupFile>autoencoder.py</StartupFile>
<SearchPath>
</SearchPath>
<WorkingDirectory>.</WorkingDirectory>
<OutputPath>.</OutputPath>
<Name>autoencoder</Name>
<RootNamespace>autoencoder</RootNamespace>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)' == 'Debug' ">
<DebugSymbols>true</DebugSymbols>
<EnableUnmanagedDebugging>false</EnableUnmanagedDebugging>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)' == 'Release' ">
<DebugSymbols>true</DebugSymbols>
<EnableUnmanagedDebugging>false</EnableUnmanagedDebugging>
</PropertyGroup>
<ItemGroup>
<Compile Include="autoencoder.py" />
</ItemGroup>
<Import Project="$(MSBuildExtensionsPath32)\Microsoft\VisualStudio\v$(VisualStudioVersion)\Python Tools\Microsoft.PythonTools.targets" />
<!-- Uncomment the CoreCompile target to enable the Build command in
Visual Studio and specify your pre- and post-build commands in
the BeforeBuild and AfterBuild targets below. -->
<!--<Target Name="CoreCompile" />-->
<Target Name="BeforeBuild">
</Target>
<Target Name="AfterBuild">
</Target>
</Project>
22 changes: 22 additions & 0 deletions examples/tensorflow/autoencoder/tensorflow.autoencoder.json
@@ -0,0 +1,22 @@
{
  "jobName": "tensorflow-autoencoder",
  "image": "openpai/pai.example.tensorflow",
  "codeDir": "$PAI_DEFAULT_FS_URI/tiqint/autoencoder/code",
  "authFile": "",
  "dataDir": "$PAI_DEFAULT_FS_URI/tiqint/autoencoder/input",
  "outputDir": "$PAI_DEFAULT_FS_URI/tiqint/autoencoder/output",
  "gpuType": "",
  "retryCount": 0,
  "taskRoles": [
    {
      "name": "autoencoder_train",
      "taskNumber": 1,
      "cpuNumber": 8,
      "memoryMB": 32768,
      "gpuNumber": 1,
      "command": "python code/autoencoder.py --input_dir=$PAI_DATA_DIR --output_dir=$PAI_OUTPUT_DIR",
      "minSucceededTaskCount": 1,
      "minFailedTaskCount": 1
    }
  ]
}
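For a quick local sanity check of this job description, a short sketch (not part of this PR; it assumes the JSON sits in the current directory and substitutes the PAI-provided $PAI_DATA_DIR / $PAI_OUTPUT_DIR variables with local folder names):

import json
from string import Template

# Load the job description and pull out the training command.
with open("tensorflow.autoencoder.json") as f:
    job = json.load(f)

# PAI is expected to expand these variables inside the task container;
# here we substitute local placeholders for a dry run.
command = Template(job["taskRoles"][0]["command"]).safe_substitute(
    PAI_DATA_DIR="input", PAI_OUTPUT_DIR="output")

print(command)  # python code/autoencoder.py --input_dir=input --output_dir=output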