Tensorflow testing

This repo is used to learn TensorFlow and test its use with:

  • JS/Webpack approach
  • Test use of tfjs in a web worker
  • Test use of tfjs with SSR (Nuxt)
  • Test use of Node.js as backend

Major discoveries

TensorFlow offers different activation functions, which are discussed during the course; each has a specific purpose. With deep neural networks, activation functions such as sigmoid and softmax should be applied on the last layer (the output layer), while other activation functions can be used in the 'middle' layers. The main problem with deep networks is the 'vanishing gradient problem', and over time new activation functions have been introduced to address it (see the sketch after the list below).

  • sigmoid: this was the first activation function used in the middle layers (values 0 to 1). It makes the network non-linear.
  • tanh: came as an improvement (values -1 to 1)
  • relu/relu6: a solution for deep networks to remediate the weakening of weights (caused by multiplying layer weights by values < 1). ReLU values are always >= 0 with no maximum cap (relu6 caps the output at 6)
  • elu (exponential linear unit): a function shaped like a hockey stick. It allows negative values.
  • softplus: another variant for preventing gradient loss in deep networks.
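
As a quick illustration, here is a minimal sketch (assuming @tensorflow/tfjs and made-up layer sizes) of where these activations typically sit in a layers model: ReLU-style functions in the hidden layers, sigmoid or softmax only on the output layer.

```js
const tf = require('@tensorflow/tfjs');

// Hidden ('middle') layers use relu/elu to avoid the vanishing gradient,
// the output layer uses sigmoid (binary) or softmax (multi-class).
const model = tf.sequential();
model.add(tf.layers.dense({ inputShape: [10], units: 16, activation: 'relu' }));
model.add(tf.layers.dense({ units: 16, activation: 'elu' }));
model.add(tf.layers.dense({ units: 1, activation: 'sigmoid' })); // output layer
```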

Regression

MSE (Mean square error)

Basic regression approach using mean square error.

mse = SUM((En - An)**2) / N

  • En - estimated case value
  • An - actual case value
  • N - number of cases
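
A minimal tfjs sketch of this calculation, assuming guess and actual are 1D tensors of the same length (the values are only illustrative):

```js
const tf = require('@tensorflow/tfjs');

// example values, just for illustration
const actual = tf.tensor1d([1.2, 2.3, 3.1]);
const guess  = tf.tensor1d([1.0, 2.5, 3.0]);

// SUM((En - An)**2) / N == mean of the squared differences
const mse = guess.sub(actual).square().mean();
mse.print();
```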

R squared

Used to indicate how well the model fits the data. Values range from -Infinity to 1, where 1 means a perfect fit.

Rsq = 1 - (SSres / SStot)

  • SStot: total squared sum of diff from average (actual - average)**2
  • SSres: total squared sum of diff from predicted (actual - predicted)**2
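
A small sketch of the same formula with tfjs tensors (actual and predicted are assumed 1D tensors; the values are only illustrative):

```js
const tf = require('@tensorflow/tfjs');

// example values, just for illustration
const actual    = tf.tensor1d([1.2, 2.3, 3.1]);
const predicted = tf.tensor1d([1.0, 2.5, 3.0]);

// SSres: squared sum of (actual - predicted)
const ssRes = actual.sub(predicted).square().sum();
// SStot: squared sum of (actual - average)
const ssTot = actual.sub(actual.mean()).square().sum();

const rSquared = 1 - ssRes.dataSync()[0] / ssTot.dataSync()[0];
console.log(rSquared); // 1 means a perfect fit
```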

Learning rate optimization methods

In our project we used a simple custom method to increase or decrease the learning rate. The calculation is done in the tuneLearningRate function in tf-utils.js (a sketch of that kind of rule follows below).
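
The sketch below is only a hypothetical illustration of that kind of rule, not the actual tf-utils.js code (names and factors are made up): if the cost went up the step was too big, so the rate is cut in half; if it went down, the rate is nudged up.

```js
// Hypothetical sketch, not the actual tf-utils.js implementation:
// halve the rate when MSE increases, grow it slightly when MSE decreases.
function tuneLearningRate(currentMse, previousMse, learningRate) {
  if (previousMse === undefined) return learningRate;
  return currentMse > previousMse
    ? learningRate / 2
    : learningRate * 1.05;
}
```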

Batch and stochastic gradient descent

  • batch (mini-batch): split the data into smaller batches and update the weights after each batch. Epochs indicate how many full passes over the data set we make (see the loop sketch after this list).
  • stochastic: update the weights on each record; can be seen as mini-batch with a batch size of 1 record.
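
A hypothetical mini-batch loop with tfjs slice(); the data and the weight-update step are placeholders, not the project's actual code:

```js
const tf = require('@tensorflow/tfjs');

// Hypothetical sketch: data and the update step are placeholders.
const features = tf.randomNormal([100, 3]);
const labels = tf.randomNormal([100, 1]);
const updateWeights = (f, l) => { /* one gradient descent step */ };

const batchSize = 10;
const epochs = 5;
const batchCount = Math.floor(features.shape[0] / batchSize);

for (let epoch = 0; epoch < epochs; epoch++) {
  for (let i = 0; i < batchCount; i++) {
    const start = i * batchSize;
    // take one batch of rows; weights are updated after every batch
    const featureBatch = features.slice([start, 0], [batchSize, -1]);
    const labelBatch = labels.slice([start, 0], [batchSize, -1]);
    updateWeights(featureBatch, labelBatch);
  }
}
```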

Logistic regression

A classification problem: there is a finite set of labels (classes) the model can choose from. The difference between the linear and logistic regression approaches:

|              | Linear                                                                  | Logistic        |
| ------------ | ----------------------------------------------------------------------- | --------------- |
| Method       | MSE (mean-squared-error) = Sum((Guess(i) - Actual(i))**2) / no. of cases | Cross Entropy   |
| Predictor fn | y = rc1 * X1 + rc2 * X2 ... + constant                                  | 1 / (1 + e**-y) |

Sigmoid function

It produces values between 0 and 1.

Sig = 1 / (1 + e** -z)

  • e: Euler's number (2.718)
  • z: the value produced by the linear regression formula

The sigmoid function is built into TensorFlow. The approach to gradient descent is similar to the MSE approach; the only difference, in our implementation, is that the sigmoid function is applied to the regression guess in order to produce values between 0 and 1.
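
A minimal sketch of that difference (features and weights here are placeholder tensors): the linear guess is computed first and then squashed into the 0..1 range with the built-in sigmoid.

```js
const tf = require('@tensorflow/tfjs');

// placeholder data: 5 cases, 2 features (+1 column of ones for the constant)
const features = tf.ones([5, 3]);
const weights = tf.zeros([3, 1]);

const z = features.matMul(weights);   // linear regression guess
const probability = z.sigmoid();      // 1 / (1 + e**-z), values 0..1
probability.print();
```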

Cross Entropy

Used in logistic regression to indicate how well the prediction fits the actual values. The gradient descent function looks for the minimum cross entropy value.

CE = -SUM(actual(i) * log(guess(i)) + (1 - actual(i)) * log(1 - guess(i))) / total cases
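
The same formula written out with tfjs tensor operations (actual and guess are assumed 1D tensors of 0/1 labels and predicted probabilities; the values are only illustrative):

```js
const tf = require('@tensorflow/tfjs');

// example values, just for illustration
const actual = tf.tensor1d([1, 0, 1]);
const guess  = tf.tensor1d([0.9, 0.2, 0.7]);

// -SUM(actual * log(guess) + (1 - actual) * log(1 - guess)) / N
const term1 = actual.mul(guess.log());
const term2 = tf.scalar(1).sub(actual).mul(tf.scalar(1).sub(guess).log());
const crossEntropy = term1.add(term2).mean().mul(-1);
crossEntropy.print();
```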

Cost function

This is a general term for the formula that calculates the deviation (error) between the prediction and the actual value (label). In a linear regression model we use MSE (mean-square-error) as the cost function, and with logistic regression we use the Cross Entropy function. In both cases the goal of gradient descent is to reduce the cost function value to a minimum.

Using sigmoid when classifying beyond a binary distribution produces less reliable output. Sigmoid is used to calculate a marginal probability, e.g. the probability of a single outcome (the chance of an item being true). When we want to predict one of several mutually exclusive options, sigmoid produces combined probabilities whose sum is > 1 (100%). To take the other options into account, we use the conditional probability function Softmax.

Smax(i) = e**(m(i)x + b(i)) / Sum(e**(m(1)x + b(1)) + e**(m(2)x + b(2)) ... + e**(m(k)x + b(k)))

  • k: classes, the number of categories we need to classify
  • e: Euler's number
  • mx+b: outcome of the basic linear equation

The total of all probabilities equals 1 (100%).
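
A short tfjs sketch (placeholder mx+b outcomes for 3 cases and 4 classes); each row of the result sums to 1:

```js
const tf = require('@tensorflow/tfjs');

// placeholder mx+b outcomes: 3 cases, 4 classes (k = 4)
const logits = tf.tensor2d([
  [2.0, 1.0, 0.1, 0.5],
  [0.3, 2.2, 1.1, 0.0],
  [1.5, 0.2, 0.2, 3.0],
]);

const probabilities = logits.softmax();  // each row sums to 1 (100%)
probabilities.print();
```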

argMax function

We are working with multiple categories in binary form (multiple columns with values 0-1). We can use TensorFlow's argMax function to retrieve a one-dimensional tensor with the column index of the class that has the highest probability value. This helps us answer the question: which category has the highest probability? (See the sketch after the list below.)

Other useful tensorflow functions are:

  • notEqual
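
A sketch of how both functions can be combined to measure accuracy (labels and probabilities are assumed one-hot/probability tensors with one column per class; the values are only illustrative):

```js
const tf = require('@tensorflow/tfjs');

// assumed example: 3 cases, 3 classes (one-hot labels, predicted probabilities)
const labels = tf.tensor2d([[1, 0, 0], [0, 1, 0], [0, 0, 1]]);
const probabilities = tf.tensor2d([[0.8, 0.1, 0.1], [0.2, 0.5, 0.3], [0.6, 0.3, 0.1]]);

// argMax(1) returns the column index with the highest value per row
const predictedClass = probabilities.argMax(1);
const actualClass = labels.argMax(1);

// notEqual marks the wrong predictions with 1
const incorrect = predictedClass.notEqual(actualClass).sum().dataSync()[0];
const accuracy = (labels.shape[0] - incorrect) / labels.shape[0];
console.log(accuracy);
```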

Memory use of tensorflow

TensorFlow holds a reference to all tensors created in a session, even if they are created in local functions/methods. To remove these tensors from memory, wrap the calculations in the tidy() method.
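
A minimal sketch of wrapping a calculation in tidy() (example tensors for illustration): only the returned tensor survives, the intermediates are disposed.

```js
const tf = require('@tensorflow/tfjs');

const actual = tf.tensor1d([1.2, 2.3, 3.1]);
const guess  = tf.tensor1d([1.0, 2.5, 3.0]);

const mse = tf.tidy(() => {
  // diff and its square are disposed automatically when tidy() returns
  const diff = guess.sub(actual);
  return diff.square().mean();
});
console.log(tf.memory().numTensors); // intermediate tensors are gone
```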
