Neural Networks: Scaling & Gradient Descent Optimization

Graph Representation of Gradient Descent Variants

Introduction

This post summarizes the standard techniques for speeding up neural network training: scaling the inputs, normalizing activations, and the main gradient descent variants (mini-batch, momentum, RMSProp, Adam) together with learning rate decay, each paired with the TensorFlow call that implements it.

Feature Scaling

Z-score normalization formula: z = (x − μ) / σ, where μ and σ are the per-feature mean and standard deviation.

mean, variance = tf.nn.moments(x, axes=[0])
x_norm = (x - mean) / tf.sqrt(variance)
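
As a concrete illustration, here is a minimal NumPy sketch of z-score normalization; the matrix X and its values are made up for the example.

import numpy as np

# Toy feature matrix: 4 examples, 2 features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0],
              [4.0, 500.0]])

mu = X.mean(axis=0)      # per-feature mean
sigma = X.std(axis=0)    # per-feature standard deviation
X_norm = (X - mu) / sigma

print(X_norm.mean(axis=0))  # ~[0. 0.]
print(X_norm.std(axis=0))   # ~[1. 1.]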

Batch Normalization

Batch normalization formula: y = scale · (x − μ) / √(σ² + ε) + offset, where μ and σ² are the mini-batch mean and variance, ε (variance_epsilon) guards against division by zero, and scale (γ) and offset (β) are learned parameters.
tf.nn.batch_normalization(x, mean, variance, offset, scale, variance_epsilon)
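
A minimal usage sketch for tf.nn.batch_normalization, assuming TensorFlow 2 eager execution; the batch values and the identity-initialized scale and offset are illustrative.

import tensorflow as tf

# Toy mini-batch: 4 examples, 3 features (illustrative values).
x = tf.constant([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0],
                 [10.0, 11.0, 12.0]])

# Per-feature mean and variance over the batch dimension.
mean, variance = tf.nn.moments(x, axes=[0])

# Learnable in practice; initialized here to the identity transform.
offset = tf.zeros([3])   # beta
scale = tf.ones([3])     # gamma

y = tf.nn.batch_normalization(x, mean, variance, offset, scale,
                              variance_epsilon=1e-3)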

Mini-batch Gradient Descent

Mini-batch gradient descent update: split the training set into batches of size m and take one step per batch, θ := θ − α · (1/m) · Σ ∇J(θ; x_i, y_i) with the sum running over the batch, instead of one step per full pass over the data.
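
A minimal NumPy sketch of the mini-batch loop on a toy linear-regression problem; the data, batch size, and learning rate are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                 # 100 examples, 2 features
y = X @ np.array([3.0, -2.0]) + 1.0           # targets from a known line

w, b = np.zeros(2), 0.0
alpha, batch_size = 0.1, 16

for epoch in range(50):
    perm = rng.permutation(len(X))            # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]               # one mini-batch
        err = Xb @ w + b - yb
        w -= alpha * (Xb.T @ err) / len(idx)  # gradient of MSE w.r.t. w
        b -= alpha * err.mean()               # gradient of MSE w.r.t. b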

Gradient Descent with Momentum

Gradient descent with momentum formula: v := β · v + (1 − β) · ∇J(θ), then θ := θ − α · v, where v is an exponentially weighted average of past gradients and β is the momentum coefficient (commonly 0.9).
tf.train.MomentumOptimizer(learning_rate, momentum).minimize(loss)
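
A minimal NumPy sketch of one momentum update in the exponentially-weighted-average form above; the default hyperparameters are illustrative. (Note that tf.train.MomentumOptimizer itself accumulates v = momentum · v + gradient, without the (1 − β) factor, which only rescales the effective learning rate.)

import numpy as np

def momentum_step(w, grad, v, alpha=0.01, beta=0.9):
    # v is an exponentially weighted average of past gradients;
    # it damps oscillations and speeds progress along consistent directions.
    v = beta * v + (1 - beta) * grad
    w = w - alpha * v
    return w, v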

RMSProp Gradient Descent

RMSProp formula: s := β · s + (1 − β) · (∇J(θ))², then θ := θ − α · ∇J(θ) / (√s + ε), where s is an exponentially weighted average of the squared gradients and ε guards against division by zero.
tf.train.RMSPropOptimizer(learning_rate, decay=decay, epsilon=epsilon).minimize(loss)
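
A minimal NumPy sketch of one RMSProp update; the hyperparameter defaults are illustrative.

import numpy as np

def rmsprop_step(w, grad, s, alpha=0.001, beta=0.9, eps=1e-8):
    # s tracks an exponentially weighted average of squared gradients,
    # so parameters with large recent gradients take smaller steps.
    s = beta * s + (1 - beta) * grad**2
    w = w - alpha * grad / (np.sqrt(s) + eps)
    return w, s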

Adam Gradient Descent

Adam gradient descent formula: Adam combines momentum and RMSProp. Keep v := β1 · v + (1 − β1) · ∇J(θ) and s := β2 · s + (1 − β2) · (∇J(θ))²; apply bias correction v_hat = v / (1 − β1^t) and s_hat = s / (1 − β2^t) at step t; then update θ := θ − α · v_hat / (√s_hat + ε).
tf.train.AdamOptimizer(learning_rate, beta1, beta2, epsilon).minimize(loss)
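
A minimal NumPy sketch of one Adam update, including the bias correction; the defaults follow the common β1 = 0.9, β2 = 0.999, ε = 1e-8 choices.

import numpy as np

def adam_step(w, grad, v, s, t, alpha=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    # t is the 1-based step count, needed for bias correction.
    v = beta1 * v + (1 - beta1) * grad         # first moment (momentum)
    s = beta2 * s + (1 - beta2) * grad**2      # second moment (RMSProp)
    v_hat = v / (1 - beta1**t)                 # correct zero-initialization bias
    s_hat = s / (1 - beta2**t)
    w = w - alpha * v_hat / (np.sqrt(s_hat) + eps)
    return w, v, s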

Learning Rate Decay

The learning rate shrinks as training progresses; the floor makes it drop in discrete steps every decay_step steps, which corresponds to staircase=True in TensorFlow's inverse time decay.

alpha / (1 + decay_rate * np.floor(global_step / decay_step))
tf.train.inverse_time_decay(alpha, global_step, decay_step, decay_rate, staircase=True)
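
A minimal NumPy sketch of the same staircase schedule, with illustrative hyperparameters.

import numpy as np

def inverse_time_decay(alpha0, global_step, decay_step, decay_rate):
    # The floor holds the rate constant within each decay_step-sized window.
    return alpha0 / (1 + decay_rate * np.floor(global_step / decay_step))

# With alpha0=0.1, decay_rate=1.0, decay_step=1000:
# steps 0-999 -> 0.1, steps 1000-1999 -> 0.05, steps 2000-2999 -> 0.0333...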
