Logistic Regression as a Shallow Neural Network with TensorFlow


Objectives:

  • Here we implement a Logistic Regression model using TensorFlow in a way that is understandable for beginners.
  • We adopt a neural network scheme to construct the model, so as to get used to thinking in terms of neural networks.
  • We show how to import the MNIST data and prepare it for training.
  • We show how to initialize parameters / hyperparameters.
  • We describe the variables and inputs of the graph.
  • We construct the model and initialize all variables.
  • We then declare all our summaries and provide the intuition behind creating name scopes.
  • Then we finally train the model and report the accuracy of the Logistic Regression model.
  • The repository for this tutorial can be found here.

Data

In [1]:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from datetime import datetime
now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

tf.reset_default_graph() # for resetting the graph
In [2]:
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
In [3]:
mnist.train.images.shape
Out[3]:
(55000, 784)
In [4]:
mnist.train.labels.shape
Out[4]:
(55000, 10)
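
The images arrive as flat 784-dimensional float vectors and the labels as one-hot vectors of length 10. As an optional sanity check (a minimal sketch; matplotlib is assumed to be available and is not used elsewhere in this notebook), one image can be reshaped back to 28x28 and its digit recovered from the one-hot label:

import numpy as np
import matplotlib.pyplot as plt

sample_image = mnist.train.images[0].reshape(28, 28)  # flat 784 vector -> 28x28 grid
sample_label = np.argmax(mnist.train.labels[0])       # index of the 1 in the one-hot vector

plt.imshow(sample_image, cmap="gray")
plt.title("label: {}".format(sample_label))
plt.show()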

Initializing parameters

In [5]:
# parameters
training_epochs = 25
learning_rate = 0.01
batch_size = 100
display_step = 1
In [6]:
# network parameters
n_input = 784   # MNIST data input (img shape: 28*28 = 784)
n_layer_1 = 1   # a single layer; not used directly below, since logistic regression needs only one dense layer
n_classes = 10  # MNIST classes (digits 0-9)

Inputs for the graph

Each group of ops below is wrapped in a tf.name_scope. This does not change the computation; it simply prefixes the op names so that TensorBoard's graph view groups related nodes into a single collapsible box, which keeps larger graphs readable.

In [7]:
# train
with tf.name_scope("X"):
    X = tf.placeholder(tf.float32, [None, n_input])

# labels
with tf.name_scope("Y"):
    Y = tf.placeholder(tf.float32, [None, n_classes]) # 0-9 classes

# weights and bias
with tf.name_scope("W"):
    weights = {
        "w1": tf.Variable(tf.zeros([n_input, n_classes]))
    }
    
with tf.name_scope("b"):
    biases = {  
        "b1": tf.Variable(tf.zeros([n_classes])),
    }
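
To see what the name scopes actually do, we can print the names of the ops just created; each one is prefixed with its scope, which is how TensorBoard knows how to group them (a small sketch; the exact suffixes can vary between TensorFlow versions):

print(X.name)              # e.g. "X/Placeholder:0"
print(Y.name)              # e.g. "Y/Placeholder:0"
print(weights["w1"].name)  # e.g. "W/Variable:0"
print(biases["b1"].name)   # e.g. "b/Variable:0"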

Construct Model

The loss function is the cross-entropy (softmax) loss, averaged over the $m$ examples in a batch:

$$J(W, b) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} y_k^{(i)} \log \hat{y}_k^{(i)}, \qquad \hat{y}^{(i)} = \mathrm{softmax}\left(x^{(i)} W + b\right)$$

where $K$ is the number of classes ($K = 10$ here) and $y^{(i)}$ is the one-hot label.

In [8]:
# FIXME: can probably make this into a class
def logistic_regression(x):
    # one layer with one unit for logistic regression
    with tf.name_scope("Layer_1"):
        layer_1 = tf.matmul(x, weights["w1"]) + biases["b1"]
    return layer_1
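
Note that logistic_regression returns raw logits rather than probabilities: tf.nn.softmax_cross_entropy_with_logits below expects exactly that and applies the softmax internally for numerical stability. For intuition about what softmax does to the logits, here is a small NumPy sketch (illustrative only, not part of the graph):

import numpy as np

def softmax(z):
    z = z - np.max(z)              # shift for numerical stability
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z)   # normalize to a probability distribution

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099], sums to 1
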
In [9]:
# construct model
logits = logistic_regression(X)

# define loss and optimizer
# An alternative (numerically less stable) way to obtain the cost:
# pred = tf.nn.softmax(logits)                                     # explicit softmax
# cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(pred), axis=1))  # manual cross-entropy

with tf.name_scope("cost"):
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
with tf.name_scope("gradients"):
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    train_op = optimizer.minimize(cost)
In [10]:
# initializing the variables
init = tf.global_variables_initializer()

Declare all summaries to output

In [11]:
cost_summary = tf.summary.scalar('loss', cost)
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
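
Here we only log the loss. If an accuracy curve is also wanted in TensorBoard, the accuracy op can be defined up front and registered as a second scalar summary, then evaluated in the training loop just like cost_summary (a sketch, assuming the logits and Y defined above; note that ops should be created before the FileWriter above if they are to appear in the graph view, and the dashboard is launched with `tensorboard --logdir tf_logs`):

with tf.name_scope("accuracy"):
    correct = tf.equal(tf.argmax(logits, 1), tf.argmax(Y, 1))
    accuracy_op = tf.reduce_mean(tf.cast(correct, tf.float32))

accuracy_summary = tf.summary.scalar("accuracy", accuracy_op)
# merged = tf.summary.merge_all()  # optionally merge all summaries into one op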

Training

In [12]:
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            
            # write summaries
            if i % 10 == 0:
                summary_str = cost_summary.eval(feed_dict={X: batch_xs, Y: batch_ys})
                step = epoch * total_batch + i
                file_writer.add_summary(summary_str, step)
            
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([train_op, cost], feed_dict={X: batch_xs, Y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(Y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy:", accuracy.eval({X: mnist.test.images, Y: mnist.test.labels}))

# close filewriter
file_writer.close()
Epoch: 0001 cost= 1.183490094
Epoch: 0002 cost= 0.665206295
Epoch: 0003 cost= 0.552762811
Epoch: 0004 cost= 0.498679131
Epoch: 0005 cost= 0.465485898
Epoch: 0006 cost= 0.442519813
Epoch: 0007 cost= 0.425530678
Epoch: 0008 cost= 0.412206382
Epoch: 0009 cost= 0.401436379
Epoch: 0010 cost= 0.392426473
Epoch: 0011 cost= 0.384776898
Epoch: 0012 cost= 0.378167631
Epoch: 0013 cost= 0.372406273
Epoch: 0014 cost= 0.367344910
Epoch: 0015 cost= 0.362728000
Epoch: 0016 cost= 0.358642407
Epoch: 0017 cost= 0.354884652
Epoch: 0018 cost= 0.351449514
Epoch: 0019 cost= 0.348286129
Epoch: 0020 cost= 0.345458569
Epoch: 0021 cost= 0.342742043
Epoch: 0022 cost= 0.340288821
Epoch: 0023 cost= 0.337941553
Epoch: 0024 cost= 0.335721518
Epoch: 0025 cost= 0.333687516
Optimization Finished!
Accuracy: 0.9136
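
The trained weights live only inside the session above and are discarded when it closes. A common next step is to checkpoint them with tf.train.Saver so the model can be restored later; a minimal sketch (the path "./logreg_model.ckpt" is arbitrary) would look like this:

saver = tf.train.Saver()  # create after the variables are defined, before launching the session

# inside the training session, after optimization finishes:
# save_path = saver.save(sess, "./logreg_model.ckpt")

# later, in a fresh session, restore the trained parameters:
# with tf.Session() as sess:
#     saver.restore(sess, "./logreg_model.ckpt")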

TODO:

  • Improve modularization by creating classes for the model and so forth (a minimal class-based sketch is given below).
  • Provide detailed gradient outputs to create checkpoints.
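
Following up on the FIXME above and the first TODO item, here is a minimal sketch of how the model could be wrapped in a class (the class name and its methods are illustrative, not part of the original notebook):

class LogisticRegression:
    """A single dense layer producing class logits (minimal sketch)."""

    def __init__(self, n_input, n_classes):
        with tf.name_scope("W"):
            self.weights = tf.Variable(tf.zeros([n_input, n_classes]), name="w1")
        with tf.name_scope("b"):
            self.biases = tf.Variable(tf.zeros([n_classes]), name="b1")

    def __call__(self, x):
        # one layer, no softmax here: the loss op applies it internally
        with tf.name_scope("Layer_1"):
            return tf.matmul(x, self.weights) + self.biases

# usage:
# model = LogisticRegression(n_input, n_classes)
# logits = model(X)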