TensorFlow Basic Models: Logistic Regression

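Introduction to logistic regression

The logistic model

The logistic model illustrated

Loss function (cross-entropy loss)

Cross-entropy

Softmax for multi-class classification

Softmax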

Logistic regression in TensorFlow

Importing the MNIST dataset

import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./data/", one_hot=True)

Extracting ./data/train-images-idx3-ubyte.gz
Extracting ./data/train-labels-idx1-ubyte.gz
Extracting ./data/t10k-images-idx3-ubyte.gz
Extracting ./data/t10k-labels-idx1-ubyte.gz

Setting the parameters

# Parameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

Building the model

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784  # tf.placeholder(dtype, shape=None, name=None)
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes
# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
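To make the softmax step concrete, here is a minimal NumPy sketch of what `tf.nn.softmax(tf.matmul(x, W) + b)` computes. The helper name `softmax` and the random toy batch are illustrative assumptions, and subtracting the row maximum is a common stability trick rather than something taken from the code above.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Toy batch: 2 samples, 784 features, 10 classes (same shapes as the model above)
x = np.random.rand(2, 784).astype(np.float32)
W = np.zeros((784, 10), dtype=np.float32)
b = np.zeros(10, dtype=np.float32)

pred = softmax(x @ W + b)
print(pred.shape)        # (2, 10)
print(pred.sum(axis=1))  # each row sums to 1
```

Because the weights start at zero, every class initially receives the same probability of 0.1, whatever the input is.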

Defining the loss function (cross-entropy)

# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
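The cost expression can be mirrored in NumPy to see what it evaluates to. This is only a sketch: the function name `cross_entropy` is hypothetical, and the small `eps` added inside the logarithm is an assumption of this example (it guards against log(0)), not part of the original expression.

```python
import numpy as np

def cross_entropy(pred, y_onehot, eps=1e-12):
    """Mean over the batch of -sum_k y_k * log(pred_k)."""
    # eps is an assumption of this sketch, added to avoid log(0)
    per_sample = -np.sum(y_onehot * np.log(pred + eps), axis=1)
    return per_sample.mean()

# Uniform predictions over 10 classes, as produced by zero-initialized weights
pred = np.full((4, 10), 0.1)
y_onehot = np.eye(10)[[3, 1, 4, 1]]  # one-hot rows for labels 3, 1, 4, 1

loss = cross_entropy(pred, y_onehot)
print(loss)  # approximately ln(10)
```

With uniform predictions the loss is ln(10), about 2.3026, which is the value an untrained 10-class model starts from.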

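Supplement: the reduction_indices parameter [1]

TensorFlow code frequently calls functions such as tf.reduce_mean and tf.reduce_sum. These functions take a reduction_indices parameter (an older name for axis) that specifies which dimension the function reduces over:

tf.reduce_sum(x) ==> with no second argument, all elements are summed

tf.reduce_sum(x, 0) ==> with the second argument set to 0, the sum runs along the first dimension, i.e. each column is summed

tf.reduce_sum(x, 1) ==> with the second argument set to 1, the sum runs along the second dimension, i.e. each row is summed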
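The three cases can be checked quickly with NumPy, whose `axis` argument follows the same convention. The 2x3 array below is an arbitrary example chosen for this sketch.

```python
import numpy as np

x = np.array([[1, 1, 1],
              [1, 1, 1]])

total = np.sum(x)               # all elements: 6
per_column = np.sum(x, axis=0)  # collapse the first dimension: [2 2 2]
per_row = np.sum(x, axis=1)     # collapse the second dimension: [3 3]

print(total, per_column, per_row)
```

In the cost above, reducing over dimension 1 sums across the 10 classes of each sample, leaving one loss value per sample for tf.reduce_mean to average.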

Setting up the optimizer (SGD)

# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
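As a rough illustration of what one optimizer step does for this model, the sketch below performs a single gradient descent update by hand in NumPy. The function names (`softmax`, `cost`, `sgd_step`) and the random toy data are assumptions made for this example; TensorFlow derives the gradients automatically, whereas here the well-known closed form for softmax regression, (pred - y) / n with respect to the logits, is written out.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cost(W, b, x, y_onehot):
    """Mean cross-entropy of the softmax regression model."""
    pred = softmax(x @ W + b)
    return -np.mean(np.sum(y_onehot * np.log(pred), axis=1))

def sgd_step(W, b, x, y_onehot, learning_rate):
    """One gradient descent update on a single batch."""
    n = x.shape[0]
    pred = softmax(x @ W + b)
    grad_logits = (pred - y_onehot) / n  # gradient of the cost w.r.t. the logits
    grad_W = x.T @ grad_logits
    grad_b = grad_logits.sum(axis=0)
    return W - learning_rate * grad_W, b - learning_rate * grad_b

# Random stand-in data with MNIST-like shapes (not real MNIST)
rng = np.random.default_rng(0)
x = rng.random((100, 784))
y_onehot = np.eye(10)[rng.integers(0, 10, size=100)]
W, b = np.zeros((784, 10)), np.zeros(10)

before = cost(W, b, x, y_onehot)
W, b = sgd_step(W, b, x, y_onehot, learning_rate=0.01)
after = cost(W, b, x, y_onehot)
print(before, after)  # the cost should drop after the update
```

Repeating this update over successive batches is, in essence, what the training loop does each time it runs the optimizer.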

Training

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
# Start training
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print ("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print ("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy for 3000 examples
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) # cast(x, dtype, name=None) converts x to the data type dtype.
    print ("Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))

Epoch: 0001 cost= 1.183862086
Epoch: 0002 cost= 0.665185326
Epoch: 0003 cost= 0.552937392
Epoch: 0004 cost= 0.498404927
Epoch: 0005 cost= 0.465865389
Epoch: 0006 cost= 0.442440675
Epoch: 0007 cost= 0.425578112
Epoch: 0008 cost= 0.412035317
Epoch: 0009 cost= 0.401478231
Epoch: 0010 cost= 0.392347213
Epoch: 0011 cost= 0.384493829
Epoch: 0012 cost= 0.377989292
Epoch: 0013 cost= 0.372704204
Epoch: 0014 cost= 0.366971537
Epoch: 0015 cost= 0.362937522
Epoch: 0016 cost= 0.358783882
Epoch: 0017 cost= 0.355023325
Epoch: 0018 cost= 0.351152160
Epoch: 0019 cost= 0.348280402
Epoch: 0020 cost= 0.345466763
Epoch: 0021 cost= 0.342640696
Epoch: 0022 cost= 0.340194521
Epoch: 0023 cost= 0.338306610
Epoch: 0024 cost= 0.335532565
Epoch: 0025 cost= 0.333705268
Optimization Finished!
Accuracy: 0.889

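Supplement: tf.argmax [2]

Put simply, tf.argmax returns the index at which the largest value occurs.

The difference between tf.argmax(array, 1) and tf.argmax(array, 0) is shown by the following example (np.argmax follows the same axis convention):
import numpy as np
test = np.array([[1, 2, 3], [2, 3, 4], [5, 4, 3], [8, 7, 2]])
np.argmax(test, 0)   # output: array([3, 3, 1])
np.argmax(test, 1)   # output: array([2, 2, 0, 0])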

axis = 0:
Think of 0 as the widest scope: all of the inner arrays are compared with one another, but only the numbers at the same position in each array are compared:
test[0] = array([1, 2, 3])
test[1] = array([2, 3, 4])
test[2] = array([5, 4, 3])
test[3] = array([8, 7, 2])
# output : [3, 3, 1]

axis = 1:
With 1 the scope narrows: only the numbers inside each inner array are compared, so there are as many results as there are inner arrays.
test[0] = array([1, 2, 3]) #2
test[1] = array([2, 3, 4]) #2
test[2] = array([5, 4, 3]) #0
test[3] = array([8, 7, 2]) #0
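This is how the test step in the training code turns predictions into an accuracy figure: argmax along axis 1 picks the predicted class and the true class for each sample, and the mean of the matches is the accuracy. The sketch below reproduces that in NumPy with made-up probabilities and labels.

```python
import numpy as np

# Made-up predictions for 4 samples and 3 classes
pred = np.array([[0.1, 0.7, 0.2],
                 [0.8, 0.1, 0.1],
                 [0.3, 0.3, 0.4],
                 [0.2, 0.5, 0.3]])
# Made-up one-hot labels
y_onehot = np.array([[0, 1, 0],
                     [1, 0, 0],
                     [0, 1, 0],
                     [0, 1, 0]])

predicted_class = np.argmax(pred, axis=1)      # [1 0 2 1]
true_class = np.argmax(y_onehot, axis=1)       # [1 0 1 1]
correct = predicted_class == true_class        # [True True False True]
accuracy = correct.astype(np.float32).mean()   # 0.75

print(accuracy)
```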

Logistic regression with the TensorFlow Eager API

from __future__ import absolute_import, division, print_function

import tensorflow as tf
import tensorflow.contrib.eager as tfe

Enabling the Eager API

# Set Eager API
tfe.enable_eager_execution()

Importing the data

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./data/", one_hot=False)

Extracting ./data/train-images-idx3-ubyte.gz
Extracting ./data/train-labels-idx1-ubyte.gz
Extracting ./data/t10k-images-idx3-ubyte.gz
Extracting ./data/t10k-labels-idx1-ubyte.gz

Setting the variables

# Parameters
learning_rate = 0.1
batch_size = 128
num_steps = 1000
display_step = 100

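Reading data with the Dataset API [3]

The Dataset API is a module introduced in TensorFlow 1.3. It mainly serves data loading and the construction of input data pipelines.

Placeholders are not available in Eager mode, so this example reads its training data through the Dataset API.

The earlier example read data through placeholders; tf.data.Dataset.from_tensor_slices is another way. It slices the tensors passed to it along their first dimension and produces the corresponding dataset. In the example below, the (image, label) pairs from mnist.train.images and mnist.train.labels are sliced this way and then grouped into batches of batch_size.

In Eager mode, an iterator is created directly with tfe.Iterator(dataset) and then iterated. Values can be read out during iteration without calling sess.run().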

# Iterator for the dataset
dataset = tf.data.Dataset.from_tensor_slices(
    (mnist.train.images, mnist.train.labels)).batch(batch_size)
dataset_iter = tfe.Iterator(dataset)
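To show what slicing along the first dimension and batching amount to, here is a plain NumPy imitation. It is not the Dataset API: the generator name `batches` and the zero-filled stand-in data are assumptions of this sketch.

```python
import numpy as np

def batches(images, labels, batch_size):
    """Yield (images, labels) slices along the first dimension, in order."""
    for start in range(0, len(images), batch_size):
        yield images[start:start + batch_size], labels[start:start + batch_size]

images = np.zeros((300, 784), dtype=np.float32)  # placeholder data, not MNIST
labels = np.arange(300) % 10

sizes = [len(x_batch) for x_batch, _ in batches(images, labels, batch_size=128)]
print(sizes)  # [128, 128, 44]
```

The last batch is smaller because 300 is not a multiple of 128. Once the generator is exhausted it has to be recreated, which corresponds to the StopIteration handling in the training loop further down.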

Defining the model (formula + loss function + accuracy computation)

# Variables
W = tfe.Variable(tf.zeros([784, 10]), name='weights')
b = tfe.Variable(tf.zeros([10]), name='bias')

# Logistic regression (Wx + b)
def logistic_regression(inputs):
    return tf.matmul(inputs, W) + b

# Cross-Entropy loss function
def loss_fn(inference_fn, inputs, labels):
    # Using sparse_softmax cross entropy
    return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=inference_fn(inputs), labels=labels))

# Calculate accuracy
def accuracy_fn(inference_fn, inputs, labels):
    prediction = tf.nn.softmax(inference_fn(inputs))
    correct_pred = tf.equal(tf.argmax(prediction, 1), labels)
    return tf.reduce_mean(tf.cast(correct_pred, tf.float32))

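Supplement: tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None): [4]

The first argument, logits, is the output of the network's last layer. With a batch, its shape is [batch_size, num_classes]; for a single sample, its shape is [num_classes].

The second argument, labels, holds the true labels and has the same shape as logits.

The function performs two steps: it applies softmax to the logits, then computes the cross-entropy between the resulting probabilities and the labels.

The return value is a vector with one loss per sample; applying tf.reduce_mean to that vector gives the loss.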
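Note that loss_fn above calls the sparse variant, tf.nn.sparse_softmax_cross_entropy_with_logits, which takes integer class labels of shape [batch_size] instead of one-hot labels. The NumPy sketch below follows the same two steps for that variant; the function name and toy inputs are hypothetical, and the log-softmax is computed directly for numerical stability, which is a choice made for this example.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-sample loss from raw logits and integer class labels."""
    # Step 1: softmax, computed in log space for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Step 2: cross-entropy, i.e. minus the log-probability of the true class
    return -log_softmax[np.arange(len(labels)), labels]

logits = np.zeros((3, 10))      # what zero-initialized W and b produce
labels = np.array([7, 0, 3])    # integer labels, as with one_hot=False

per_sample = sparse_softmax_cross_entropy(logits, labels)
loss = per_sample.mean()
print(per_sample.shape, loss)
```

With all-zero logits the mean loss is ln(10), about 2.302585, which agrees with the "Initial loss" printed by the training run below.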

Setting up the optimizer (SGD)

# SGD Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)

# Compute gradients
grad = tfe.implicit_gradients(loss_fn)

Training

# Training
average_loss = 0.
average_acc = 0.
for step in range(num_steps):

    # Iterate through the dataset
    try:
        d = dataset_iter.next()
    except StopIteration:  # try...except: the except clause handles the exception
        # Refill queue
        dataset_iter = tfe.Iterator(dataset)
        d = dataset_iter.next()

    # Images
    x_batch = d[0]
    # Labels
    y_batch = tf.cast(d[1], dtype=tf.int64)

    # Compute the batch loss
    batch_loss = loss_fn(logistic_regression, x_batch, y_batch)
    average_loss += batch_loss
    # Compute the batch accuracy
    batch_accuracy = accuracy_fn(logistic_regression, x_batch, y_batch)
    average_acc += batch_accuracy

    if step == 0:
        # Display the initial cost, before optimizing
        print("Initial loss= {:.9f}".format(average_loss))

    # Update the variables following gradients info
    optimizer.apply_gradients(grad(logistic_regression, x_batch, y_batch))

    # Display info
    if (step + 1) % display_step == 0 or step == 0:
        if step > 0:
            average_loss /= display_step
            average_acc /= display_step
        print("Step:", '%04d' % (step + 1), " loss=",
              "{:.9f}".format(average_loss), " accuracy=",
              "{:.4f}".format(average_acc))
        average_loss = 0.
        average_acc = 0.

Initial loss= 2.302585363
Step: 0001  loss= 2.302585363  accuracy= 0.1172
Step: 0100  loss= 0.952338576  accuracy= 0.7955
Step: 0200  loss= 0.535867393  accuracy= 0.8712
Step: 0300  loss= 0.485415280  accuracy= 0.8757
Step: 0400  loss= 0.433947176  accuracy= 0.8843
Step: 0500  loss= 0.381990731  accuracy= 0.8971
Step: 0600  loss= 0.394154936  accuracy= 0.8947
Step: 0700  loss= 0.391497582  accuracy= 0.8905
Step: 0800  loss= 0.386373132  accuracy= 0.8945
Step: 0900  loss= 0.332039326  accuracy= 0.9096
Step: 1000  loss= 0.358993709  accuracy= 0.9002

Testing

# Evaluate model on the test image set
testX = mnist.test.images
testY = mnist.test.labels

test_acc = accuracy_fn(logistic_regression, testX, testY)
print("Testset Accuracy: {:.4f}".format(test_acc))

Testset Accuracy: 0.9083

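References

[1] Understanding reduction_indices in TensorFlow

[2] tf.argmax() and its axis argument explained

[3] TensorFlow's new way of reading data: an introductory tutorial to the Dataset API

[4] [TensorFlow] How to use tf.nn.softmax_cross_entropy_with_logits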
