
Softmax Regression using TensorFlow

Author: 温思家羽绒家纺旗舰店 | Source: Internet | 2023-10-12 15:51

Original article: https://www.geeksforgeeks.org/softmax-regression-using-tensorflow/

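This article discusses the basics of Softmax regression and its implementation in Python using the TensorFlow library.

What is Softmax Regression?

Softmax regression (or multinomial logistic regression) is a generalization of logistic regression to the case where we want to handle multiple classes.

A gentle introduction to logistic regression can be found here:
Understanding Logistic Regression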

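In binary logistic regression, we assume that the labels are binary, i.e. for the $i^{th}$ observation,
$y_{i} \in \begin{Bmatrix} 0, 1 \end{Bmatrix}$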

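But consider a scenario where we need to classify an observation into one of two or more class labels, for example digit classification. Here, the possible labels are:
$y_{i} \in \begin{Bmatrix} 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 \end{Bmatrix}$

In such cases, we can use Softmax regression.

Let us first define our model:

Let the dataset have 'm' features and 'n' observations. Also, there are 'k' class labels, i.e. every observation can be classified as one of 'k' possible target values. For example, if we have a dataset of 100 handwritten digit images, each a vector of size 28×28, for digit classification, then n = 100, m = 28×28 = 784 and k = 10.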

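Feature matrix The feature matrix, $X$, is represented as:
$X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1m} \\ 1 & x_{21} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nm} \end{pmatrix}$
Here, $x_{ij}$ denotes the value of the $j^{th}$ feature for the $i^{th}$ observation, and the leading column of ones carries the bias term. The matrix has dimensions $n \times (m+1)$.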

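Weight matrix We define the weight matrix, $W$, as:
$W = \begin{pmatrix} w_{01} & w_{02} & \cdots & w_{0k} \\ w_{11} & w_{12} & \cdots & w_{1k} \\ \vdots & \vdots & \ddots & \vdots \\ w_{m1} & w_{m2} & \cdots & w_{mk} \end{pmatrix}$
Here, $w_{ij}$ denotes the weight assigned to the $i^{th}$ feature for the $j^{th}$ class label. The matrix has dimensions $(m+1) \times k$. Initially, the weight matrix is filled with values drawn from some normal distribution.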

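Logit score matrix Then, we define our net input matrix (also called the logit score matrix), $Z$, as:
$Z = XW$
The matrix has dimensions $n \times k$.
Currently, we are taking an extra column in the feature matrix, $X$, and an extra row in the weight matrix, $W$. These extra columns and rows correspond to the bias terms associated with each prediction. This can be simplified by defining an extra bias matrix, $b$, of size $n \times k$ where $b_{ij} = w_{0j}$. (In practice, all we need is a vector of size $k$ and some broadcasting technique for the bias terms!)
So, the final score matrix, $Z$, is:
$Z = XW + b$
where the matrix $X$ now has dimensions $n \times m$ and $W$ has dimensions $m \times k$. But the matrix $Z$ still has the same values and dimensions!
But what does the matrix $Z$ signify? Actually, $Z_{ij}$ is the likelihood of label j for the $i^{th}$ observation. It is not a proper probability value, but it can be considered as a score given to each class label for each observation!
Let us define $Z_i$ as the logit score vector for the $i^{th}$ observation.
For example, let the vector $Z_5$ represent the score for each of the class labels in the handwritten digit classification problem for the $5^{th}$ observation. Here, the maximum score is 5.2, which corresponds to class label '3'. Hence, our model currently predicts the $5^{th}$ observation/image as '3'.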
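The two ways of writing the score matrix described above can be checked numerically. The following is a minimal NumPy sketch, not part of the original article: the sizes and the random values are made up for illustration, and the variable names simply mirror the symbols $X$, $W$, $b$ and $Z$.

```py
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 4, 3, 2                      # hypothetical small sizes for illustration

X = rng.normal(size=(n, m))            # feature matrix, n x m
W = rng.normal(size=(m, k))            # weight matrix, m x k
b = rng.normal(size=k)                 # one bias per class, a vector of length k

# Z = XW + b: NumPy broadcasts the length-k vector b across all n rows
Z = X @ W + b

# Equivalent augmented form: a column of ones in X and the bias row on top of W
X_aug = np.hstack([np.ones((n, 1)), X])    # n x (m+1)
W_aug = np.vstack([b, W])                  # (m+1) x k
Z_aug = X_aug @ W_aug

# The predicted label of each observation is the class with the highest score
predicted = np.argmax(Z, axis=1)
```

Both forms give the same $n \times k$ matrix, which is why a single bias vector is enough in practice.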

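Softmax layer It is hard to train the model directly on the score values: they are unnormalized, and picking the highest score is not a differentiable operation, which the gradient descent algorithm needs in order to minimize the cost function. So, we need some function which normalizes the logit scores and is also easily differentiable! In order to convert the score matrix $Z$ to probabilities, we use the softmax function.
For a vector $y$, the softmax function $S(y)$ is defined as:
$S(y_i) = \frac{e^{y_i}}{\sum_{j} e^{y_j}}$
So, the softmax function will do 2 things:
1. convert all scores to probabilities.
2. make the sum of all probabilities equal to 1.
Recall that in the binary logistic classifier, we used the sigmoid function for the same task. The softmax function is nothing but a generalization of the sigmoid function! Now, this softmax function computes the probability that the $i^{th}$ training sample belongs to class $j$, given the logits vector $Z_i$, as:
$P(y_i = j \mid Z_i) = \frac{e^{Z_{ij}}}{\sum_{p=1}^{k} e^{Z_{ip}}}$

In vector form, we can simply write:
$P(y_i \mid Z_i) = softmax(Z_i)$

For simplicity, let $S_i$ denote the softmax probability vector for the $i^{th}$ observation.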
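As an illustration of the definition above, here is a small NumPy sketch of a row-wise softmax. It is not taken from the original article: the scores are made up, and subtracting the row maximum before exponentiating is a common numerical safeguard added here as an assumption (it does not change the result). The last lines check the claim that softmax generalizes the sigmoid: with two classes and scores (z, 0), the first probability equals sigmoid(z).

```py
import numpy as np

def softmax(Z):
    """Row-wise softmax of an n x k score matrix."""
    # Subtracting the row maximum leaves the result unchanged but avoids overflow in exp
    shifted = Z - np.max(Z, axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

# Made-up scores; the second row would overflow a naive exp()
Z = np.array([[1.0, 2.0, 3.0],
              [1000.0, 1000.0, 1000.0]])
S = softmax(Z)                       # every row is positive and sums to 1

# Two-class case: softmax of (z, 0) reduces to the sigmoid of z
z = 0.8
two_class = softmax(np.array([[z, 0.0]]))[0, 0]
sigmoid_z = 1.0 / (1.0 + np.exp(-z))
```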

One-hot encoded target matrix Since the softmax function provides us with a vector of probabilities of each class label for a given observation, we need to convert the target vector into the same format to calculate the cost function! Corresponding to each observation, there is a target vector (instead of a target value!) composed of only zeros and ones, where only the correct label is set to 1. This technique is called one-hot encoding. For instance, with the ten digit classes, the label 3 becomes the vector [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].
Now, we denote the one-hot encoded vector for the $i^{th}$ observation as $T_i$.
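One-hot encoding is simple to sketch in NumPy. The helper name one_hot and the example labels below are hypothetical and only serve as an illustration; the MNIST loader used later in the article already returns labels in this form.

```py
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into an n x num_classes matrix of zeros and ones."""
    T = np.zeros((len(labels), num_classes))
    # For row i, set the column given by labels[i] to 1
    T[np.arange(len(labels)), labels] = 1.0
    return T

labels = np.array([7, 3, 0])     # made-up digit labels
T = one_hot(labels, 10)          # one row per observation, one column per class
```

Taking the argmax of each row recovers the original integer label.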

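Cost function Now, we need to define a cost function that compares the softmax probabilities with the one-hot encoded target vectors. We use the concept of cross-entropy for this. Cross-entropy is a distance calculation function which takes the probabilities computed by the softmax function and the one-hot encoded matrix and calculates the distance between them. The distance is small when the model assigns a high probability to the correct class and large when it does not. We define the cross-entropy, $D(S_i, T_i)$, for the $i^{th}$ observation with softmax probability vector $S_i$ and one-hot target vector $T_i$ as:
$D(S_i, T_i) = -\sum_{j=1}^{k} T_{ij} \log S_{ij}$

And now, the cost function, $J$, can be defined as the average cross-entropy, i.e.:
$J(W, b) = \frac{1}{n} \sum_{i=1}^{n} D(S_i, T_i)$

and the task is to minimize this cost function!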
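To make the behaviour of the cost concrete, the sketch below computes the average cross-entropy for two made-up sets of predictions against the same one-hot targets. The function name, the numbers and the small constant eps (a guard against log(0)) are assumptions for illustration and do not come from the original article.

```py
import numpy as np

def cross_entropy_cost(S, T, eps=1e-12):
    """Average cross-entropy between probability rows S and one-hot rows T."""
    # eps is an assumed safeguard so that log(0) never occurs
    per_observation = -np.sum(T * np.log(S + eps), axis=1)
    return per_observation.mean()

T = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0]])          # correct classes: 1 and 0

good = np.array([[0.1, 0.8, 0.1],
                 [0.7, 0.2, 0.1]])       # high probability on the correct classes
bad = np.array([[0.8, 0.1, 0.1],
                [0.1, 0.2, 0.7]])        # high probability on wrong classes

cost_good = cross_entropy_cost(good, T)
cost_bad = cross_entropy_cost(bad, T)
```

Because each target row has a single 1, only the probability assigned to the correct class contributes to the cost, so cost_good comes out lower than cost_bad.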

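Gradient descent algorithm In order to learn our softmax model via gradient descent, we need to compute the derivatives $\frac{\partial J}{\partial w_j}$ and $\frac{\partial J}{\partial b_j}$, which we then use to update the weights and biases in the opposite direction of the gradient: $w_j := w_j - \alpha \frac{\partial J}{\partial w_j}$ and $b_j := b_j - \alpha \frac{\partial J}{\partial b_j}$ for each class $j$, where $j \in \{1, 2, .., k\}$ and $\alpha$ is the learning rate. Using this cost gradient, we iteratively update the weight matrix until we reach a specified number of epochs (passes over the training set) or reach the desired cost threshold.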
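The update rule can be sketched in plain NumPy. This is an illustration under stated assumptions rather than the article's implementation (which delegates the derivatives to TensorFlow): it uses the standard closed-form gradients of the average cross-entropy, $\frac{\partial J}{\partial W} = \frac{1}{n} X^{T}(S - T)$ and $\frac{\partial J}{\partial b} = \frac{1}{n} \sum_i (S_i - T_i)$, and the toy data, function names and learning rate are made up.

```py
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)       # numerical safeguard
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def cost(X, T, W, b):
    S = softmax(X @ W + b)
    return -np.mean(np.sum(T * np.log(S + 1e-12), axis=1))

def gradient_step(X, T, W, b, alpha):
    """One full-batch gradient descent update of W and b."""
    n = X.shape[0]
    S = softmax(X @ W + b)
    grad_W = X.T @ (S - T) / n                 # same shape as W (m x k)
    grad_b = (S - T).mean(axis=0)              # one entry per class
    return W - alpha * grad_W, b - alpha * grad_b

# Toy problem: 60 observations, 4 features, 3 classes (all made up)
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = np.argmax(X[:, :3], axis=1)                # label = index of the largest of 3 features
T = np.eye(3)[y]                               # one-hot targets

W, b = np.zeros((4, 3)), np.zeros(3)
initial_cost = cost(X, T, W, b)                # log(3) for uniform predictions
for _ in range(200):
    W, b = gradient_step(X, T, W, b, alpha=0.5)
final_cost = cost(X, T, W, b)
```

With all-zero parameters every class gets probability 1/3, so the starting cost is log(3); repeated updates move the parameters against the gradient and the cost falls.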

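Implementation

Let us now implement Softmax regression on the MNIST handwritten digit dataset using the TensorFlow library.

For a gentle introduction to TensorFlow, follow this tutorial:
Introduction to TensorFlow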

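Step 1: Import the dependencies

First of all, we import the dependencies. Note that the code in this article is written against the TensorFlow 1.x API (tf.placeholder, tf.Session, tensorflow.examples.tutorials), which is not available under these names in TensorFlow 2.x.

```py
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
```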

Step 2: Download the data

TensorFlow allows you to download and read in the MNIST data automatically. Consider the code given below. It will download and save the data to the folder MNIST_data in your current project directory and load it into the current program.

```py
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
```

```
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
```

Step 3: Understand the data

Now, we try to understand the structure of the dataset.

The MNIST data is split into three parts: 55,000 data points of training data (mnist.train), 10,000 data points of test data (mnist.test) and 5,000 data points of validation data (mnist.validation).

Each image is 28 pixels by 28 pixels and has been flattened into a one-dimensional array of 784 numbers. The number of class labels is 10. Each target label is already provided in one-hot encoded form.

```py
print("Shape of feature matrix:", mnist.train.images.shape)
print("Shape of target matrix:", mnist.train.labels.shape)
print("One-hot encoding for 1st observation:\n", mnist.train.labels[0])

# visualize data by plotting images
fig, ax = plt.subplots(10, 10)
k = 0
for i in range(10):
    for j in range(10):
        ax[i][j].imshow(mnist.train.images[k].reshape(28, 28), aspect='auto')
        k += 1
plt.show()
```

Output:

```
Shape of feature matrix: (55000, 784)
Shape of target matrix: (55000, 10)
One-hot encoding for 1st observation:
 [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
```
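The flattening mentioned above is a plain row-major reshape, which is why the plotting code can recover each picture with reshape(28, 28). A tiny sketch with a stand-in array (not real MNIST data, so it runs without downloading anything):

```py
import numpy as np

# Stand-in for one flattened image: 784 values numbered 0..783
flat_image = np.arange(784, dtype=np.float32)

# Row-major reshape: the first 28 values become row 0, the next 28 become row 1, ...
grid = flat_image.reshape(28, 28)

# Flattening the grid again gives back the original vector
flat_again = grid.reshape(-1)
```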

Step 4: Define the computation graph

Now, we create a computation graph.

```py
# number of features
num_features = 784
# number of target labels
num_labels = 10
# learning rate (alpha)
learning_rate = 0.05
# batch size
batch_size = 128
# number of training steps (one minibatch per step)
num_steps = 5001

# input data
train_dataset = mnist.train.images
train_labels = mnist.train.labels
test_dataset = mnist.test.images
test_labels = mnist.test.labels
valid_dataset = mnist.validation.images
valid_labels = mnist.validation.labels

# initialize a tensorflow graph
graph = tf.Graph()

with graph.as_default():
    """
    defining all the nodes
    """

    # Inputs
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, num_features))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)

    # Variables.
    weights = tf.Variable(tf.truncated_normal([num_features, num_labels]))
    biases = tf.Variable(tf.zeros([num_labels]))

    # Training computation.
    logits = tf.matmul(tf_train_dataset, weights) + biases
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        labels=tf_train_labels, logits=logits))

    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    valid_prediction = tf.nn.softmax(tf.matmul(tf_valid_dataset, weights) + biases)
    test_prediction = tf.nn.softmax(tf.matmul(tf_test_dataset, weights) + biases)
```

Some important points to note:

For the training data, we use a placeholder that will be fed at run time with a training minibatch. The technique of using minibatches for training a model with gradient descent is termed stochastic gradient descent.
In both gradient descent (GD) and stochastic gradient descent (SGD), you update a set of parameters iteratively to minimize an error function. In GD, you have to run through all the samples in the training set to make a single parameter update in a particular iteration, while in SGD you use only one training sample, or a subset of the training samples, for each update. If you use a subset, it is called minibatch stochastic gradient descent. Thus, if the number of training samples is very large, GD may take too long, because every single parameter update runs through the complete training set. SGD, on the other hand, is faster per update because it uses only one training sample (or a small batch), and it starts improving from the very first sample. SGD often converges much faster than GD, but the error function is not minimized as well as with GD. In most cases, the close approximation of the parameter values that SGD gives is enough, because the parameters reach the neighbourhood of the optimal values and keep oscillating there.

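The weight matrix is initialized using random values following a (truncated) normal distribution. This is achieved using the tf.truncated_normal method. The biases are initialized to zero using the tf.zeros method.

Now, we multiply the inputs with the weight matrix and add the biases. We compute the softmax and the cross-entropy using tf.nn.softmax_cross_entropy_with_logits (it is a single operation in TensorFlow, because it is very common and it can be optimized). We take the average of this cross-entropy across all training examples using the tf.reduce_mean method.

We are going to minimize the loss using gradient descent. For this, we use tf.train.GradientDescentOptimizer.

train_prediction, valid_prediction and test_prediction are not part of training; they are here only so that we can report accuracy figures as we train.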
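One reason a combined softmax and cross-entropy operation is worthwhile is numerical stability: the loss can be computed directly from the logits with the log-sum-exp trick, without ever forming a probability that underflows to zero. The NumPy sketch below illustrates that idea only. It is not TensorFlow's actual implementation, and the function name and the example logits are made up.

```py
import numpy as np

def softmax_cross_entropy_from_logits(Z, T):
    """Per-observation cross-entropy computed directly from the logits."""
    shifted = Z - Z.max(axis=1, keepdims=True)
    # log softmax = shifted - log(sum(exp(shifted))), with no division and no log(0)
    log_softmax = shifted - np.log(np.sum(np.exp(shifted), axis=1, keepdims=True))
    return -np.sum(T * log_softmax, axis=1)

Z = np.array([[2.0, 1.0, 0.1],
              [1000.0, 0.0, -1000.0]])     # second row has extreme, made-up logits
T = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

losses = softmax_cross_entropy_from_logits(Z, T)
```

For the second row, the softmax probability of the correct class underflows to exactly 0 in floating point, so taking its logarithm afterwards would give infinity, whereas the combined computation returns a finite loss.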

Step 5: Run the computation graph

Since we have already built the computation graph, now it's time to run it.

```py
# utility function to calculate accuracy
def accuracy(predictions, labels):
    correctly_predicted = np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
    accu = (100.0 * correctly_predicted) / predictions.shape[0]
    return accu

with tf.Session(graph=graph) as session:
    # initialize weights and biases
    tf.global_variables_initializer().run()
    print("Initialized")

    for step in range(num_steps):
        # pick a randomized offset
        offset = np.random.randint(0, train_labels.shape[0] - batch_size - 1)

        # Generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]

        # Prepare the feed dict
        feed_dict = {tf_train_dataset : batch_data,
                     tf_train_labels : batch_labels}

        # run one step of computation
        _, l, predictions = session.run([optimizer, loss, train_prediction],
                                        feed_dict=feed_dict)

        if (step % 500 == 0):
            print("Minibatch loss at step {0}: {1}".format(step, l))
            print("Minibatch accuracy: {:.1f}%".format(
                accuracy(predictions, batch_labels)))
            print("Validation accuracy: {:.1f}%".format(
                accuracy(valid_prediction.eval(), valid_labels)))

    print("\nTest accuracy: {:.1f}%".format(
        accuracy(test_prediction.eval(), test_labels)))
```

Output:

```
Initialized
Minibatch loss at step 0: 11.68728256225586
Minibatch accuracy: 10.2%
Validation accuracy: 14.3%
Minibatch loss at step 500: 2.239773750305176
Minibatch accuracy: 46.9%
Validation accuracy: 67.6%
Minibatch loss at step 1000: 1.0917563438415527
Minibatch accuracy: 78.1%
Validation accuracy: 75.0%
Minibatch loss at step 1500: 0.6598564386367798
Minibatch accuracy: 78.9%
Validation accuracy: 78.6%
Minibatch loss at step 2000: 0.24766433238983154
Minibatch accuracy: 91.4%
Validation accuracy: 81.0%
Minibatch loss at step 2500: 0.6181786060333252
Minibatch accuracy: 84.4%
Validation accuracy: 82.5%
Minibatch loss at step 3000: 0.9605385065078735
Minibatch accuracy: 85.2%
Validation accuracy: 83.9%
Minibatch loss at step 3500: 0.6315320730209351
Minibatch accuracy: 85.2%
Validation accuracy: 84.4%
Minibatch loss at step 4000: 0.812285840511322
Minibatch accuracy: 82.8%
Validation accuracy: 85.0%
Minibatch loss at step 4500: 0.5949224233627319
Minibatch accuracy: 80.5%
Validation accuracy: 85.6%
Minibatch loss at step 5000: 0.47554320096969604
Minibatch accuracy: 89.1%
Validation accuracy: 86.2%

Test accuracy: 86.5%
```

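Some important points to note:

In every iteration, a minibatch is selected by choosing a random offset value using the np.random.randint method.

To feed the placeholders tf_train_dataset and tf_train_labels, we create a feed_dict like this: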
```py
feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
```

A shortcut way of performing one step of computation is:
```py
_, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
```
This single call runs the optimization step and returns the values of the loss and the predictions for the minibatch.
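The minibatch selection described in these notes can be tried in isolation. The sketch below uses a stand-in array instead of MNIST; the sizes are made up, and the second scheme (sampling distinct rows from anywhere in the set) is a commonly used alternative shown for comparison, not something the original article does.

```py
import numpy as np

rng = np.random.default_rng(0)
num_examples, batch_size = 1000, 128        # made-up sizes
data = np.arange(num_examples)              # stand-in for the rows of the training set

# Scheme used in the article: a random offset, then a contiguous slice of rows
offset = rng.integers(0, num_examples - batch_size - 1)
contiguous_batch = data[offset:offset + batch_size]

# Alternative (an assumption, not from the article): distinct rows drawn at random
indices = rng.choice(num_examples, size=batch_size, replace=False)
sampled_batch = data[indices]
```

The contiguous scheme always returns neighbouring rows, so it relies on the training set itself being stored in a shuffled order.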

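This brings us to the end of the implementation. The complete code is linked from the original article.

Finally, here are a few points to ponder:

You can try tweaking parameters such as the learning rate, the batch size and the number of epochs to achieve better results. You can also try a different optimizer.

The accuracy of the above model can be improved by using a neural network with one or more hidden layers. We will discuss its implementation using TensorFlow in some upcoming articles.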

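Softmax Regression vs. k Binary Classifiers One should be aware of the scenarios where softmax regression works and where it doesn't. In many cases, you may need to use k different binary logistic classifiers, one for each of the k possible values of the class label.
Suppose you are working on a computer vision problem where you are trying to classify images into three different classes:
Case 1: Suppose that your classes are indoor_scene, outdoor_urban_scene and outdoor_wilderness_scene.
Case 2: Suppose that your classes are indoor_scene, black_and_white_image and image_has_people.
In which case would you use a softmax regression classifier, and in which case would you use 3 binary logistic regression classifiers?
This will depend on whether the 3 classes are mutually exclusive.
In case 1, a scene can be either indoor_scene, outdoor_urban_scene or outdoor_wilderness_scene. So, assuming that each training example is labeled with exactly one of the 3 classes, we should build a softmax classifier with k = 3.
However, in case 2, the classes are not mutually exclusive, since a scene can be both indoor and have people in it. So, in this case, it would be more appropriate to build 3 binary logistic regression classifiers. This way, for each new scene, your algorithm can separately decide whether it falls into each of the 3 categories.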
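The difference between the two setups shows up directly in the outputs. In the sketch below (an illustration with made-up scores, not part of the original article), softmax forces the three probabilities to compete and sum to 1, while three independent sigmoids can each say "yes" at the same time.

```py
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))       # shift by the maximum for numerical safety
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up scores for three classes of one image
scores = np.array([2.0, 1.5, -1.0])

exclusive = softmax(scores)         # case 1: exactly one class, probabilities sum to 1
independent = sigmoid(scores)       # case 2: one yes/no probability per class
```

With these scores, two of the three sigmoid outputs exceed 0.5, so the image would be assigned two labels at once, which a softmax classifier cannot express.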


This article is contributed by Nikhil Kumar.
