DETR Feature Map Visualization Code

Author: mobiledu2502898543 | Source: Internet | 2023-10-09 18:56

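The procedure has five steps:

1. Load the DETR model and fetch its trained parameters.
2. Download the image to be detected, preprocess it, and run a forward pass to get the predictions.
3. Capture the intermediate values the network produces while forwarding this image (key step*).
4. Compute attn_output_weights to plot the attention weights of each head (key step*).
5. Plot.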

Before going through the code, here are the important variables:

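Variable	Meaning	Shape
conv_features	Feature map of the last backbone layer	[1,2048,25,34]
enc_attn_weights	Self-attention weights of the last encoder layer	[1,850,850]
dec_attn_weights	Cross-attention weights of the last decoder layer	[1,100,850]
memory	Encoder output, i.e. the image features fed to the decoder	[850,1,256]
cq	Output of the last decoder layer's self-attention (after norm1)	[100,1,256]
pk	Positional encoding of the image features	[1,256,25,34]
pq	Learned object queries, i.e. query_embed	[100,256]
in_proj_weight	Stacked q, k and v projection weights of the last decoder layer's cross-attention (only the q and k blocks are used here)	[768,256]
in_proj_bias	Stacked q, k and v projection biases of the last decoder layer's cross-attention (only the q and k blocks are used here)	[768]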
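
The spatial sizes in the table (25 x 34 = 850) follow from the input resolution. The sketch below reproduces that arithmetic in plain Python. It assumes the original image is 640 x 480 pixels, that the shorter side is resized to 800 as in the transform used later, and that the backbone has a total stride of 32 with ceiling rounding; the helper names are hypothetical and not part of DETR.

```python
import math


def resize_shorter_side(width, height, target=800):
    """Scale (width, height) so that the shorter side equals `target`."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def feature_map_size(width, height, stride=32):
    """Spatial size (rows, cols) of a feature map with the given total stride."""
    return math.ceil(height / stride), math.ceil(width / stride)


# Assumed original image size: 640 x 480 (width x height).
w, h = resize_shorter_side(640, 480)
fh, fw = feature_map_size(w, h)
print((h, w), (fh, fw), fh * fw)
```

The product fh * fw is the sequence length seen by the transformer, which is where the 850 in the attention shapes comes from.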

The code for each step is as follows:

0. Preparation

import warnings
warnings.filterwarnings("ignore")

from PIL import Image
import requests
import matplotlib.pyplot as plt

import torch
import torchvision.transforms as T
from torch.nn.functional import linear, softmax

torch.set_grad_enabled(False)


def box_cxcywh_to_xyxy(x):
    # (center_x, center_y, width, height) -> (xmin, ymin, xmax, ymax)
    x_c, y_c, w, h = x.unbind(1)
    b = [(x_c - 0.5 * w), (y_c - 0.5 * h),
         (x_c + 0.5 * w), (y_c + 0.5 * h)]
    return torch.stack(b, dim=1)


def rescale_bboxes(out_bbox, size):
    # scale normalized [0, 1] boxes to pixel coordinates
    img_w, img_h = size
    b = box_cxcywh_to_xyxy(out_bbox)
    b = b * torch.tensor([img_w, img_h, img_w, img_h], dtype=torch.float32)
    return b


# COCO classes
CLASSES = [
    'N/A', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
    'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A',
    'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse',
    'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'N/A', 'backpack',
    'umbrella', 'N/A', 'N/A', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis',
    'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove',
    'skateboard', 'surfboard', 'tennis racket', 'bottle', 'N/A', 'wine glass',
    'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich',
    'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake',
    'chair', 'couch', 'potted plant', 'bed', 'N/A', 'dining table', 'N/A',
    'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote', 'keyboard',
    'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A',
    'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier',
    'toothbrush'
]

# colors for visualization
COLORS = [[0.000, 0.447, 0.741], [0.850, 0.325, 0.098], [0.929, 0.694, 0.125],
          [0.494, 0.184, 0.556], [0.466, 0.674, 0.188], [0.301, 0.745, 0.933]]

# standard PyTorch mean-std input image normalization
transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
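
To make the box helpers concrete, here is a minimal plain-Python sketch of the same arithmetic for a single box. The function names with the `_py` suffix and the sample box are illustrative assumptions; the listing above does the same thing on tensors.

```python
def box_cxcywh_to_xyxy_py(box):
    """Convert (center_x, center_y, width, height) to (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def rescale_box_py(box, size):
    """Scale a normalized [0, 1] box to pixel coordinates of an image of `size`."""
    img_w, img_h = size
    xmin, ymin, xmax, ymax = box_cxcywh_to_xyxy_py(box)
    return (xmin * img_w, ymin * img_h, xmax * img_w, ymax * img_h)


# Hypothetical normalized prediction on a 640 x 480 image.
pixel_box = rescale_box_py((0.5, 0.5, 0.25, 0.5), (640, 480))
print(pixel_box)
```

A box centered in the image that covers a quarter of the width and half of the height ends up symmetric around the image center in pixel coordinates.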

1. Load the DETR model and fetch its trained parameters

# 1. Load the model and fetch the trained parameters
# Load the pretrained model from torch.hub
model = torch.hub.load('facebookresearch/detr', 'detr_resnet50', pretrained=True)
model.eval()

# Fetch the trained parameters
for name, parameters in model.named_parameters():
    # learned object queries, i.e. pq: [100, 256]
    if name == 'query_embed.weight':
        pq = parameters
    # stacked q/k/v projection weight and bias of the cross-attention module
    # in the last decoder layer: [256 * 3, 256] and [768]
    if name == 'transformer.decoder.layers.5.multihead_attn.in_proj_weight':
        in_proj_weight = parameters
    if name == 'transformer.decoder.layers.5.multihead_attn.in_proj_bias':
        in_proj_bias = parameters
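
The shape [256 * 3, 256] arises because nn.MultiheadAttention stores the q, k and v projections stacked along the first dimension, in that order. The small helper below (a hypothetical name, not a PyTorch function) spells out which rows belong to which projection; step 4 uses the q and k ranges.

```python
def in_proj_slices(embed_dim=256):
    """Row ranges of the q, k and v blocks inside a stacked in_proj_weight."""
    names = ("q", "k", "v")
    return {name: (i * embed_dim, (i + 1) * embed_dim)
            for i, name in enumerate(names)}


slices = in_proj_slices(256)
print(slices)
```

The same ranges apply to in_proj_bias, which has one entry per row of in_proj_weight.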

2. Download the image, preprocess it, and run a forward pass to get the predictions

# 2. Download the image, preprocess it, and run the forward pass
# Download the image
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
im = Image.open(requests.get(url, stream=True).raw)
# img_path = '/home/wujian/000000039769.jpg'
# im = Image.open(img_path)

# mean-std normalize the input image (batch-size: 1)
img = transform(im).unsqueeze(0)

# propagate through the model
outputs = model(img)

# keep only predictions with 0.9+ confidence
# (the last logit is the "no object" class and is dropped)
probas = outputs['pred_logits'].softmax(-1)[0, :, :-1]
keep = probas.max(-1).values > 0.9

# convert boxes from [0; 1] to image scales
bboxes_scaled = rescale_bboxes(outputs['pred_boxes'][0, keep], im.size)
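
The filtering step can be illustrated without a model. The sketch below applies the same rule to made-up logits in plain Python: softmax over all classes, drop the trailing "no object" entry, then keep a query only if its best remaining class probability exceeds the threshold. The logits and the function name are assumptions for illustration.

```python
import math


def softmax_py(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def keep_confident(pred_logits, threshold=0.9):
    """One boolean per query: True if its best real-class probability > threshold."""
    keep = []
    for logits in pred_logits:
        probs = softmax_py(logits)[:-1]  # drop the last ("no object") class
        keep.append(max(probs) > threshold)
    return keep


# Three hypothetical queries, three real classes plus "no object".
pred_logits = [
    [8.0, 0.0, 0.0, 0.0],  # confident in class 0
    [0.0, 0.0, 0.0, 8.0],  # confident it is "no object"
    [2.0, 1.0, 0.0, 0.0],  # uncertain
]
keep_mask = keep_confident(pred_logits)
print(keep_mask)
```

Only the first query survives: the second is dominated by the "no object" class and the third never reaches the threshold.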

3. Capture the intermediate values the network produces while forwarding this image (key step*)

# 3. Store the values produced while forwarding this image
# use lists to store the outputs via up-values
conv_features, enc_attn_weights, dec_attn_weights = [], [], []
cq = []      # stores cq from DETR
pk = []      # stores the encoder positional encoding
memory = []  # encoder output / decoder input features

# register the hooks
hooks = [
    # feature map of the last ResNet layer
    model.backbone[-2].register_forward_hook(
        lambda self, input, output: conv_features.append(output)
    ),
    # image features produced by the encoder (memory)
    model.transformer.encoder.register_forward_hook(
        lambda self, input, output: memory.append(output)
    ),
    # self-attention weights of the last encoder layer
    model.transformer.encoder.layers[-1].self_attn.register_forward_hook(
        lambda self, input, output: enc_attn_weights.append(output[1])
    ),
    # cross-attention weights of the last decoder layer
    model.transformer.decoder.layers[-1].multihead_attn.register_forward_hook(
        lambda self, input, output: dec_attn_weights.append(output[1])
    ),
    # output of the last decoder layer's self-attention (cq)
    model.transformer.decoder.layers[-1].norm1.register_forward_hook(
        lambda self, input, output: cq.append(output)
    ),
    # positional encoding of the image feature map (pk)
    model.backbone[-1].register_forward_hook(
        lambda self, input, output: pk.append(output)
    ),
]

# propagate through the model
outputs = model(img)

# remove the hooks once they have been used
for hook in hooks:
    hook.remove()

# don't need the lists anymore
conv_features = conv_features[0]        # backbone output; conv_features['0'].tensors is [1,2048,25,34]
enc_attn_weights = enc_attn_weights[0]  # [1,850,850] : [N,L,S]
dec_attn_weights = dec_attn_weights[0]  # [1,100,850] : [N,L,S] --> [batch, tgt_len, src_len]
memory = memory[0]                      # [850,1,256], encoder output / decoder input features

cq = cq[0]                              # output of the last decoder self-attention: [100,1,256]
pk = pk[0]                              # [1,256,25,34]

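4. Compute attn_output_weights to plot the attention weights of each head (key step*)

The key steps for computing attn_output_weights are:

q = cq + pq

k = pk (positional encoding only; DETR's own cross-attention uses memory + pk as the key, so these maps isolate the positional part)

q = linear(q, in_proj_weight[0:256], in_proj_bias[0:256])

k = linear(k, in_proj_weight[256:512], in_proj_bias[256:512])

attn_output_weights = softmax(torch.bmm(q, k.transpose(1, 2))), computed after scaling q and splitting q and k into 8 heads, then reshaped to [1,8,100,850]: the attention weights of each of the 8 heads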
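
The steps above can be sketched on tiny inputs in plain Python. This is a minimal illustration rather than the DETR implementation: it assumes q and k have already been projected, uses an embedding size of 4 with 2 heads instead of 256 with 8, and all function names are hypothetical.

```python
import math


def softmax_py(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def split_heads(vec, num_heads):
    """Cut one embedding vector into `num_heads` equal chunks."""
    head_dim = len(vec) // num_heads
    return [vec[i * head_dim:(i + 1) * head_dim] for i in range(num_heads)]


def per_head_attention(queries, keys, num_heads):
    """Return weights[head][query][key] for scaled dot-product attention."""
    head_dim = len(queries[0]) // num_heads
    scale = head_dim ** -0.5
    q_heads = [split_heads(q, num_heads) for q in queries]
    k_heads = [split_heads(k, num_heads) for k in keys]
    weights = []
    for h in range(num_heads):
        rows = []
        for q in q_heads:
            scores = [scale * sum(a * b for a, b in zip(q[h], k[h]))
                      for k in k_heads]
            rows.append(softmax_py(scores))
        weights.append(rows)
    return weights


def average_heads(weights):
    """Average the per-head weights into one [query][key] matrix."""
    num_heads = len(weights)
    return [[sum(weights[h][i][j] for h in range(num_heads)) / num_heads
             for j in range(len(weights[0][0]))]
            for i in range(len(weights[0]))]


# One query and three keys with embedding size 4, split across 2 heads.
queries = [[1.0, 0.0, 0.0, 1.0]]
keys = [[1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]]
per_head = per_head_attention(queries, keys, num_heads=2)
mean_weights = average_heads(per_head)
print(per_head)
print(mean_weights)
```

Each head looks only at its own slice of the embedding, so the two heads can prefer different keys for the same query. That difference is what plotting the heads separately makes visible, and averaging over heads hides it.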

# 4. Compute attn_output_weights to plot the attention weights of each head
pk = pk.flatten(-2).permute(2, 0, 1)  # [1,256,850] --> [850,1,256]
pq = pq.unsqueeze(1).repeat(1, 1, 1)  # [100,1,256]
q = pq + cq
k = pk

# Apply the linear projections to q and k (adapted from nn.MultiheadAttention)
# q uses rows 0:256 of the stacked weight and bias
_b = in_proj_bias
_start = 0
_end = 256
_w = in_proj_weight[_start:_end, :]
if _b is not None:
    _b = _b[_start:_end]
q = linear(q, _w, _b)

# k uses rows 256:512 of the stacked weight and bias
_b = in_proj_bias
_start = 256
_end = 256 * 2
_w = in_proj_weight[_start:_end, :]
if _b is not None:
    _b = _b[_start:_end]
k = linear(k, _w, _b)

# scale by head_dim ** -0.5, as nn.MultiheadAttention does (head_dim = 256 / 8 = 32)
scaling = float(256 // 8) ** -0.5
q = q * scaling

# split into 8 heads of 32 dimensions each: [8,100,32] and [8,850,32]
q = q.contiguous().view(100, 8, 32).transpose(0, 1)
k = k.contiguous().view(-1, 8, 32).transpose(0, 1)
attn_output_weights = torch.bmm(q, k.transpose(1, 2))  # [8,100,850]

attn_output_weights = attn_output_weights.view(1, 8, 100, 850)
attn_output_weights = attn_output_weights.view(1 * 8, 100, 850)
attn_output_weights = softmax(attn_output_weights, dim=-1)
attn_output_weights = attn_output_weights.view(1, 8, 100, 850)

# kept for visualizing each head later
attn_every_heads = attn_output_weights                      # [1,8,100,850]
attn_output_weights = attn_output_weights.sum(dim=1) / 8    # [1,100,850]

5. Plot

# 5. Plot
h, w = conv_features['0'].tensors.shape[-2:]

# one column per kept detection; rows: decoder attention, image with box, 8 heads
fig, axs = plt.subplots(ncols=len(bboxes_scaled), nrows=10, figsize=(22, 28))
colors = COLORS * 100

# visualization
for idx, ax_i, (xmin, ymin, xmax, ymax) in zip(keep.nonzero(), axs.T, bboxes_scaled):
    # decoder attention weights
    ax = ax_i[0]
    ax.imshow(dec_attn_weights[0, idx].view(h, w))
    ax.axis('off')
    ax.set_title(f'query id: {idx.item()}', fontsize=30)

    # bounding box and class
    ax = ax_i[1]
    ax.imshow(im)
    ax.add_patch(plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                               fill=False, color='blue', linewidth=3))
    ax.axis('off')
    ax.set_title(CLASSES[probas[idx].argmax()], fontsize=30)

    # positional attention map of each of the 8 heads
    for head in range(2, 2 + 8):
        ax = ax_i[head]
        ax.imshow(attn_every_heads[0, head - 2, idx].view(h, w))
        ax.axis('off')
        ax.set_title(f'head:{head - 2}', fontsize=30)

fig.tight_layout()  # adjust the subplots so they fill the whole canvas
plt.show()
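
The view(h, w) calls in the plotting code undo the earlier flattening of the 25 x 34 feature grid into 850 positions. As a rough aid for reading the maps, the sketch below converts a flat position back to a grid cell and then to an approximate pixel location in the resized input. The total stride of 32 and the helper names are assumptions for illustration.

```python
def flat_to_grid(index, grid_w):
    """Map a flat position in a row-major flattened map to (row, col)."""
    return divmod(index, grid_w)


def cell_center_in_pixels(row, col, stride=32):
    """Approximate (x, y) center of a feature-map cell in the resized input."""
    return ((col + 0.5) * stride, (row + 0.5) * stride)


first = flat_to_grid(0, 34)
second_row_start = flat_to_grid(34, 34)
last = flat_to_grid(849, 34)
print(first, second_row_start, last)
print(cell_center_in_pixels(*second_row_start))
```

Position 849 lands on the bottom-right cell of the grid, which confirms that the 850 attention entries cover the 25 x 34 map exactly once.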

[Note]: The code above was collected from the web.

Visualization results:

The first row of the figure is drawn from dec_attn_weights.

Below it are the visualizations of the 8 heads, drawn from the per-head attn_output_weights.
