
[Hands-on] Recognizing Emoji Gestures with a YOLOv5 Model!

Author: 周周微商互联 | Source: Internet | 2023-07-07 11:13

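From DataWhale. Author: Yan Yongqiang (闫永强), algorithm engineer. This article trains YOLOv5 to recognize hand gestures and displays the emoji that matches each recognized gesture.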


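The overall approach is outlined below. Note: this article includes the complete working code, which is long. Read the prose walkthrough first and bookmark the code for later.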

I. Training YOLOv5 on the dataset

1. Install the environment dependencies

This tutorial uses YOLOv5 v3.1.

Download the source with git clone, then install the dependencies with pip install -r requirements.txt (the official requirements are python>=3.8 and torch>=1.6).

My environment: Ubuntu 16.04; CUDA 10.2; cuDNN 7.6.5; torch 1.6.0; Python 3.8
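Before installing, it can help to confirm that the interpreter and torch meet the stated minimums (python>=3.8, torch>=1.6). The following is a minimal sketch of such a check. The helper names parse_version and meets_minimum are made up for illustration, and torch is only inspected if it happens to be importable.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.6.0+cu102' into a tuple of ints."""
    parts = []
    for piece in text.split("+")[0].split("."):   # drop build tags like '+cu102'
        match = re.match(r"\d+", piece)           # keep only the leading digits
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(found, minimum):
    """True when the found version is at least the minimum version."""
    return parse_version(found) >= parse_version(minimum)


python_version = "%d.%d.%d" % sys.version_info[:3]
status = "ok" if meets_minimum(python_version, "3.8") else "too old"
print("python", python_version, status)

# Only import torch if it is installed, so the check itself never crashes.
if importlib.util.find_spec("torch") is not None:
    import torch
    status = "ok" if meets_minimum(torch.__version__, "1.6") else "too old"
    print("torch", torch.__version__, status)
else:
    print("torch is not installed")
```

Comparing tuples of integers avoids the classic string-comparison trap where "1.10" sorts before "1.6".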

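2. Prepare the gesture recognition dataset

The gesture dataset has been uploaded to the open dataset platform Graviti, together with the complete code.

Gesture dataset: https://gas.graviti.cn/dataset/datawhale/HandPose?utm_medium=0831datawhale

Note: the code is in the discussion area of the dataset page.

2.1 Collecting and annotating the dataset

Code for collecting gesture images: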

import os

import cv2


def main():
    total_pics = 1000
    # cv2.imwrite does not create folders, so make sure the output folder exists
    os.makedirs("hand_images", exist_ok=True)
    cap = cv2.VideoCapture(0)
    pic_no = 0
    while pic_no < total_pics:
        ret, frame = cap.read()
        if not ret:  # camera unavailable or stream ended
            break
        frame = cv2.flip(frame, 1)  # mirror the frame horizontally
        cv2.imwrite("hand_images/" + str(pic_no) + ".jpg", frame)
        cv2.imshow("Capturing gesture", frame)
        cv2.waitKey(10)
        pic_no += 1
    cap.release()
    cv2.destroyAllWindows()


main()

Create a VOC2012 folder under the yolov5 directory (the name is up to you). It follows the VOC dataset layout:

VOC2012
../Annotations       # xml annotation files, one per image
../images            # the images
../ImageSets/Main    # train.txt, test.txt, val.txt and trainval.txt, listing the image names (without file extension) of the training, test, validation, and training+validation sets
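The layout above can also be created from a script. This is a small sketch, not part of the original code: the helper name make_voc_layout is hypothetical, and the demo writes into a temporary directory that you would replace with your own yolov5 folder.

```python
import os
import tempfile


def make_voc_layout(root):
    """Create the Annotations, images and ImageSets/Main folders under root."""
    subdirs = ["Annotations", "images", os.path.join("ImageSets", "Main")]
    created = []
    for sub in subdirs:
        path = os.path.join(root, sub)
        os.makedirs(path, exist_ok=True)  # also creates missing parent folders
        created.append(path)
    return created


# Demo in a throwaway directory; point this at your yolov5 folder instead.
with tempfile.TemporaryDirectory() as tmp:
    for path in make_voc_layout(os.path.join(tmp, "VOC2012")):
        print(os.path.relpath(path, tmp))
```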

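Example:

Contents of the VOC2012 folder:

The Annotations folder holds the xml files (annotated with labelimg):

images corresponds to JPEGImages in the VOC dataset format:

The Main subfolder of ImageSets holds the txt files that split the data into training, test, and validation sets. The following script generates the split: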

# coding: utf-8
import os
import random
import argparse

parser = argparse.ArgumentParser()
# Location of the xml files; adjust to your data (usually the Annotations folder)
parser.add_argument('--xml_path', default='C:\\Users\\Lenovo\\Desktop\\hand_datasets\\VOC2012\\Annotations\\', type=str, help='input xml label path')
# Where to write the split files; use ImageSets/Main under your own dataset
parser.add_argument('--txt_path', default='C:\\Users\\Lenovo\\Desktop\\hand_datasets\\VOC2012\\ImageSets\\Main\\', type=str, help='output txt label path')
opt = parser.parse_args()

# With trainval_percent = 1.0 every image goes to trainval, so test.txt stays empty
trainval_percent = 1.0
train_percent = 0.99
xmlfilepath = opt.xml_path
txtsavepath = opt.txt_path
total_xml = os.listdir(xmlfilepath)
if not os.path.exists(txtsavepath):
    os.makedirs(txtsavepath)

num = len(total_xml)
list_index = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = set(random.sample(list_index, tv))
train = set(random.sample(sorted(trainval), tr))

file_trainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
file_test = open(os.path.join(txtsavepath, 'test.txt'), 'w')
file_train = open(os.path.join(txtsavepath, 'train.txt'), 'w')
file_val = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in list_index:
    name = total_xml[i][:-4] + '\n'  # file name without the ".xml" extension
    if i in trainval:
        file_trainval.write(name)
        if i in train:
            file_train.write(name)
        else:
            file_val.write(name)
    else:
        file_test.write(name)

file_trainval.close()
file_train.close()
file_val.close()
file_test.close()

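Running the script creates the following txt files in the Main folder:

2.2 Generating labels in YOLO training format

Convert the xml annotations to YOLO's txt format. Each image has one txt file, and each line in it describes one object as class, x_center, y_center, width, height, with the four box values normalized by the image width and height.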
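As a worked example of that label format, the sketch below converts one pixel-space box to a YOLO label line and back. The function names and the sample box are made up for illustration. It uses the plain normalization only; the voc_label.py script in this article additionally subtracts one pixel from the box center, which this sketch leaves out.

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Pixel corners -> normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_to_voc(img_w, img_h, x_center, y_center, width, height):
    """Normalized YOLO box -> pixel corners (xmin, ymin, xmax, ymax)."""
    xmin = (x_center - width / 2.0) * img_w
    ymin = (y_center - height / 2.0) * img_h
    xmax = (x_center + width / 2.0) * img_w
    ymax = (y_center + height / 2.0) * img_h
    return xmin, ymin, xmax, ymax


# A 640x480 image with a box from (160, 120) to (480, 360).
# Class id 4 is "ok_hand" in this article's class list.
label = voc_to_yolo(640, 480, 160, 120, 480, 360)
print("4 " + " ".join("%.6f" % value for value in label))
# prints: 4 0.500000 0.500000 0.500000 0.500000
```

Because all four values are fractions of the image size, the same label stays valid when the image is resized for training.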

Create voc_label.py to generate the txt labels for the training, validation, and test sets:

# -*- coding: utf-8 -*-
import os
import xml.etree.ElementTree as ET

sets = ['train', 'val', 'test']
classes = ["four_fingers", "hand_with_fingers_splayed", "index_pointing_up", "little_finger", "ok_hand", "raised_fist", "raised_hand", "sign_of_the_horns", "three", "thumbup", "victory_hand"]  # 11 classes; change to your own
VOC_ROOT = '/home/yanyq/Ryan/yolov5/VOC2012'  # dataset root; change to your own
abs_path = os.getcwd()
print(abs_path)


def convert(size, box):
    # size = (image width, image height); box = (xmin, xmax, ymin, ymax) in pixels
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0 - 1  # box center, with the script's 1-pixel offset
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    # normalize by the image size
    return x * dw, y * dh, w * dw, h * dh


def convert_annotation(image_id):
    in_path = VOC_ROOT + '/Annotations/%s.xml' % image_id
    out_path = VOC_ROOT + '/labels/%s.txt' % image_id
    with open(in_path, encoding='UTF-8') as in_file, open(out_path, 'w') as out_file:
        tree = ET.parse(in_file)
        root = tree.getroot()
        size = root.find('size')
        w = int(size.find('width').text)
        h = int(size.find('height').text)
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b1 = float(xmlbox.find('xmin').text)
            b2 = float(xmlbox.find('xmax').text)
            b3 = float(xmlbox.find('ymin').text)
            b4 = float(xmlbox.find('ymax').text)
            # clamp annotations that run past the image border
            if b2 > w:
                b2 = w
            if b4 > h:
                b4 = h
            bb = convert((w, h), (b1, b2, b3, b4))
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


os.makedirs(VOC_ROOT + '/labels/', exist_ok=True)
for image_set in sets:
    with open(VOC_ROOT + '/ImageSets/Main/%s.txt' % image_set) as f:
        image_ids = f.read().strip().split()
    # one list file per split, holding the absolute path of every image in it
    with open('%s.txt' % image_set, 'w') as list_file:
        for image_id in image_ids:
            list_file.write(abs_path + '/images/%s.jpg\n' % image_id)
            convert_annotation(image_id)

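Running the script produces a labels folder and three txt files. labels holds the YOLO-format annotation file for each image; train.txt, test.txt, and val.txt list the absolute paths of the images in each split.

Contents of the three txt files:

2.3 Configuration files

1) Dataset configuration

In the data folder of the yolov5 directory, create a file named Emoji.yaml (the name is up to you). It points to the train.txt and val.txt split files for the training and validation sets (both generated by voc_label.py).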
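The original figure with the file contents is not available, so the following is only a sketch of what such a dataset file could look like, generated from Python. It assumes the usual YOLOv5 dataset keys (train, val, nc, names); the two paths are hypothetical placeholders and the helper name build_data_yaml is made up for illustration.

```python
classes = [
    "four_fingers", "hand_with_fingers_splayed", "index_pointing_up",
    "little_finger", "ok_hand", "raised_fist", "raised_hand",
    "sign_of_the_horns", "three", "thumbup", "victory_hand",
]


def build_data_yaml(train_list, val_list, names):
    """Return dataset yaml text with the split files, class count and class names."""
    lines = [
        "train: %s" % train_list,
        "val: %s" % val_list,
        "nc: %d" % len(names),
        "names: [%s]" % ", ".join("'%s'" % name for name in names),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical locations; use wherever voc_label.py wrote train.txt and val.txt.
yaml_text = build_data_yaml("VOC2012/train.txt", "VOC2012/val.txt", classes)
print(yaml_text)
```

Deriving nc from the length of the class list keeps the two values from drifting apart when classes are added or removed.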

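2) Model configuration

When training a YOLO model you can cluster your own annotated boxes to obtain the prior (anchor) boxes, which makes fuller use of the annotated samples. Here we simply use the default values.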
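For readers who do want to cluster their own boxes, here is a minimal sketch of the idea: k-means over box (width, height) pairs, assigning each box to the anchor it overlaps most. This is a generic illustration, not the YOLOv5 implementation, and the sample boxes are invented. Results depend on the random initialization, so a real run would typically try several seeds.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes given as (w, h) and aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=50, seed=0):
    """Cluster (w, h) pairs into k anchors, using IoU as the similarity."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct boxes
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            groups[best].append(box)
        new_anchors = []
        for i, group in enumerate(groups):
            if not group:  # keep the old anchor if its cluster is empty
                new_anchors.append(anchors[i])
                continue
            mean_w = sum(b[0] for b in group) / len(group)
            mean_h = sum(b[1] for b in group) / len(group)
            new_anchors.append((mean_w, mean_h))
        if new_anchors == anchors:  # converged
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])  # smallest area first


# Invented box sizes in pixels, standing in for real annotations.
boxes = [(10, 14), (12, 12), (11, 15), (60, 62), (58, 66),
         (64, 60), (200, 180), (210, 190), (190, 200)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```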

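Choose a model. YOLOv5 provides s, m, l, and x versions, each larger than the last, so training and inference time grow accordingly. We use the s version here. Open yolov5s.yaml in the models folder under yolov5 and edit it as shown below (we keep the default anchors, so only nc, the number of classes, needs to change; with 11 classes, set it to 11):

The custom dataset and configuration files are now ready. Next comes training.

3. Model training

3.1 Download the pretrained models

The weights folder under the yolov5 source directory provides a script, download_weights.sh, that downloads the pretrained s, m, l, and x models.

3.2 Train the model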

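The training parameters mean the following:
epochs: how many times the whole dataset is iterated over during training. Lower it if your GPU is weak.
batch-size: how many images are processed before each weight update (the mini-batch of gradient descent). Lower it if your GPU is weak.
cfg: the configuration file that stores the model structure.
data: the file that describes the training and test data.
img-size: input image width and height. Lower it if your GPU is weak.
rect: rectangular training.
resume: resume training from the most recently saved model.
nosave: save only the final checkpoint.
notest: test only on the final epoch.
evolve: evolve the hyperparameters.
bucket: gsutil bucket.
cache-images: cache images to speed up training.
weights: path to the weights file.
name: renames results.txt to results_name.txt.
device: CUDA device, e.g. 0 or 0,1,2,3 or cpu.
adam: use the Adam optimizer.
multi-scale: multi-scale training, img-size +/- 50%.
single-cls: train the dataset as a single class.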

To train, just run the training command:

$ python train.py --data Emoji.yaml --cfg yolov5s.yaml --weights weights/yolov5s.pt --batch-size 64 --device "0,1,2,3" --epochs 200 --img-size 640

Set device, batch-size, and similar options to suit your own machine.
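One way to keep those machine-specific options consistent is to derive them from the list of GPUs. The sketch below assembles the same train.py command shown above. The helper name build_train_command and the figure of 16 images per GPU are assumptions for illustration (16 x 4 GPUs gives the batch size of 64 used above); only flags that appear in the original command are emitted.

```python
import shlex


def build_train_command(data, cfg, weights, gpus,
                        batch_per_gpu=16, epochs=200, img_size=640):
    """Return the train.py argument list for the given GPU ids."""
    device = ",".join(str(gpu) for gpu in gpus) if gpus else "cpu"
    batch_size = batch_per_gpu * max(len(gpus), 1)  # scale batch with GPU count
    return [
        "python", "train.py",
        "--data", data,
        "--cfg", cfg,
        "--weights", weights,
        "--batch-size", str(batch_size),
        "--device", device,
        "--epochs", str(epochs),
        "--img-size", str(img_size),
    ]


cmd = build_train_command("Emoji.yaml", "yolov5s.yaml",
                          "weights/yolov5s.pt", gpus=[0, 1, 2, 3])
print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with paths that contain spaces.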

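4. Model testing

A model is evaluated by measuring its performance on an annotated test or validation set. The most commonly used metric in object detection is mAP. test.py in the yolov5 folder specifies the dataset configuration file and the trained model, as follows:

Run the test with: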

python test.py --data data/Emoji.yaml --weights runs/train/exp2/weights/best.pt --augment

Test metrics:

Test result images:
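To make the mAP metric mentioned above more concrete, here is a simplified sketch of its two building blocks: the IoU between a predicted and a ground-truth box, and the average precision (AP) of one class. It is an illustration under simplifying assumptions, not the computation in test.py: detections are assumed to be already sorted by descending confidence and already marked as hit or miss (for example, IoU of at least 0.5 with an unmatched ground-truth box). mAP is then the mean of the per-class AP values.

```python
def box_iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(matches, num_gt):
    """Area under the precision-recall curve.

    matches: True/False per detection, sorted by descending confidence.
    num_gt: number of ground-truth boxes of this class.
    """
    true_positives = 0
    precisions, recalls = [], []
    for rank, hit in enumerate(matches, start=1):
        if hit:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)
    # Replace each precision with the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))        # overlap 1, union 7
print(average_precision([True, False, True], 2))  # two objects, three detections
```

In the second call the detector finds both objects but places one false alarm between them, so the AP drops from 1.0 to 5/6.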

II. Converting the YOLOv5 model

1. Install the dependencies

pip install onnx coremltools onnx-simplifier

2. Export the ONNX model

python models/export.py --weights runs/train/exp2/weights/best.pt --img 640 --batch 1

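This produces three files next to best.pt: best.mlmodel, best.onnx, and best.torchscript.pt. We only need best.onnx, which can be opened directly in netron to inspect the model structure.

3. Simplify the model with onnx-simplifier

Why simplify?

After training a deep learning model in PyTorch or TensorFlow, you sometimes need to convert it to ONNX. The exported graph often contains nodes that are not needed, such as Cast and Identity nodes. Simplifying removes them, which makes it easier to convert the model to on-device formats such as ncnn or mnn, or to deploy it with TensorRT.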

python -m onnxsim best.onnx yolov5-best-sim.onnx

This produces the simplified model yolov5-best-sim.onnx.

III. Converting YOLOv5 to an ncnn model

1. Convert onnx to .param and .bin

With yolov5-best-sim.onnx generated above, use ncnn's onnx2ncnn.exe tool to produce two files, yolov5s.param and yolov5s.bin. (You build this tool yourself. I built it on Windows; a Linux executable works as well.)

On Windows, press Win+R, open cmd, and enter:

onnx2ncnn.exe yolov5-best-sim.onnx yolov5s.param yolov5s.bin

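The conversion reports layers that ncnn does not support, as shown in the figure above. The next step is to edit the param file and replace the unsupported layers with supported ones.

2. Edit the .param file to remove unsupported layers

Open the generated yolov5s.param file. The lines to delete are near the top, marked in red. (Note that we trained with YOLOv5 v3.1; other versions may differ.)

The result is shown in the green and red boxes. Because 10 layers were removed, the counts in the header become 191 228. A YoloV5Focus layer replaces the 10 removed layers. In the YoloV5Focus line, images is the layer's input and 207 is its output name, chosen to match the input of the convolution layer that follows.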
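Hand-editing the header counts is easy to get wrong, so it can help to recount them from the file. The sketch below assumes the usual ncnn text layout: a magic number on line 1, "layer_count blob_count" on line 2, then one layer per line as "type name input_count output_count inputs... outputs... params...". It also assumes each blob appears exactly once as some layer's output. The sample text and its layer names are invented, not taken from the article's model.

```python
def count_layers_and_blobs(param_text):
    """Count layer lines and output blobs in ncnn .param text."""
    lines = [line for line in param_text.splitlines() if line.strip()]
    layer_lines = lines[2:]  # skip the magic number and the header counts
    blob_count = 0
    for line in layer_lines:
        fields = line.split()
        blob_count += int(fields[3])  # fields: type, name, n_inputs, n_outputs, ...
    return len(layer_lines), blob_count


def fix_header(param_text):
    """Return the param text with line 2 rewritten to the recounted values."""
    lines = param_text.splitlines()
    layers, blobs = count_layers_and_blobs(param_text)
    lines[1] = "%d %d" % (layers, blobs)
    return "\n".join(lines) + "\n"


# Invented four-layer example whose header (5 5) is out of date after editing.
sample = """7767517
5 5
Input images 0 1 images
YoloV5Focus focus 1 1 images 207
Convolution Conv_41 1 1 207 208 0=32 1=3
Convolution Conv_43 1 1 208 210 0=64 1=3
"""

print(count_layers_and_blobs(sample))
print(fix_header(sample).splitlines()[1])
```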

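Modify the network's output shape:

When you test the edited network with ncnn/examples/yolov5, the image comes back covered in spurious boxes. To fix this, edit the output part of the network: keep the output names unchanged and set 0=-1 in the Reshape layers so that the final output shape is not fixed. The figure below shows where to edit, with before and after.

3. C++ test code for ncnn

The complete C++ implementation follows. I suggest scrolling to the end and reading the overall approach first.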

// The names of the original angle-bracket headers were lost in extraction;
// the standard headers below are the ones this code needs.
#include <algorithm>
#include <cfloat>
#include <cmath>
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

// ncnn
#include "ncnn/layer.h"
#include "ncnn/net.h"
#include "ncnn/benchmark.h"
//#include "gpu.h"

#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/opencv.hpp"

using namespace std;
using namespace cv;

static ncnn::UnlockedPoolAllocator g_blob_pool_allocator;
static ncnn::PoolAllocator g_workspace_pool_allocator;

static ncnn::Net yolov5;

// Custom layer that replaces the slicing ops removed from the .param file
class YoloV5Focus : public ncnn::Layer
{
public:
    YoloV5Focus()
    {
        one_blob_only = true;
    }

    virtual int forward(const ncnn::Mat& bottom_blob, ncnn::Mat& top_blob, const ncnn::Option& opt) const
    {
        int w = bottom_blob.w;
        int h = bottom_blob.h;
        int channels = bottom_blob.c;

        int outw = w / 2;
        int outh = h / 2;
        int outc = channels * 4;

        top_blob.create(outw, outh, outc, 4u, 1, opt.blob_allocator);
        if (top_blob.empty())
            return -100;

        #pragma omp parallel for num_threads(opt.num_threads)
        for (int p = 0; p < outc; p++)
        {
            const float* ptr = bottom_blob.channel(p % channels).row((p / channels) % 2) + ((p / channels) / 2);
            float* outptr = top_blob.channel(p);

            for (int i = 0; i < outh; i++)
            {
                for (int j = 0; j < outw; j++)
                {
                    *outptr = *ptr;
                    outptr += 1;
                    ptr += 2;
                }
                ptr += w;
            }
        }
        return 0;
    }
};
DEFINE_LAYER_CREATOR(YoloV5Focus)

struct Object
{
    float x;
    float y;
    float w;
    float h;
    int label;
    float prob;
};

static inline float intersection_area(const Object& a, const Object& b)
{
    if (a.x > b.x + b.w || a.x + a.w < b.x || a.y > b.y + b.h || a.y + a.h < b.y)
    {
        // no intersection
        return 0.f;
    }
    float inter_width = std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x);
    float inter_height = std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y);
    return inter_width * inter_height;
}

static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
{
    int i = left;
    int j = right;
    float p = faceobjects[(left + right) / 2].prob;

    while (i <= j)
    {
        while (faceobjects[i].prob > p)
            i++;
        while (faceobjects[j].prob < p)
            j--;
        if (i <= j)
        {
            // swap
            std::swap(faceobjects[i], faceobjects[j]);
            i++;
            j--;
        }
    }

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            if (left < j) qsort_descent_inplace(faceobjects, left, j);
        }
        #pragma omp section
        {
            if (i < right) qsort_descent_inplace(faceobjects, i, right);
        }
    }
}

static void qsort_descent_inplace(std::vector<Object>& faceobjects)
{
    if (faceobjects.empty())
        return;
    qsort_descent_inplace(faceobjects, 0, faceobjects.size() - 1);
}

static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
{
    picked.clear();
    const int n = faceobjects.size();

    std::vector<float> areas(n);
    for (int i = 0; i < n; i++)
    {
        areas[i] = faceobjects[i].w * faceobjects[i].h;
    }

    for (int i = 0; i < n; i++)
    {
        const Object& a = faceobjects[i];
        int keep = 1;
        for (int j = 0; j < (int)picked.size(); j++)
        {
            const Object& b = faceobjects[picked[j]];
            // intersection over union
            float inter_area = intersection_area(a, b);
            float union_area = areas[i] + areas[picked[j]] - inter_area;
            // float IoU = inter_area / union_area
            if (inter_area / union_area > nms_threshold)
                keep = 0;
        }
        if (keep)
            picked.push_back(i);
    }
}

static inline float sigmoid(float x)
{
    return static_cast<float>(1.f / (1.f + exp(-x)));
}

static void generate_proposals(const ncnn::Mat& anchors, int stride, const ncnn::Mat& in_pad, const ncnn::Mat& feat_blob, float prob_threshold, std::vector<Object>& objects)
{
    const int num_grid = feat_blob.h;
    int num_grid_x;
    int num_grid_y;
    if (in_pad.w > in_pad.h)
    {
        num_grid_x = in_pad.w / stride;
        num_grid_y = num_grid / num_grid_x;
    }
    else
    {
        num_grid_y = in_pad.h / stride;
        num_grid_x = num_grid / num_grid_y;
    }

    const int num_class = feat_blob.w - 5;
    const int num_anchors = anchors.w / 2;

    for (int q = 0; q < num_anchors; q++)
    {
        const float anchor_w = anchors[q * 2];
        const float anchor_h = anchors[q * 2 + 1];
        const ncnn::Mat feat = feat_blob.channel(q);

        for (int i = 0; i < num_grid_y; i++)
        {
            for (int j = 0; j < num_grid_x; j++)
            {
                const float* featptr = feat.row(i * num_grid_x + j);

                // find class index with max class score
                int class_index = 0;
                float class_score = -FLT_MAX;
                for (int k = 0; k < num_class; k++)
                {
                    float score = featptr[5 + k];
                    if (score > class_score)
                    {
                        class_index = k;
                        class_score = score;
                    }
                }

                float box_score = featptr[4];
                float confidence = sigmoid(box_score) * sigmoid(class_score);
                if (confidence >= prob_threshold)
                {
                    float dx = sigmoid(featptr[0]);
                    float dy = sigmoid(featptr[1]);
                    float dw = sigmoid(featptr[2]);
                    float dh = sigmoid(featptr[3]);

                    float pb_cx = (dx * 2.f - 0.5f + j) * stride;
                    float pb_cy = (dy * 2.f - 0.5f + i) * stride;
                    float pb_w = pow(dw * 2.f, 2) * anchor_w;
                    float pb_h = pow(dh * 2.f, 2) * anchor_h;

                    float x0 = pb_cx - pb_w * 0.5f;
                    float y0 = pb_cy - pb_h * 0.5f;
                    float x1 = pb_cx + pb_w * 0.5f;
                    float y1 = pb_cy + pb_h * 0.5f;

                    Object obj;
                    obj.x = x0;
                    obj.y = y0;
                    obj.w = x1 - x0;
                    obj.h = y1 - y0;
                    obj.label = class_index;
                    obj.prob = confidence;
                    objects.push_back(obj);
                }
            }
        }
    }
}

extern "C" {

    void release()
    {
        fprintf(stderr, "YoloV5Ncnn finished!");
        //ncnn::destroy_gpu_instance();
    }

    int init()
    {
        fprintf(stderr, "YoloV5Ncnn init!\n");
        ncnn::Option opt;
        opt.lightmode = true;
        opt.num_threads = 4;
        opt.blob_allocator = &g_blob_pool_allocator;
        opt.workspace_allocator = &g_workspace_pool_allocator;
        opt.use_packing_layout = true;
        yolov5.opt = opt;

        yolov5.register_custom_layer("YoloV5Focus", YoloV5Focus_layer_creator);

        // init param
        {
            int ret = yolov5.load_param("yolov5s.param");
            if (ret != 0)
            {
                std::cout << "ret= " << ret << std::endl;
                fprintf(stderr, "YoloV5Ncnn, load_param failed");
                return -301;
            }
        }

        // init bin
        {
            int ret = yolov5.load_model("yolov5s.bin");
            if (ret != 0)
            {
                fprintf(stderr, "YoloV5Ncnn, load_model failed");
                return -301;
            }
        }
        return 0;
    }

    int detect(cv::Mat img, std::vector<Object>& objects)
    {
        const int target_size = 320;

        // scale the longer side to target_size, keeping the aspect ratio
        const int width = img.cols;
        const int height = img.rows;
        int w = img.cols;
        int h = img.rows;
        float scale = 1.f;
        if (w > h)
        {
            scale = (float)target_size / w;
            w = target_size;
            h = (int)(h * scale);
        }
        else
        {
            scale = (float)target_size / h;
            h = target_size;
            w = (int)(w * scale);
        }

        cv::Mat resized;
        cv::resize(img, resized, cv::Size(w, h));
        ncnn::Mat in = ncnn::Mat::from_pixels(resized.data, ncnn::Mat::PIXEL_BGR2RGB, w, h);

        // letterbox: pad each side up to a multiple of 32
        // yolov5/utils/datasets.py letterbox
        int wpad = (w + 31) / 32 * 32 - w;
        int hpad = (h + 31) / 32 * 32 - h;
        ncnn::Mat in_pad;
        ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2, wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 114.f);

        // yolov5
        {
            const float prob_threshold = 0.4f;
            const float nms_threshold = 0.51f;

            const float norm_vals[3] = { 1 / 255.f, 1 / 255.f, 1 / 255.f };
            in_pad.substract_mean_normalize(0, norm_vals);

            ncnn::Extractor ex = yolov5.create_extractor();
            //ex.set_vulkan_compute(use_gpu);
            ex.input("images", in_pad);

            std::vector<Object> proposals;

            // anchor setting from yolov5/models/yolov5s.yaml

            // stride 8
            {
                ncnn::Mat out;
                ex.extract("output", out);

                ncnn::Mat anchors(6);
                anchors[0] = 10.f;
                anchors[1] = 13.f;
                anchors[2] = 16.f;
                anchors[3] = 30.f;
                anchors[4] = 33.f;
                anchors[5] = 23.f;

                std::vector<Object> objects8;
                generate_proposals(anchors, 8, in_pad, out, prob_threshold, objects8);
                proposals.insert(proposals.end(), objects8.begin(), objects8.end());
            }

            // stride 16
            {
                ncnn::Mat out;
                ex.extract("771", out);

                ncnn::Mat anchors(6);
                anchors[0] = 30.f;
                anchors[1] = 61.f;
                anchors[2] = 62.f;
                anchors[3] = 45.f;
                anchors[4] = 59.f;
                anchors[5] = 119.f;

                std::vector<Object> objects16;
                generate_proposals(anchors, 16, in_pad, out, prob_threshold, objects16);
                proposals.insert(proposals.end(), objects16.begin(), objects16.end());
            }

            // stride 32
            {
                ncnn::Mat out;
                ex.extract("791", out);

                ncnn::Mat anchors(6);
                anchors[0] = 116.f;
                anchors[1] = 90.f;
                anchors[2] = 156.f;
                anchors[3] = 198.f;
                anchors[4] = 373.f;
                anchors[5] = 326.f;

                std::vector<Object> objects32;
                generate_proposals(anchors, 32, in_pad, out, prob_threshold, objects32);
                proposals.insert(proposals.end(), objects32.begin(), objects32.end());
            }

            // sort all proposals by score from highest to lowest
            qsort_descent_inplace(proposals);

            // apply nms with nms_threshold
            std::vector<int> picked;
            nms_sorted_bboxes(proposals, picked, nms_threshold);

            int count = picked.size();
            objects.resize(count);
            for (int i = 0; i < count; i++)
            {
                objects[i] = proposals[picked[i]];

                // adjust offset to original unpadded
                float x0 = (objects[i].x - (wpad / 2)) / scale;
                float y0 = (objects[i].y - (hpad / 2)) / scale;
                float x1 = (objects[i].x + objects[i].w - (wpad / 2)) / scale;
                float y1 = (objects[i].y + objects[i].h - (hpad / 2)) / scale;

                // clip
                x0 = std::max(std::min(x0, (float)(width - 1)), 0.f);
                y0 = std::max(std::min(y0, (float)(height - 1)), 0.f);
                x1 = std::max(std::min(x1, (float)(width - 1)), 0.f);
                y1 = std::max(std::min(y1, (float)(height - 1)), 0.f);

                // note: w and h hold the bottom-right corner (x1, y1) from here on,
                // which is what draw_face_box expects
                objects[i].x = x0;
                objects[i].y = y0;
                objects[i].w = x1;
                objects[i].h = y1;
            }
        }
        return 0;
    }
}

static const char* class_names[] = {
    "four_fingers", "hand_with_fingers_splayed", "index_pointing_up", "little_finger", "ok_hand",
    "raised_fist", "raised_hand", "sign_of_the_horns", "three", "thumbup", "victory_hand"
};

void draw_face_box(cv::Mat& bgr, std::vector<Object> object) // main emoji display function
{
    for (int i = 0; i < (int)object.size(); i++)
    {
        const auto obj = object[i];
        cv::rectangle(bgr, cv::Point(obj.x, obj.y), cv::Point(obj.w, obj.h), cv::Scalar(0, 255, 0), 3, 8, 0);
        std::cout << "label:" << class_names[obj.label] << std::endl;

        string emoji_path = "emoji\\" + string(class_names[obj.label]) + ".png"; // path of the emoji image
        cv::Mat logo = cv::imread(emoji_path);
        if (logo.empty())
        {
            std::cout << "imread logo failed!!!" << std::endl;
            return;
        }
        resize(logo, logo, cv::Size(80, 80));
        if (bgr.cols < logo.cols || bgr.rows < logo.rows)
            continue; // frame is smaller than the emoji

        // Put the emoji at the top-left corner of the gesture box, clamped so the
        // ROI stays inside the frame (cv::Rect takes x, y, width, height)
        int x = std::min(std::max((int)obj.x, 0), bgr.cols - logo.cols);
        int y = std::min(std::max((int)obj.y, 0), bgr.rows - logo.rows);
        cv::Mat imageROI = bgr(cv::Rect(x, y, logo.cols, logo.rows));
        logo.copyTo(imageROI); // copy the emoji into the frame
    }
}

int main()
{
    Mat frame;
    VideoCapture capture(0);
    if (!capture.isOpened() || init() != 0)
        return -1;

    while (true)
    {
        capture >> frame;
        if (!frame.empty())
        {
            std::vector<Object> objects;
            detect(frame, objects);
            draw_face_box(frame, objects);
            imshow("window", frame);
        }
        if (waitKey(20) == 'q')
            break;
    }

    capture.release();
    release();
    return 0;
}

The code first detects the gesture with yolov5s, then uses image ROI blending to scale the matching emoji to 80x80 and draw it next to the gesture box, so that each gesture shows its own emoji.
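The ROI step is where out-of-range errors tend to show up, because a gesture box near the right or bottom edge leaves less than 80 pixels for the emoji. The sketch below illustrates the clamping idea in plain Python, with nested lists standing in for images. The function names are made up for illustration and this is not the article's C++ code.

```python
def clamp_overlay_origin(box_x, box_y, overlay_w, overlay_h, frame_w, frame_h):
    """Top-left (x, y) where the overlay fits, or None if the frame is too small."""
    if overlay_w > frame_w or overlay_h > frame_h:
        return None
    x = min(max(int(box_x), 0), frame_w - overlay_w)
    y = min(max(int(box_y), 0), frame_h - overlay_h)
    return x, y


def paste(frame, overlay, box_x, box_y):
    """Copy overlay into frame in place; both are lists of pixel rows."""
    origin = clamp_overlay_origin(box_x, box_y,
                                  len(overlay[0]), len(overlay),
                                  len(frame[0]), len(frame))
    if origin is None:
        return False
    x, y = origin
    for row_offset, row in enumerate(overlay):
        frame[y + row_offset][x:x + len(row)] = row  # rows index y, columns index x
    return True


# A 6x4 "frame" of zeros and a 2x2 "emoji" of ones, requested at (5, 3).
frame = [[0] * 6 for _ in range(4)]
emoji = [[1, 1], [1, 1]]
paste(frame, emoji, box_x=5, box_y=3)
for row in frame:
    print(row)
```

The requested position (5, 3) would run off the frame, so the overlay is shifted to (4, 2) and lands in the bottom-right corner.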

4. Mapping emoji to gestures

And with that, we are done! We have reproduced the result shown at the start.

Victory ✌️

