YOLOv3 Source Code Reading: layer_utils.py

1. Introduction to YOLO

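  YOLO (You Only Look Once) is an efficient object detection algorithm that belongs to the one-stage family. Two-stage detectors are generally slow, and YOLO addresses this by performing object classification and object localization in a single step: it regresses the bounding box positions and the class of each bounding box directly at the output layer.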

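  After two rounds of iteration, the latest version of YOLO at the time of writing is YOLOv3. It builds on the previous two versions with a number of fairly fine-grained changes that improve the results.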

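  This article annotates the source code, both for my own study and to share with others who are learning. I am still a student, so please point out any mistakes you find.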

  The source code referenced in this article is at: https://github.com/wizyoung/YOLOv3_TensorFlow

2. Code and Comments

  File location: YOUR_PATH\YOLOv3_TensorFlow-master_utils.py

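  The functions in this file wrap convolution and a few related operations so that the model code is easier to write. They cover:

  • A wrapper around convolution (conv2d)
  • The definition of the darknet backbone (darknet53_body)
  • Feature-map resizing, using nearest-neighbor interpolation by default (upsample_layer)
  • The extra YOLO convolutions applied on top of the backbone, in preparation for the later feature fusion (yolo_block)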
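
  The nearest-neighbor resize mentioned in the list can be illustrated without TensorFlow. The following is a minimal NumPy sketch, not the library's implementation: the helper name nearest_neighbor_resize is hypothetical, and the index mapping floor(dst * in_size / out_size) is an assumption chosen to match the usual behaviour when align_corners is False.

```python
import numpy as np


def nearest_neighbor_resize(inputs, new_height, new_width):
    """Resize an NHWC array by copying the nearest source pixel.

    Assumed mapping: src = floor(dst * in_size / out_size).
    """
    _, height, width, _ = inputs.shape
    # Source row/column index for every destination row/column.
    rows = (np.arange(new_height) * height) // new_height
    cols = (np.arange(new_width) * width) // new_width
    # Broadcast the two index vectors into a (new_height, new_width) grid.
    return inputs[:, rows[:, None], cols[None, :], :]


# A 2x2 single-channel feature map upsampled to 4x4.
x = np.arange(4, dtype=np.float32).reshape(1, 2, 2, 1)
y = nearest_neighbor_resize(x, 4, 4)
print(y.shape)        # (1, 4, 4, 1)
print(y[0, :, :, 0])
```

  When the target size is exactly twice the input size, each source pixel is simply repeated in a 2x2 block, so no new values are interpolated.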
# coding: utf-8

from __future__ import division, print_function

import numpy as np
import tensorflow as tf

# tf.contrib only exists in TensorFlow 1.x.
slim = tf.contrib.slim


def conv2d(inputs, filters, kernel_size, strides=1):
    # A thin wrapper around slim.conv2d that makes the network code
    # easier to write and read.
    def _fixed_padding(inputs, kernel_size):
        # Pad height and width explicitly; the amount depends only on
        # the kernel size, not on the input size.
        pad_total = kernel_size - 1
        pad_beg = pad_total // 2
        pad_end = pad_total - pad_beg

        padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end],
                                        [pad_beg, pad_end], [0, 0]], mode='CONSTANT')
        return padded_inputs

    # Strided convolutions are padded by hand and then run with 'VALID';
    # stride-1 convolutions simply use 'SAME'.
    if strides > 1:
        inputs = _fixed_padding(inputs, kernel_size)
    inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides,
                         padding=('SAME' if strides == 1 else 'VALID'))
    return inputs


def darknet53_body(inputs):
    """
    The main darknet backbone.
    :param inputs: input tensor
    :return: three feature maps at different scales
    """
    def res_block(inputs, filters):
        # 1x1 conv, then 3x3 conv, added back onto the block input.
        shortcut = inputs
        net = conv2d(inputs, filters * 1, 1)
        net = conv2d(net, filters * 2, 3)

        net = net + shortcut

        return net

    # first two conv2d layers
    net = conv2d(inputs, 32, 3, strides=1)
    net = conv2d(net, 64, 3, strides=2)

    # res_block * 1
    net = res_block(net, 32)

    net = conv2d(net, 128, 3, strides=2)

    # res_block * 2
    for i in range(2):
        net = res_block(net, 64)

    net = conv2d(net, 256, 3, strides=2)

    # res_block * 8
    for i in range(8):
        net = res_block(net, 128)

    route_1 = net
    net = conv2d(net, 512, 3, strides=2)

    # res_block * 8
    for i in range(8):
        net = res_block(net, 256)

    route_2 = net
    net = conv2d(net, 1024, 3, strides=2)

    # res_block * 4
    for i in range(4):
        net = res_block(net, 512)
    route_3 = net

    return route_1, route_2, route_3


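def yolo_block(inputs, filters):
    """
    Extra convolution layers added on top of the features extracted by the
    darknet backbone, in preparation for the later feature fusion.
    :param inputs: input feature map
    :param filters: base number of filters
    :return: (route, net), where route is the output of the fifth
             convolution and net is the output of the final 3x3 convolution
    """
    net = conv2d(inputs, filters * 1, 1)
    net = conv2d(net, filters * 2, 3)
    net = conv2d(net, filters * 1, 1)
    net = conv2d(net, filters * 2, 3)
    net = conv2d(net, filters * 1, 1)
    route = net
    net = conv2d(net, filters * 2, 3)
    return route, net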


def upsample_layer(inputs, out_shape):
    """
    Resizes a feature map, using nearest-neighbor interpolation.
    :param inputs: input feature map
    :param out_shape: target shape; indices 1 and 2 are read as height and width
    :return: the resized feature map
    """
    new_height, new_width = out_shape[1], out_shape[2]
    # NOTE: here height is the first
    # TODO: Do we need to set `align_corners` as True?
    inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled')
    return inputs
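
  Two details of the code above can be checked with plain arithmetic: the explicit padding in conv2d, and the sizes of the three feature maps returned by darknet53_body. The sketch below uses only NumPy and does not run the network. The helper names (fixed_padding_amounts, strided_output_size, darknet53_route_shapes) are hypothetical, and the 416x416 input size is an assumption made for illustration.

```python
import numpy as np


def fixed_padding_amounts(kernel_size):
    """Return (pad_beg, pad_end), mirroring _fixed_padding in conv2d."""
    pad_total = kernel_size - 1
    pad_beg = pad_total // 2
    return pad_beg, pad_total - pad_beg


def strided_output_size(size, kernel_size, stride):
    """Output size of explicit padding followed by a 'VALID' convolution."""
    pad_beg, pad_end = fixed_padding_amounts(kernel_size)
    padded = size + pad_beg + pad_end
    return (padded - kernel_size) // stride + 1


def darknet53_route_shapes(height, width):
    """Trace (height, width, channels) of route_1, route_2 and route_3."""
    # Output channels of the five stride-2 convolutions in darknet53_body.
    # Stride-1 convolutions and residual blocks keep the spatial size, so
    # only these five layers change height and width.
    stage_channels = [64, 128, 256, 512, 1024]
    shapes = []
    h, w = height, width
    for channels in stage_channels:
        h = strided_output_size(h, kernel_size=3, stride=2)
        w = strided_output_size(w, kernel_size=3, stride=2)
        shapes.append((h, w, channels))
    # The last three stages end in route_1, route_2 and route_3.
    return shapes[-3:]


# Pad a dummy NHWC batch the same way tf.pad is called in conv2d.
x = np.zeros((1, 416, 416, 3), dtype=np.float32)
pad_beg, pad_end = fixed_padding_amounts(3)
padded = np.pad(x, [(0, 0), (pad_beg, pad_end), (pad_beg, pad_end), (0, 0)],
                mode="constant")

print(padded.shape)                      # (1, 418, 418, 3)
print(strided_output_size(416, 3, 2))    # 208
print(darknet53_route_shapes(416, 416))  # [(52, 52, 256), (26, 26, 512), (13, 13, 1024)]
```

  For an odd kernel size the padded 'VALID' convolution yields ceil(size / stride) outputs, and the padding is symmetric regardless of the input size. Under the 416x416 assumption, the three routes sit at 1/8, 1/16 and 1/32 of the input resolution.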
