0. Preface
This post mainly covers the output feature map size under padding=same and padding=valid in actual convolution operations, and the feature map size after pooling.
1. Main text
1. Convolution
Special note: convolution (when it does not divide evenly) rounds DOWN!!!!
Special note: convolution (when it does not divide evenly) rounds DOWN!!!!
Special note: convolution (when it does not divide evenly) rounds DOWN!!!!
For the basics of convolution and how to compute the feature map size after a convolution, see reference [1].
Parameter definitions:
Input size: inputH
Kernel size: K
Stride: S
Padding: P
Output size: outputH
With no padding (padding=valid), P = 0, and the output feature map size is:
outputH = \frac{inputH - K + 2 \times 0}{S} + 1
That is:
outputH = \frac{inputH - K}{S} + 1
Convolution (when it does not divide evenly) rounds down.
When padding is used (padding=same), the amount of padding depends on the kernel size:
K=1, P=0;
K=3, P=1;
K=5, P=2;
and so on (for an odd kernel size, P = (K-1)/2).
outputH = \frac{inputH - K + 2P}{S} + 1
Convolution (when it does not divide evenly) rounds down.
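To make the rounding concrete, here is a minimal Python sketch of the convolution formula above (my own addition, not from the original post; the function name conv_output_size is just illustrative):

import math

def conv_output_size(input_h, k, s, p):
    # outputH = floor((inputH - K + 2P) / S) + 1 ; convolution rounds down
    return math.floor((input_h - k + 2 * p) / s) + 1

print(conv_output_size(8, 3, 2, 0))  # padding=valid (P=0): floor(5/2) + 1 = 3, rounds down
print(conv_output_size(7, 3, 1, 1))  # padding=same for K=3 (P=1), stride 1: stays 7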
2. Pooling
Special note: pooling (when it does not divide evenly) rounds UP!!!!
Special note: pooling (when it does not divide evenly) rounds UP!!!!
Special note: pooling (when it does not divide evenly) rounds UP!!!!
Pooling uses no padding; the formula is:
outputH = \frac{inputH - K}{S} + 1
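And a matching sketch for the pooling formula, using the round-up rule stated above (again my own illustrative code, with an assumed function name pool_output_size):

import math

def pool_output_size(input_h, k, s):
    # outputH = ceil((inputH - K) / S) + 1 ; pooling rounds up, per the rule above
    return math.ceil((input_h - k) / s) + 1

print(pool_output_size(416, 2, 2))  # (416-2)/2 + 1 = 208, divides evenly
print(pool_output_size(7, 2, 2))    # ceil(5/2) + 1 = 4, rounds up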
3. Partial VGG code, based on TensorFlow 1.x
3.1 Exercise 1
from keras.layers import Input, Conv2D, MaxPooling2D  # TF 1.x-era Keras; tensorflow.keras.layers also works

img_input = Input(shape=(416, 416, 3))  # input tensor, shape taken from the comment below

# 416,416,3 -> 208,208,64
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
y = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
z = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(y)  # pool the second conv's output
f1 = z
Note: Conv2D in Keras uses a default stride of 1; see reference [1] for details on the parameters.
Substituting into the formula:
First convolution:
X_{outputH} = \frac{416 - 3 + 2}{1} + 1 = 416
Second convolution:
Y_{outputH} = \frac{416 - 3 + 2}{1} + 1 = 416
Pooling:
Z_{outputH} = \frac{416 - 2}{2} + 1 = 208
So far, the input has gone from 416 × 416 × 3 ⇒ 208 × 208 × 64 (64 is the number of convolution kernels, i.e. the first argument of Conv2D; see reference [2] for details).
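A quick arithmetic check of the three results above (my own snippet; integer // matches the round-down rule for the convolutions, and the pooling here divides evenly):

print((416 - 3 + 2 * 1) // 1 + 1)  # block1_conv1: 3x3, stride 1, P=1 -> 416
print((416 - 3 + 2 * 1) // 1 + 1)  # block1_conv2: same settings -> 416
print((416 - 2) // 2 + 1)          # block1_pool: 2x2, stride 2 -> 208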
# 208,208,64 -> 104,104,128
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(f1)  # takes block 1's pooled output
y = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
z = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(y)
f2 = z
Note: Conv2D in Keras uses a default stride of 1; see reference [1] for details on the parameters.
Substituting into the formula:
First convolution:
X_{outputH} = \frac{208 - 3 + 2}{1} + 1 = 208
Second convolution:
Y_{outputH} = \frac{208 - 3 + 2}{1} + 1 = 208
Pooling:
Z_{outputH} = \frac{208 - 2}{2} + 1 = 104
So far, the input has gone from 208 × 208 × 64 ⇒ 104 × 104 × 128 (128 is the number of convolution kernels, i.e. the first argument of Conv2D; see reference [2] for details).
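As an end-to-end sanity check of both blocks, here is a small self-contained sketch (my own addition, not part of the original VGG code; it assumes a recent TensorFlow with tf.keras, and drops the layer names for brevity):

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D
from tensorflow.keras.models import Model

img_input = Input(shape=(416, 416, 3))
# block 1: 416,416,3 -> 208,208,64
x = Conv2D(64, (3, 3), activation='relu', padding='same')(img_input)
y = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
f1 = MaxPooling2D((2, 2), strides=(2, 2))(y)
# block 2: 208,208,64 -> 104,104,128
x = Conv2D(128, (3, 3), activation='relu', padding='same')(f1)
y = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
f2 = MaxPooling2D((2, 2), strides=(2, 2))(y)

print(Model(img_input, f1).output_shape)  # (None, 208, 208, 64)
print(Model(img_input, f2).output_shape)  # (None, 104, 104, 128)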
From the examples above, you can see that this VGG-style structure (3 × 3 kernels with padding=same; 2 × 2 pooling with strides=(2, 2)) halves the height and width of the feature map, while the number of output channels is determined by the number of convolution kernels!!!!
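To illustrate that halving, a tiny loop (my own addition, assuming a 416-pixel input side and five such conv+pool blocks, as in the VGG backbone):

size = 416
for block in range(5):
    size = (size - 2) // 2 + 1  # 3x3 same-padding convs keep the size; the 2x2 stride-2 pool halves it
    print(size)                 # 208, 104, 52, 26, 13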
To wrap up:
Convolution (when it does not divide evenly) rounds down; pooling (when it does not divide evenly) rounds up.
Convolution (when it does not divide evenly) rounds down; pooling (when it does not divide evenly) rounds up.
Convolution (when it does not divide evenly) rounds down; pooling (when it does not divide evenly) rounds up.
Convolution (when it does not divide evenly) rounds down; pooling (when it does not divide evenly) rounds up.
[1] https://blog.csdn.net/econe_wei/article/details/94649003
[2] https://blog.csdn.net/weixin_39190382/article/details/105692853
[3] https://blog.csdn.net/weixin_38705903/article/details/89073938
[4] https://blog.csdn.net/bohuihuan8714/article/details/89894124
[5] https://blog.csdn.net/AugustMe/article/details/92096724
[6] https://keras-cn.readthedocs.io/en/latest/layers/pooling_layer/#maxpooling2d
[7] https://keras-cn.readthedocs.io/en/latest/layers/convolutional_layer/