Does the OpenPose TF version use VGG16 or VGG19 as its backbone?
flyfish
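How can you tell whether the backbone is VGG16 or VGG19?
Look at the conv3-256 layers between the second and the third max-pooling layers.
If there are 3 conv3-256 layers, it is VGG-16.
If there are 4 conv3-256 layers, it is VGG-19.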
.max_pool(2, 2, 2, 2, name='pool2_stage1', padding='VALID')
.conv(3, 3, 256, 1, 1, name='conv3_1')
.conv(3, 3, 256, 1, 1, name='conv3_2')
.conv(3, 3, 256, 1, 1, name='conv3_3')
.conv(3, 3, 256, 1, 1, name='conv3_4')
.max_pool(2, 2, 2, 2, name='pool3_stage1', padding='VALID')
There are 4 conv3-256 layers between the two max_pool layers, so this is VGG19.
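The rule above can be sketched as a small check over the layer names. This is only an illustration: the helper name guess_vgg_variant is hypothetical, and it assumes the layers are named with the pool2/pool3/conv prefixes used in the excerpt above.

```python
def guess_vgg_variant(layer_names):
    """Guess the VGG variant from the conv layers between pool2 and pool3.

    Assumes names start with 'pool2', 'pool3' and 'conv' as in the excerpt.
    """
    start = next(i for i, n in enumerate(layer_names) if n.startswith("pool2"))
    end = next(i for i, n in enumerate(layer_names) if n.startswith("pool3"))
    # Count only the convolution layers strictly between the two pools.
    n_conv = sum(1 for n in layer_names[start + 1:end] if n.startswith("conv"))
    return {3: "VGG-16", 4: "VGG-19"}.get(n_conv, "unknown")


layers = ["pool2_stage1", "conv3_1", "conv3_2", "conv3_3", "conv3_4", "pool3_stage1"]
print(guess_vgg_variant(layers))  # VGG-19
```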
After the third max_pool, two conv3-512 layers are used:
.max_pool(2, 2, 2, 2, name='pool3_stage1', padding='VALID')
.conv(3, 3, 512, 1, 1, name='conv4_1')
.conv(3, 3, 512, 1, 1, name='conv4_2')
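There are two conv3-512 layers, so if the max_pool layers are not counted,
the network uses the first 10 layers of VGG19. This count of 10 covers convolutional layers only; the max-pooling layers are not included.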
.conv(3, 3, 64, 1, 1, name='conv1_1')
.conv(3, 3, 64, 1, 1, name='conv1_2')
.max_pool(2, 2, 2, 2, name='pool1_stage1', padding='VALID')
.conv(3, 3, 128, 1, 1, name='conv2_1')
.conv(3, 3, 128, 1, 1, name='conv2_2')
.max_pool(2, 2, 2, 2, name='pool2_stage1', padding='VALID')
.conv(3, 3, 256, 1, 1, name='conv3_1')
.conv(3, 3, 256, 1, 1, name='conv3_2')
.conv(3, 3, 256, 1, 1, name='conv3_3')
.conv(3, 3, 256, 1, 1, name='conv3_4')
.max_pool(2, 2, 2, 2, name='pool3_stage1', padding='VALID')
.conv(3, 3, 512, 1, 1, name='conv4_1')
.conv(3, 3, 512, 1, 1, name='conv4_2')
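As a quick sanity check on the count of 10, the listing above can be tallied in plain Python. The list-of-tuples representation is just an illustration and not part of the original network code.

```python
# Each entry mirrors one line of the listing: (layer kind, output channels).
backbone = [
    ("conv", 64), ("conv", 64), ("pool", None),
    ("conv", 128), ("conv", 128), ("pool", None),
    ("conv", 256), ("conv", 256), ("conv", 256), ("conv", 256), ("pool", None),
    ("conv", 512), ("conv", 512),
]

num_conv = sum(1 for kind, _ in backbone if kind == "conv")
num_pool = sum(1 for kind, _ in backbone if kind == "pool")
print(num_conv, num_pool)  # 10 3
```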
The paper says: "The image is analyzed by a CNN (initialized by the first 10 layers of VGG-19 and fine-tuned), generating a set of feature maps F that is input to the first stage."
So in theory the backbone is VGG19. In practice either VGG19 or VGG16 can be used, and you can compare the outputs.
def conv(self,
         input,
         k_h,
         k_w,
         c_o,
         s_h,
         s_w,
         name,
         relu=True,
         padding=DEFAULT_PADDING,
         group=1,
         trainable=True,
         biased=True):
Actual call:
tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding)
Original declaration:
tf.nn.conv2d(
    input,
    filters,
    strides,
    padding,
    data_format='NHWC',
    dilations=None,
    name=None
)
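DEFAULT_PADDING = 'SAME'
conv(3, 3, 64, 1, 1, name='conv1_1')
means 64 convolution kernels of size 3*3.
s_h and s_w are stride_height and stride_width respectively, so
strides=[1, 1, 1, 1]
64 is the output dimension (the number of output channels), and conv1_1 is the name of this layer.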
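To make the effect of these arguments concrete, here is a minimal sketch of the output shape of a 'SAME'-padded convolution in NHWC layout. The helper names and the 368x368x3 input are assumptions chosen for illustration. With 'SAME' padding, TensorFlow produces ceil(input / stride) along each spatial axis, so a stride of 1 keeps height and width unchanged.

```python
import math


def same_conv_output_size(in_size, stride):
    # 'SAME' padding: output size depends only on input size and stride.
    return math.ceil(in_size / stride)


def conv_output_shape(in_shape, c_o, s_h, s_w):
    """Output shape (N, H, W, C) of a 'SAME' convolution on an NHWC input."""
    n, h, w, _ = in_shape
    return (n, same_conv_output_size(h, s_h), same_conv_output_size(w, s_w), c_o)


# conv(3, 3, 64, 1, 1, name='conv1_1') on a hypothetical 368x368 RGB image
print(conv_output_shape((1, 368, 368, 3), c_o=64, s_h=1, s_w=1))  # (1, 368, 368, 64)
```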
max_pool(2, 2, 2, 2, name='pool1_stage1', padding='VALID')
def max_pool(self, input, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING):
    self.validate_padding(padding)
    return tf.nn.max_pool(input,
                          ksize=[1, k_h, k_w, 1],
                          strides=[1, s_h, s_w, 1],
                          padding=padding,
                          name=name)
For the call above this gives:
ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1],
tf.nn.max_pool(
    value,
    ksize,
    strides,
    padding,
    data_format='NHWC',
    name=None,
    input=None
)
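The combined effect of the three 2x2, stride-2 'VALID' pooling layers in the backbone can be sketched as below. With 'VALID' padding the output size is floor((input - kernel) / stride) + 1. The helper name and the 368-pixel input are assumptions for illustration only.

```python
def valid_pool_output_size(in_size, k, s):
    # 'VALID' padding: no padding is added, so partial windows are dropped.
    return (in_size - k) // s + 1


size = 368  # hypothetical input height/width
sizes = [size]
for _ in range(3):  # pool1_stage1, pool2_stage1, pool3_stage1
    size = valid_pool_output_size(size, k=2, s=2)
    sizes.append(size)

print(sizes)  # [368, 184, 92, 46]
```

Each pooling layer halves the spatial resolution, so after three of them the feature map is 1/8 of the input size along each axis.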
concat
.concat(3, name='concat_stage2')
def concat(self, inputs, axis, name):
    return tf.concat(axis=axis, values=inputs, name=name)
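In NHWC layout, axis 3 is the channel axis, so .concat(3, ...) stacks feature maps along their channels while batch, height and width must match. The sketch below computes the resulting shape in plain Python. The helper name concat_shape is hypothetical, and the 46x46 size and the channel counts 38, 19 and 128 are illustrative assumptions, not values taken from the text above.

```python
def concat_shape(shapes, axis):
    """Shape of the result of concatenating tensors with `shapes` along `axis`."""
    first = shapes[0]
    for shape in shapes[1:]:
        if len(shape) != len(first):
            raise ValueError("all inputs must have the same rank")
        for dim, (a, b) in enumerate(zip(first, shape)):
            # Every dimension except the concat axis has to agree.
            if dim != axis and a != b:
                raise ValueError(f"dimension {dim} differs: {a} vs {b}")
    out = list(first)
    out[axis] = sum(shape[axis] for shape in shapes)
    return tuple(out)


inputs = [(1, 46, 46, 38), (1, 46, 46, 19), (1, 46, 46, 128)]
print(concat_shape(inputs, axis=3))  # (1, 46, 46, 185)
```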