
PyTorch SSD Network Structure

Source: 图灵教育 (Turing Education)

Date: 2023-05-19 09:11:56

Code source: amdegroot/ssd.pytorch

[Figure: SSD network structure]

1. The VGG base network

The backbone of the network is VGG. The code to build it is below; its inputs are cfg (the channel counts of the convolutional layers and the pooling settings) and i (the number of channels of the input image).

Conv6 and Conv7 in the figure above correspond to two 19×19×1024 feature maps; Conv7 is one of the layers used for prediction.

I have to say, this code is very well written.

```python
import torch.nn as nn

cfg = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C', 512, 512, 512, 'M',
       512, 512, 512]

def vgg(cfg, i, batch_norm=False):
    layers = []
    in_channels = i
    for v in cfg:
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        elif v == 'C':
            # ceil_mode=True so the 75x75 map pools to 38x38, not 37x37
            layers += [nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
            else:
                layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v
    pool5 = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
    conv6 = nn.Conv2d(512, 1024, kernel_size=3, padding=6, dilation=6)
    conv7 = nn.Conv2d(1024, 1024, kernel_size=1)
    layers += [pool5, conv6,
               nn.ReLU(inplace=True), conv7, nn.ReLU(inplace=True)]
    return layers
```
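To see why ceil_mode matters, the spatial sizes can be traced by hand: 3×3/padding-1 convs keep the size, each pool halves it, and only the 'C' pool rounds up. A minimal sketch in plain Python (no torch; the helper trace_vgg_sizes is hypothetical, not part of the repository):

```python
import math

# Hypothetical helper: trace the spatial size of a 300x300 input through
# the vgg() cfg above (3x3/pad-1 convs keep size; 'M'/'C' pools halve it).
def trace_vgg_sizes(cfg, size=300):
    sizes = [size]
    for v in cfg:
        if v == 'M':
            size = size // 2            # floor division (MaxPool2d default)
        elif v == 'C':
            size = math.ceil(size / 2)  # ceil_mode=True: 75 -> 38, not 37
        # plain conv layers (kernel 3, padding 1) keep the size unchanged
        sizes.append(size)
    return sizes

sizes = trace_vgg_sizes([64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C',
                         512, 512, 512, 'M', 512, 512, 512])
print(sizes[-1])  # 19: the 19x19 map fed to conv6/conv7
# pool5 (stride 1, pad 1) and the dilated conv6 also keep 19x19:
# 19 + 2*6 - (6*(3-1) + 1) + 1 = 19
```

The dilated conv6 trades the two fully connected layers of classification VGG for convolutions without shrinking the map, which is why the prediction still sees a 19×19 grid.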

 

2. Extra convolutional layers

SSD appends extra convolutional layers after the base VGG network. The feature maps obtained from these layers shrink stage by stage: 19×19, 10×10, 5×5, 3×3, 1×1. Predicting from all of these layers gives the multi-scale effect. The construction code is below. Since Conv6 and Conv7 were already built in the vgg function above, the input here is the 19×19×1024 feature map produced by Conv7. cfg lists the channel counts of the convolutional layers to build, and 'S' marks a layer that needs stride=2.

```python
import torch.nn as nn

cfg = [256, 'S', 512, 128, 'S', 256, 128, 256, 128, 256]

def add_extras(cfg, i, batch_norm=False):
    # Extra layers added to VGG for feature scaling
    # 1x1 and 3x3 convolutions alternate
    layers = []
    in_channels = i
    flag = False
    for k, v in enumerate(cfg):
        if in_channels != 'S':
            if v == 'S':
                layers += [nn.Conv2d(in_channels, cfg[k + 1],
                           kernel_size=(1, 3)[flag], stride=2, padding=1)]
            else:
                layers += [nn.Conv2d(in_channels, v, kernel_size=(1, 3)[flag])]
            flag = not flag
        in_channels = v
    return layers
```
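Applying the standard convolution size formula, out = floor((n + 2p − k)/s) + 1, to the eight layers this cfg produces shows how 19×19 shrinks to 1×1. A small sketch (conv_out and the conv8_1…conv11_2 naming are illustrative, not code from the repository); note the last two 3×3 convs use stride 1 and no padding:

```python
def conv_out(n, k, stride=1, pad=0):
    # standard conv output size: floor((n + 2*pad - k) / stride) + 1
    return (n + 2 * pad - k) // stride + 1

# (kernel, stride, pad) of the 8 layers add_extras builds from cfg above
specs = [(1, 1, 0), (3, 2, 1),   # conv8_1, conv8_2:  19 -> 19 -> 10
         (1, 1, 0), (3, 2, 1),   # conv9_1, conv9_2:  10 -> 10 -> 5
         (1, 1, 0), (3, 1, 0),   # conv10_1, conv10_2: 5 -> 5 -> 3
         (1, 1, 0), (3, 1, 0)]   # conv11_1, conv11_2: 3 -> 3 -> 1
size, sizes = 19, []
for k, s, p in specs:
    size = conv_out(size, k, s, p)
    sizes.append(size)
print(sizes)  # [19, 10, 10, 5, 5, 3, 3, 1]
```

Every second entry (the 3×3 convs) is what actually changes the resolution, which is why multibox below indexes extra_layers[1::2].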

3. The multibox layers

These are the classification and localization-regression layers corresponding to the classifier in Figure 1. loc_layers consists of six 3×3 convolutional layers with output dimension default_box_num * 4, and conf_layers consists of six 3×3 convolutional layers with output dimension default_box_num * class_num.

```python
import torch.nn as nn

cfg = [4, 6, 6, 6, 4, 4]  # number of default boxes per prediction layer

def multibox(vgg, extra_layers, cfg, num_classes):
    loc_layers = []   # two lists: loc regression layers and conf prediction layers
    conf_layers = []
    # In the vgg layers, the conv outputs at indices 21 and -2 are the
    # 38x38x512 and 19x19x1024 feature maps used for prediction
    vgg_source = [21, -2]
    for k, v in enumerate(vgg_source):
        loc_layers += [nn.Conv2d(vgg[v].out_channels,
                                 cfg[k] * 4, kernel_size=3, padding=1)]
        conf_layers += [nn.Conv2d(vgg[v].out_channels,
                        cfg[k] * num_classes, kernel_size=3, padding=1)]
    # k starts at 2; take every second entry of extra_layers, i.e. the 3x3 convs
    for k, v in enumerate(extra_layers[1::2], 2):
        loc_layers += [nn.Conv2d(v.out_channels, cfg[k]
                                 * 4, kernel_size=3, padding=1)]
        conf_layers += [nn.Conv2d(v.out_channels, cfg[k]
                                  * num_classes, kernel_size=3, padding=1)]
    return vgg, extra_layers, (loc_layers, conf_layers)
```
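As a sanity check on cfg, the total number of default boxes over the six prediction layers works out to 8732, the well-known SSD300 figure:

```python
feature_maps = [38, 19, 10, 5, 3, 1]  # spatial size of each prediction layer
cfg = [4, 6, 6, 6, 4, 4]              # default boxes per location
total = sum(f * f * n for f, n in zip(feature_maps, cfg))
print(total)  # 8732 = 5776 + 2166 + 600 + 150 + 36 + 4
```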

4. Generating the default boxes

This code lives in the PriorBox class in the file prior_box.py.

For each feature map, k default boxes are generated per location according to the different scales and aspect ratios.

```python
def forward(self):
    """
    Parameters during testing:
    feature_maps:  [38, 19, 10, 5, 3, 1]
    steps:         [8, 16, 32, 64, 100, 300]
    min_sizes:     [30, 60, 111, 162, 213, 264]
    max_sizes:     [60, 111, 162, 213, 264, 315]
    aspect_ratios: [[2], [2, 3], [2, 3], [2, 3], [2], [2]]
    """
    mean = []
    # iterate over the 6 feature maps
    for k, f in enumerate(self.feature_maps):
        # iterate over every location of the feature map
        for i, j in product(range(f), repeat=2):
            f_k = self.image_size / self.steps[k]  # feature-map size the original image maps onto
            # unit center x,y: box center from the usual formula, normalized per location
            cx = (j + 0.5) / f_k
            cy = (i + 0.5) / f_k

            # aspect_ratio: 1 -- a square default box with side min_size
            # rel size: min_size
            s_k = self.min_sizes[k] / self.image_size
            mean += [cx, cy, s_k, s_k]

            # aspect_ratio: 1 -- add one extra scale for ratio 1
            # rel size: sqrt(s_k * s_(k+1)), where s_(k+1) comes from this layer's max_size
            s_k_prime = sqrt(s_k * (self.max_sizes[k] / self.image_size))
            mean += [cx, cy, s_k_prime, s_k_prime]

            # rest of aspect ratios: default boxes whose w and h follow the formulas below
            for ar in self.aspect_ratios[k]:
                mean += [cx, cy, s_k * sqrt(ar), s_k / sqrt(ar)]
                mean += [cx, cy, s_k / sqrt(ar), s_k * sqrt(ar)]
    # back to torch land; output is an (8732, 4) Tensor
    output = torch.Tensor(mean).view(-1, 4)
    if self.clip:
        output.clamp_(max=1, min=0)  # clamp output to the range [0, 1]
    return output
```
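As a worked example of the loop above (a standalone sketch, not code from the repository), here are the default boxes for the last 1×1 feature map, where k = 5, step = 300, min_size = 264, max_size = 315 and aspect_ratios = [2]:

```python
from math import sqrt

image_size, f, step = 300, 1, 300   # last feature map: 1x1, step 300
min_size, max_size, ratios = 264, 315, [2]

boxes = []
for i in range(f):
    for j in range(f):
        f_k = image_size / step
        cx, cy = (j + 0.5) / f_k, (i + 0.5) / f_k    # center (0.5, 0.5)
        s_k = min_size / image_size                  # 264/300 = 0.88
        boxes.append([cx, cy, s_k, s_k])             # ratio 1, scale s_k
        s_k_prime = sqrt(s_k * max_size / image_size)
        boxes.append([cx, cy, s_k_prime, s_k_prime]) # ratio 1, extra scale
        for ar in ratios:                            # ratios 2 and 1/2
            boxes.append([cx, cy, s_k * sqrt(ar), s_k / sqrt(ar)])
            boxes.append([cx, cy, s_k / sqrt(ar), s_k * sqrt(ar)])

print(len(boxes))  # 4 boxes: 2 + 2*len(ratios), matching cfg[5] = 4
```

The per-location count 2 + 2·len(aspect_ratios) reproduces the cfg list [4, 6, 6, 6, 4, 4] from the multibox section, and the clamp in forward is what trims boxes like the ratio-2 one here, whose width exceeds 1.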