Uploaded: 2021-01-24 07:01:34 · PDF file · 872.41 KB · Popularity: 16

G-RCN: Optimizing the Gap between Classification and Localization Tasks for Object Detection

Multi-task learning is widely used in computer vision. Currently, object detection models use a shared feature map to complete the classification and localization tasks simultaneously. By comparing the performance of the original Faster R-CNN with that of a variant with partially separated feature maps, we show that: (1) sharing high-level features between the classification and localization tasks is sub-optimal; (2) a large stride is beneficial for classification but harmful for localization; (3) global context information can improve classification performance. Based on these findings, we propose a paradigm called the Gap-optimized region-based convolutional network (G-RCN), which aims to separate these two tasks and optimize the gap between them. The paradigm is first applied to correct the current ResNet protocol by simply reducing the stride and moving the Conv5 block from the head to the feature extraction network, which brings a 3.6 improvement in AP70 on the PASCAL VOC dataset and a 1.5 improvement in AP on the COCO dataset for ResNet50. Next, the new method is applied to Faster R-CNN with VGG16, ResNet50, and ResNet101 backbones, which brings an improvement of more than 2.0 in AP70 on the PASCAL VOC dataset and more than 1.9 in AP on the COCO dataset. Notably, implementing G-RCN involves only a few structural modifications, with no extra modules added.
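
The ResNet change described in the abstract (reduce the stride, move Conv5 from the head into the feature extractor) can be illustrated with simple stride bookkeeping. The sketch below is not the authors' code. It assumes the standard ResNet per-stage strides, and it assumes, as one possible reading of "reducing the stride", that Conv5's stride is set to 1; the abstract does not state the exact values used. The names Stage, split, and total_stride are hypothetical helpers.

```python
from dataclasses import dataclass, replace
from math import prod


@dataclass(frozen=True)
class Stage:
    name: str
    stride: int  # spatial downsampling factor contributed by this stage


# Standard ResNet strides: the stem (7x7 conv, stride 2, then max-pool,
# stride 2) downsamples by 4; conv3, conv4 and conv5 each downsample by 2.
RESNET_STAGES = (
    Stage("stem", 4),
    Stage("conv2", 1),
    Stage("conv3", 2),
    Stage("conv4", 2),
    Stage("conv5", 2),
)


def total_stride(stages):
    """Cumulative downsampling factor of a sequence of stages."""
    return prod(s.stride for s in stages)


def split(stages, head_names):
    """Partition stages into (feature extractor, per-region head)."""
    backbone = tuple(s for s in stages if s.name not in head_names)
    head = tuple(s for s in stages if s.name in head_names)
    return backbone, head


# Baseline Faster R-CNN ResNet layout: Conv5 lives in the head.
baseline_backbone, baseline_head = split(RESNET_STAGES, {"conv5"})

# Moving Conv5 into the feature extractor without touching its stride.
naive_backbone, naive_head = split(RESNET_STAGES, set())

# ASSUMPTION: "reducing the stride" is modeled here as Conv5 stride = 1.
reduced_stages = tuple(
    replace(s, stride=1) if s.name == "conv5" else s for s in RESNET_STAGES
)
reduced_backbone, reduced_head = split(reduced_stages, set())

print("baseline feature stride:", total_stride(baseline_backbone))
print("Conv5 moved, stride kept:", total_stride(naive_backbone))
print("Conv5 moved, stride reduced:", total_stride(reduced_backbone))
```

Under these assumptions, moving Conv5 alone would double the feature-map stride from 16 to 32, which finding (2) suggests would hurt localization; lowering Conv5's stride keeps the shared feature map at stride 16 while still taking Conv5 out of the head.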
