Direct pose regression using deep convolutional neural networks has become a highly active research area. However, despite significant performance improvements in recent years, the best results still come from training distinct, scene-specific networks. We propose a novel architecture, Multi-Scene PoseNet (MSPN), that allows a single network to be used on an arbitrary number of scenes with only a small scene-specific component. Using our approach, we achieve competitive performance on two benchmark 6DOF datasets, Microsoft 7Scenes and Cambridge Landmarks, while significantly reducing the total number of network parameters. Additionally, we demonstrate that our trained model serves as a better initialization for fine-tuning on new scenes than the standard ImageNet initialization, converging to lower-error solutions within only a few epochs.
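The parameter saving claimed in the abstract can be illustrated with simple arithmetic. The sketch below is not taken from the paper: it assumes the network splits into a large shared backbone and a small per-scene component, and the sizes `BACKBONE`, `HEAD`, and `SCENES` are hypothetical values chosen only for illustration.

```python
def separate_params(backbone: int, head: int, num_scenes: int) -> int:
    """Total parameters when every scene gets its own full network."""
    return num_scenes * (backbone + head)


def shared_params(backbone: int, head: int, num_scenes: int) -> int:
    """Total parameters with one shared backbone and a small component per scene."""
    return backbone + num_scenes * head


# Hypothetical sizes for illustration only (assumptions, not figures from the paper).
BACKBONE = 10_000_000  # parameters in the shared feature extractor
HEAD = 100_000         # parameters in each scene-specific component
SCENES = 7             # number of scenes served by the model

separate = separate_params(BACKBONE, HEAD, SCENES)
shared = shared_params(BACKBONE, HEAD, SCENES)
print(f"separate networks: {separate:,} parameters")
print(f"shared network:    {shared:,} parameters")
print(f"reduction factor:  {separate / shared:.2f}x")
```

Under these assumed sizes, the cost of the scene-specific networks grows with the full network size per scene, while the shared design grows only by the small per-scene component, so the gap widens as more scenes are added.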
|Title of host publication||Proceedings - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2020|
|Number of pages||9|
|State||Published - Jun 2020|
|Event||2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2020 - Virtual, Online, United States (Duration: Jun 14 2020 → Jun 19 2020)|
|Name||IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops|
|Conference||2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2020|
|Period||6/14/20 → 6/19/20|
Bibliographical note
Funding Information:
Acknowledgements: We gratefully acknowledge the support of the US Air Force Research Laboratory, Sensors Directorate (FA8650-13-D-1547) and the National Science Foundation (IIS-1553116).
© 2020 IEEE.
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering