Abstract
We propose to use deep convolutional neural networks to address the problem of cross-view image geolocalization, in which the geolocation of a ground-level query image is estimated by matching to georeferenced aerial images. We use state-of-the-art feature representations for ground-level images and introduce a cross-view training approach for learning a joint semantic feature representation for aerial images. We also propose a network architecture that fuses features extracted from aerial images at multiple spatial scales. To support training these networks, we introduce a massive database that contains pairs of aerial and ground-level images from across the United States. Our methods significantly outperform the state of the art on two benchmark datasets. We also show, qualitatively, that the proposed feature representations are discriminative at both local and continental spatial scales.
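The matching step described in the abstract can be illustrated with a minimal sketch. It assumes the features have already been extracted by some network, that multi-scale aerial features are fused by simple concatenation, and that matching is nearest-neighbor search under Euclidean distance. None of these choices is stated in the abstract, and the names `fuse_scales` and `localize` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Location = Tuple[float, float]  # (latitude, longitude)


def fuse_scales(per_scale_features: Sequence[Vector]) -> List[float]:
    """Fuse aerial features from several spatial scales.

    Assumption: fusion is plain concatenation; the abstract does not
    specify the fusion rule used by the proposed architecture.
    """
    fused: List[float] = []
    for feature in per_scale_features:
        fused.extend(feature)
    return fused


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def localize(query: Vector,
             reference: Sequence[Tuple[Location, Vector]]) -> Location:
    """Return the location of the aerial feature closest to the query.

    `query` is a ground-level image feature; `reference` pairs each
    georeferenced aerial feature with its (lat, lon). Assumption: both
    live in the same joint feature space.
    """
    if not reference:
        raise ValueError("reference set is empty")
    best_location, _ = min(reference, key=lambda item: euclidean(query, item[1]))
    return best_location


# Toy example with made-up 2-D features at two scales per aerial image.
reference_set = [
    ((38.0, -84.5), fuse_scales([[0.0, 1.0], [1.0, 0.0]])),
    ((40.7, -74.0), fuse_scales([[1.0, 0.0], [0.0, 1.0]])),
]
estimate = localize([0.9, 0.1, 0.1, 0.9], reference_set)
```

In this toy setup the query is closest to the second reference feature, so `estimate` is its location. A real system would replace the brute-force `min` with an approximate nearest-neighbor index when the aerial database is large.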
Original language | English |
---|---|
Title of host publication | 2015 International Conference on Computer Vision, ICCV 2015 |
Pages | 3961-3969 |
Number of pages | 9 |
ISBN (Electronic) | 9781467383912 |
State | Published - Feb 17 2015 |
Event | 15th IEEE International Conference on Computer Vision, ICCV 2015 - Santiago, Chile Duration: Dec 11 2015 → Dec 18 2015 |
Publication series
Name | Proceedings of the IEEE International Conference on Computer Vision |
---|---|
Volume | 2015 International Conference on Computer Vision, ICCV 2015 |
ISSN (Print) | 1550-5499 |
Conference
Conference | 15th IEEE International Conference on Computer Vision, ICCV 2015 |
---|---|
Country/Territory | Chile |
City | Santiago |
Period | 12/11/15 → 12/18/15 |
Bibliographical note
Publisher Copyright: © 2015 IEEE.
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition