Grants and Contracts Details
Description
As NGA and other government agencies collect more data, it becomes increasingly difficult for analysts to keep up. Requiring analysts to reason across multiple modalities of data and/or viewpoints adds considerable complexity to this task, resulting in either significantly slower analysis or analysts skipping data entirely. The proposed work seeks to automate the aggregation of multimodal data, thereby enabling analysts to more easily access and analyze the data relevant to an operation. One can imagine boosting the performance of both human-performed and algorithm-performed tasks (such as object detection and change detection) by automatically aggregating relevant data across modalities.
To keep this work grounded and ensure that success is measurable, we intend to focus on the problem of unmanned aerial system (UAS) localization. Obtaining an accurate georeferenced pose of a UAS is critical to determining the geospatial positions of targets recognized in sensor data. However, GPS-denied environments and inaccurate GPS make this problem much more difficult. Having an up-to-date base map (e.g., a satellite image of the area) for registration enables an alternative way of estimating an accurate georeferenced pose. However, one does not always have access to up-to-date base maps. Furthermore, if we can utilize base maps derived from drastically different modalities and perspectives than those of the sensors on board the UAS, then we can dramatically broaden the utility of these methods.
Our proposed work will enable geolocating a UAS within (possibly out-of-date) base maps from different modalities by leveraging recent developments in deep learning. In other words, we seek to develop algorithms capable of learning common, cross-modal, and semantic representations between UAS sensor data and base maps, enabling localization that leverages all available data sources. We envision investigating modalities such as synthetic-aperture radar (SAR), electro-optical (EO), audio, and magnetic data, as well as varying perspectives (e.g., nadir vs. oblique views for imagers). As opposed to algorithms that learn low-level visual/3D features, which are unlikely to generalize across modalities, we will investigate algorithms capable of extracting higher-level semantic representations (e.g., road networks, water sources) across modalities. We will also target architectures capable of learning how to use the 3D layout of the scene as the UAS navigates the environment.
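The description above stays at the conceptual level. As an illustration only, the minimal sketch below shows one common way a shared cross-modal embedding could be set up for matching UAS sensor patches against base-map tiles: two modality-specific encoders projected into a common space and trained with a contrastive (InfoNCE-style) loss. The module names, network sizes, channel counts, and loss choice are our own assumptions for illustration, not details drawn from the award.

```python
# Illustrative sketch only: one possible cross-modal embedding setup for
# matching UAS sensor patches against base-map tiles. Architecture sizes,
# names, and the contrastive loss are assumptions, not details of the award.
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_encoder(in_channels: int, embed_dim: int = 128) -> nn.Module:
    """Small convolutional encoder mapping an image patch to an embedding."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(64, embed_dim),
    )


class CrossModalMatcher(nn.Module):
    """Two modality-specific encoders projecting into a shared embedding space."""

    def __init__(self, uas_channels: int = 3, map_channels: int = 1, embed_dim: int = 128):
        super().__init__()
        self.uas_encoder = make_encoder(uas_channels, embed_dim)  # e.g., oblique EO frames
        self.map_encoder = make_encoder(map_channels, embed_dim)  # e.g., nadir SAR base-map tiles

    def forward(self, uas_patch: torch.Tensor, map_tile: torch.Tensor):
        z_uas = F.normalize(self.uas_encoder(uas_patch), dim=-1)
        z_map = F.normalize(self.map_encoder(map_tile), dim=-1)
        return z_uas, z_map


def contrastive_loss(z_uas: torch.Tensor, z_map: torch.Tensor, temperature: float = 0.07):
    """Symmetric InfoNCE: matched UAS/map pairs share the same batch index."""
    logits = z_uas @ z_map.t() / temperature
    targets = torch.arange(z_uas.size(0), device=z_uas.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    model = CrossModalMatcher()
    uas = torch.randn(8, 3, 64, 64)   # batch of UAS EO patches (dummy data)
    maps = torch.randn(8, 1, 64, 64)  # corresponding base-map tiles (dummy data)
    z_u, z_m = model(uas, maps)
    loss = contrastive_loss(z_u, z_m)
    loss.backward()
    # At inference, a UAS embedding would be matched against a grid of
    # pre-computed base-map tile embeddings; the best match gives a coarse
    # geolocation that can seed a finer pose estimate.
    print(f"contrastive loss: {loss.item():.3f}")
```

A contrastive formulation is only one way to obtain a shared representation; the proposal also emphasizes higher-level semantic cues (e.g., road networks, water sources), which could instead be extracted per modality and matched geometrically.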
| Status | Finished |
| --- | --- |
| Effective start/end date | 6/1/20 → 5/31/22 |