Grants and Contracts Details
Description
This proposal is about making computer vision work in real, outdoor scenes, all the time. This is not possible within the classic computer vision framework, which seeks algorithms that work one image or one video at a time. But today essentially all imagery comes with GPS coordinates and time-stamps, so it can be analyzed in the context of known weather conditions, known digital elevation maps, approximate 3D geometry, and example images taken earlier from similar locations. The goal of my proposal is to use this contextual information to improve the accuracy and reliability of computer vision algorithms, especially for the problems of detection, tracking, and activity recognition.
Approach
To organize our approach, we will use a geo-temporal image formation model, which describes the probabilistic relationships between image appearance and the underlying contextual cues. These cues, such as the time of day and the weather, will enable the use of specialized, context-sensitive algorithms. We will use a geo-spatial sensing almanac to store and provide access to the contextual information, which eliminates the need to guess the local conditions. Together, the image formation model and the almanac provide a principled means of using previous imagery and other sensor data from the same location, as well as from other, similar locations.
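To make the idea concrete, the following is a minimal sketch, not the proposal's actual system, of how a sensing almanac keyed by location and time might supply contextual cues, and how those cues could select an operating point for a context-sensitive detector. The names `SensingAlmanac`, `Context`, and `detection_threshold`, and the specific numeric adjustments, are illustrative assumptions rather than anything specified in the proposal.

```python
# Hypothetical sketch: an almanac of contextual cues that conditions a detector.
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Context:
    weather: str   # e.g. "clear", "fog", "rain" (assumed categories)
    sun_up: bool   # coarse stand-in for illumination


class SensingAlmanac:
    """Toy store of per-location, per-hour contextual cues."""

    def __init__(self):
        self._records = {}  # (lat, lon, hour) -> Context

    def add(self, lat: float, lon: float, when: datetime, ctx: Context) -> None:
        self._records[(round(lat, 2), round(lon, 2), when.hour)] = ctx

    def lookup(self, lat: float, lon: float, when: datetime) -> Context:
        # Fall back to a neutral default when no record exists,
        # rather than guessing local conditions from the image alone.
        return self._records.get(
            (round(lat, 2), round(lon, 2), when.hour),
            Context(weather="unknown", sun_up=6 <= when.hour <= 18),
        )


def detection_threshold(ctx: Context) -> float:
    """Pick a detector operating point from the contextual cues (illustrative)."""
    base = 0.5
    if ctx.weather == "fog":
        base += 0.20   # demand more evidence in low-contrast scenes
    if not ctx.sun_up:
        base += 0.15   # night-time imagery is noisier
    return min(base, 0.95)


almanac = SensingAlmanac()
almanac.add(38.03, -84.50, datetime(2011, 4, 20, 7), Context("fog", True))
ctx = almanac.lookup(38.03, -84.50, datetime(2011, 4, 20, 7))
print(ctx, detection_threshold(ctx))
```

In a full system along the lines described above, the almanac would presumably be populated from archived imagery, weather records, and elevation data, and the contextual cues would parameterize the detection and tracking algorithms themselves rather than just a threshold.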
Application Areas
In this proposal, we use this framework to improve the traditional surveillance problems of detection, tracking, and activity recognition. Explicitly modeling the causes of image changes will improve the performance of object detection (e.g., fewer false positives and false negatives) and make tracking algorithms work in imaging conditions where current methods fail. The contextual cues of place, time, and weather naturally extend beyond low-level image features and also provide context for activity detection.
Long-Term Vision
My research agenda is to understand the natural world through long-term observation, and to understand and generate the contextual cues that support fast, accurate interpretation of current imagery. The core of this effort is the development of a framework for interpreting imagery, which I call the geo-temporal image formation model; it models how time, geo-location, weather, scene objects, and the camera itself determine the captured image. Current computer vision algorithms fail because of the challenges of real, natural scenes. Explicitly modeling these challenges, such as fog, specular glare, and shadows, gives a concrete direction forward for building systems that can overcome them.
Expertise
As part of my thesis, I built the largest academic data repository of outdoor time-lapse imagery (more than 60 million images from 13,000 outdoor cameras). This data set is widely shared, and I have published the first papers on camera geo-location, geo-calibration, and the use of imagery at varying time-scales for 3D model estimation and improved surveillance and tracking. I have also worked for ObjectVideo on production surveillance systems.
| Status | Finished |
| --- | --- |
| Effective start/end date | 4/20/11 → 4/19/12 |