Contextual Eyes: A Context-Aware Surveillance System

  • Jacobs, Nathan (PI)

Grants and Contracts Details


This proposal is about making computer vision work in real, outdoor scenes, all the time. This is not possible within the classic computer vision framework, which seeks algorithms that work on one image or one video at a time. But today, essentially all imagery comes with GPS data and time-stamps, and therefore all imagery can be analyzed in the context of known weather conditions, known digital elevation maps, approximate 3D geometry, and example images taken earlier from similar locations. The goal of my proposal is to use this contextual information to improve the accuracy and reliability of computer vision algorithms, especially for problems of detection, tracking, and activity recognition.

Approach

To organize our approach, we will use a geo-temporal image formation model, which describes the probabilistic relationships between image appearance and the underlying contextual cues. These cues, such as the time of day and the weather, will be used to select specialized, context-sensitive algorithms. We will use a geo-spatial sensing almanac to store and provide access to the contextual information; this eliminates the need to guess the local conditions. The image formation model, together with the almanac, provides a principled means of using previous imagery and other sensor data from the same location, as well as from other, similar locations.

Application Areas

In this proposal, we use this framework to improve the traditional surveillance problems of detection, tracking, and activity recognition. Explicitly modeling the causes of image changes will improve the performance of object detection (e.g., fewer false positives and false negatives) and make tracking algorithms work in imaging conditions where tracking is not possible today. Using the contextual cues of place, time, and weather naturally extends beyond low-level image features to also give context for activity detection.
Long-Term Vision

My research agenda is to understand the natural world through long-term observations, and to understand and generate the contextual cues that support fast, accurate interpretation of current imagery. The core of this effort is the development of a framework for interpreting imagery, which I call the geo-temporal image formation model. It models how time, geo-location, weather, scene objects, and the camera itself together determine the captured image. Current computer vision algorithms fail because of the challenges of real, natural scenes. Explicitly modeling these challenges, such as fog, specular glare, and shadows, gives a concrete direction for building systems that can overcome them.

Expertise

As part of my thesis, I built the largest academic data repository of outdoor time-lapse imagery (more than 60 million images from 13,000 outdoor cameras). This data set is widely shared, and I have published the first papers on camera geo-location, geo-calibration, and the use of imagery at varying time scales for 3D model estimation and improved surveillance and tracking. I have also worked for Object Video on production systems in use for surveillance.
Effective start/end date: 4/20/11 to 7/31/15

