| CoSPE: Computer vision based Scene Parameter Estimation |
|---|
Machines can handle only a small subset of what humans can do: they can, for example, build a 3D map or estimate the lighting conditions only in rather constrained environments. Enabling machines to estimate scene maps quantitatively, as humans do qualitatively, would open up various applications, among them photorealistic computer graphics and robust computer vision methods, e.g., for autonomous vehicle navigation.
Creating artificial images that look as realistic as the real world has been a major focus of computer graphics research over the past two to three decades. Today, mathematical descriptions of light reflection and transport exist, as do algorithms for synthesising physically accurate, photorealistic images.
The process of synthesising an image is called rendering in computer graphics. Rendering an image requires a 3D description of the scene to be visualised, including its light sources and material properties. For rendered images of real-world scenes to appear photorealistic, the light sources and material properties (together called scene parameters) need to be accurately measured or estimated. Although very impressive computer graphics showing artificially generated real-world scenes can be seen in movies such as "Jurassic Park" and "The Lord of the Rings", recovering 3D scene descriptions with light sources and material properties is currently possible only if special calibration objects are present in the scene. Furthermore, manual interaction is required to recover these scene descriptions. The most convenient way of obtaining them would be to capture images of the scene and invert the visualisation process, i.e., to perform 3D reconstruction and inverse rendering (scene parameter estimation). This is an active research area that is attracting growing international attention because of its large number of potential applications.
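The relation between rendering and inverse rendering can be illustrated with a minimal sketch. It assumes a purely diffuse (Lambertian) surface lit by a single directional light, which is far simpler than real scenes, and the function names `render_lambertian` and `estimate_albedo` are hypothetical names chosen for this illustration. Light directions point from the surface towards the light.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def render_lambertian(albedo, normal, light_dir, light_intensity):
    """Forward rendering: intensity of one diffuse surface point.

    Assumes a single directional light and no interreflections.
    """
    n = normalize(normal)
    l = normalize(light_dir)
    # Cosine of the incidence angle, clamped so back-facing light adds nothing.
    cos_theta = max(0.0, sum(a * b for a, b in zip(n, l)))
    return albedo * light_intensity * cos_theta

def estimate_albedo(pixel, normal, light_dir, light_intensity):
    """Inverse rendering of one parameter: recover the albedo from a pixel,
    assuming geometry and illumination are already known."""
    shading = render_lambertian(1.0, normal, light_dir, light_intensity)
    if shading <= 0.0:
        return None  # point is unlit, so its albedo cannot be observed
    return pixel / shading
```

Rendering a point with albedo 0.6 and passing the resulting pixel value to `estimate_albedo` with the same normal and light recovers approximately 0.6. The hard part in practice is that geometry, illumination and material are all unknown at the same time.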
Inverse rendering and 3D reconstruction enable more than re-visualisation of a given scene. More interestingly, the scene may be "re-lit", e.g., to simulate other light sources, to soften the intensity difference between sunlit and shadowed regions, or to simulate another time of day. Consider, for example, the image of a kitchen below, where the window area is too bright whereas the walls next to the window are too dark, and the light reflections on the kitchen counter are too yellow whereas those on the floor are too bluish. Although the image correctly shows what the scene looks like, it is far from what we would perceive if we were physically there. We would be able to see both the window and the wall "normally", and would compensate for the differences in illumination colour. Re-lighting could improve the image by, e.g., simulating the use of a flash or the illumination at another time of day.
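As a rough illustration of re-lighting, the sketch below re-renders a single pixel under a different light. It assumes a diffuse surface, a known surface normal, and a known directional light at capture time; the functions `shading` and `relight` are hypothetical helpers for this example, not part of any existing tool.

```python
import math

def shading(normal, light_dir, light_intensity):
    """Lambertian shading term: light intensity times the clamped cosine
    of the angle between the surface normal and the light direction."""
    n_len = math.sqrt(sum(c * c for c in normal))
    l_len = math.sqrt(sum(c * c for c in light_dir))
    cos_theta = sum(a * b for a, b in zip(normal, light_dir)) / (n_len * l_len)
    return light_intensity * max(0.0, cos_theta)

def relight(pixel, normal, old_light, new_light):
    """Re-render a pixel under new_light, given the light it was captured under.

    Each light is a (direction, intensity) pair; directions point from the
    surface towards the light.
    """
    old = shading(normal, *old_light)
    if old <= 0.0:
        raise ValueError("point was unlit in the capture; albedo is unobservable")
    albedo = pixel / old  # inverse step: estimate the material
    return albedo * shading(normal, *new_light)  # forward step: new illumination
```

Under these assumptions, doubling the light intensity while keeping its direction doubles the pixel value. Simulating a flash or another time of day in a real scene would additionally require handling shadows, interreflections and coloured illumination.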

Knowing the scene parameters also makes it possible to remove objects from the scene or to add new virtual objects to it, e.g., in order to visualise a house to be built in a certain area. Given the 3D model, the illumination and the material properties, it is then possible to visualise the scene including the interreflections and shadows caused by the old (real) and the new (virtual) objects. This is called Augmented or Mixed Reality, a mixture of "reality" and "virtual reality", which is becoming increasingly popular, also for outdoor applications such as urban planning.
Computer vision is another area in which robust scene parameter estimation would be the key to a major step forward. The majority of current computer vision methods work only in constrained laboratory environments. No existing method is able, for example, to detect the kitchen floor in the image above, with its bluish areas, as one single object. Being able to estimate illumination and reflectance would enable robust segmentation. Other computer vision application areas are autonomous robot navigation and driver assistance systems that estimate the surface properties of the road, classify them as wet, dry or icy, and warn the driver according to the road conditions. Such estimates may even be used to change the parameters of the car's ESP (Electronic Stability Program) and ABS (Anti-lock Braking System).
Many approaches have been proposed in the literature for 3D reconstruction and inverse rendering from images of a scene. Although these two areas depend heavily on each other, i.e., inverse rendering requires an accurate 3D model of the scene, and 3D reconstruction could profit from knowledge of the scene parameters, they belong to two different research communities. While inverse rendering research is traditionally done in the computer graphics community, 3D reconstruction is a computer vision field. In both fields the usability of the methods is limited because several assumptions have to be made, e.g., only diffuse surfaces, controllable illumination, no interreflections, or the presence of calibration objects such as a mirrored sphere in the images. Most methods are restricted to static scenes and laboratory conditions; they cannot cope with the high intensity ranges inherent to outdoor scenes. Furthermore, within inverse rendering only a few approaches deal with both problems, the estimation of illumination and of material properties, at the same time.
The hypothesis of this project is that combining several complementary inverse rendering and 3D reconstruction methods will make it possible to iteratively refine the outputs of these methods, i.e., the estimates of the scene parameters and the scene models. This will allow some of the usual assumptions to be relaxed and will result in higher accuracy and robustness. In particular, we propose to further develop and fuse several inverse rendering methods and to combine them with state-of-the-art 3D reconstruction methods in order to achieve more robust performance in unconstrained outdoor scenes, including those with high intensity ranges.
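The idea of iterative refinement can be sketched with a toy example. This is not the project's method, only an illustration under strong assumptions: diffuse surfaces, one directional light per image, no shadows or interreflections, and incidence-angle cosines already supplied by a 3D reconstruction. The function `refine` and its parameters are hypothetical. It alternates between two simple estimators, one for material (albedo) given illumination and one for illumination given material, each using the other's latest output.

```python
def refine(images, cosines, iterations=50):
    """Alternate between estimating albedo and light intensity.

    images[j][i]  : observed intensity of surface point i in image j
    cosines[j][i] : cosine of the incidence angle at point i in image j
                    (assumed known, e.g. from a 3D reconstruction)
    Returns (albedos, lights). lights[0] is fixed to 1.0 because albedo and
    light intensity can only be recovered up to a common scale factor.
    Assumes every point is lit (cosine > 0) in at least one image.
    """
    n_images, n_points = len(images), len(images[0])
    lights = [1.0] * n_images  # crude initial guess
    albedos = [0.0] * n_points
    for _ in range(iterations):
        # Step 1: least-squares albedo for each point, lights held fixed.
        for i in range(n_points):
            num = sum(images[j][i] * lights[j] * cosines[j][i] for j in range(n_images))
            den = sum((lights[j] * cosines[j][i]) ** 2 for j in range(n_images))
            albedos[i] = num / den
        # Step 2: least-squares light for each image, albedos held fixed.
        for j in range(n_images):
            num = sum(images[j][i] * albedos[i] * cosines[j][i] for i in range(n_points))
            den = sum((albedos[i] * cosines[j][i]) ** 2 for i in range(n_points))
            lights[j] = num / den
        # Remove the scale ambiguity: express lights relative to image 0.
        scale = lights[0]
        lights = [e / scale for e in lights]
        albedos = [a * scale for a in albedos]
    return albedos, lights
```

Neither estimator is useful on its own, since each needs the quantity the other one produces, but alternating them lets both estimates improve together. Real outdoor scenes violate most of the assumptions listed above, which is exactly where fusing more capable, complementary methods is expected to help.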