CoSPE: Computer vision based Scene Parameter Estimation

The Project Vision

The world we see is the result of light interacting with materials that have certain reflectance and transmittance properties. When we move around, we implicitly build a map of the surrounding scene and know where the light sources and objects are. We know how the light sources are oriented and what kind they are, and we have an idea of the objects' orientation and surface properties, e.g., whether they reflect glossily or diffusely. We use this information to control our movements, to recognise objects, learn about them and appreciate them, or simply to avoid being dazzled by a glossy surface by slightly changing our viewing direction.

Machines can handle only a small subset of what humans can do, e.g., building a 3D map or estimating the lighting conditions in rather constrained environments. Enabling machines to estimate scene maps quantitatively, as humans do qualitatively, would open up various applications, among them photorealistic computer graphics and robust computer vision methods, e.g., for autonomous vehicle navigation.

Background & Potential Impact

Image formation is the result of light interacting with materials. The radiation emitted by a light source hits a material's surface at a certain angle, where it is reflected, absorbed, and transmitted depending on the material's properties, e.g., whether it is transparent or reflects glossily or diffusely. The reflected light may hit other objects, causing interreflections, and one object may occlude another object's reflections or a light source, resulting in shadows.

Creating artificial images that look as realistic as the real world has been a major focus of computer graphics research over the last two to three decades. Nowadays, mathematical descriptions of light reflection and transport exist, as well as algorithms to synthesise physically accurate, photorealistic images.
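
One widely used description of this kind is the rendering equation. The form below follows common textbook notation and is given only as an illustration; the symbols are assumptions of this sketch and are not taken from the project material.

```latex
L_o(x, \omega_o) = L_e(x, \omega_o)
  + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (n \cdot \omega_i)\, \mathrm{d}\omega_i
```

Here L_o is the radiance leaving surface point x in direction \omega_o, L_e is the radiance emitted by the surface itself, L_i is the radiance arriving from direction \omega_i, f_r is the material's reflectance function (BRDF), n is the surface normal, and \Omega is the hemisphere above x. Rendering evaluates this relation for a known scene, whereas scene parameter estimation tries to recover f_r and the illumination from observed values of L_o.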

The process of synthesising an image is called rendering in computer graphics. Rendering an image requires a 3D description of the scene to be visualised, including light sources and material properties. To make rendered images of real world scenes appear photorealistic, the light sources and material properties - also called scene parameters - need to be accurately measured or estimated. Although very impressive computer graphics applications showing artificially generated real world scenes can be seen in movies such as "Jurassic Park" and "The Lord of the Rings", recovering 3D scene descriptions with light source and material properties is currently only possible if special calibration objects are present in the scene. Furthermore, manual interaction is required for recovering these scene descriptions. The most convenient way of obtaining the scene parameters would be to capture images of the scene and invert the visualisation process, i.e., to perform 3D reconstruction and inverse rendering (scene parameter estimation). This is an active research area that is attracting more and more international attention because of its potentially large number of applications.
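
The relation between rendering and inverse rendering can be illustrated with a deliberately simplified sketch. It assumes a purely diffuse (Lambertian) surface with one constant albedo, a single distant light source, known surface normals, and no interreflections or cast shadows. These are exactly the kind of restrictive assumptions discussed on this page, and the function names (render_lambertian, estimate_light, sample_normals) are hypothetical; this is not the project's method.

```python
import math
import random


def sample_normals(count, seed=0):
    """Random unit normals facing the camera (z >= 0); illustration data only."""
    rng = random.Random(seed)
    normals = []
    for _ in range(count):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        normals.append([v[0] / norm, v[1] / norm, abs(v[2]) / norm])
    return normals


def render_lambertian(normals, light, albedo):
    """Forward rendering: intensity = albedo * max(0, n . l) per surface point."""
    return [albedo * max(0.0, sum(a * b for a, b in zip(n, light)))
            for n in normals]


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - rest) / m[r][r]
    return x


def estimate_light(normals, intensities):
    """Inverse rendering: least-squares fit of s = albedo * l from lit points.

    Points with zero intensity are treated as self-shadowed and skipped,
    because they do not follow the linear model I = n . s.
    """
    lit = [(n, i) for n, i in zip(normals, intensities) if i > 0.0]
    # Normal equations (N^T N) s = N^T I of the least-squares problem.
    ata = [[sum(n[r] * n[c] for n, _ in lit) for c in range(3)] for r in range(3)]
    atb = [sum(n[r] * i for n, i in lit) for r in range(3)]
    s = solve_3x3(ata, atb)
    albedo = math.sqrt(sum(v * v for v in s))  # |s| = albedo since |l| = 1
    return [v / albedo for v in s], albedo


if __name__ == "__main__":
    scale = math.sqrt(0.3 ** 2 + 0.2 ** 2 + 0.9 ** 2)
    true_light = [0.3 / scale, 0.2 / scale, 0.9 / scale]
    normals = sample_normals(200)
    image = render_lambertian(normals, true_light, albedo=0.6)
    light, albedo = estimate_light(normals, image)
    print("estimated light direction:", light)
    print("estimated albedo:", albedo)
```

With noise-free synthetic data the estimate matches the parameters used for rendering up to rounding error. Real images violate the assumptions above (glossy surfaces, interreflections, cast shadows, unknown geometry), which is why the general problem remains difficult.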

Inverse rendering and 3D reconstruction enable more than re-visualisation of a given scene. More interestingly, the scene may be "relit", e.g., to simulate other light sources, to soften the intensity difference between sunlit and shadowed regions, or to simulate another time of day. Consider for example the image of a kitchen below, where the window area is too bright whereas the walls next to the window are too dark, and the light reflections on the kitchen worktop are too yellow whereas those on the floor are too bluish. Although the image correctly shows what the scene looks like, it is far from what we would perceive if we were physically there. We would be able to see both the window and the wall "normally", and we would compensate for the differences in illumination colour. Relighting could improve the image by, e.g., simulating the use of a flash or the illumination at another time of day.
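
As a minimal sketch of what relighting means once the scene parameters are known, the following assumes a diffuse (Lambertian) surface, known surface normals, and a single distant light source whose direction has already been estimated. The function name relight and its parameters are hypothetical and chosen for illustration only.

```python
def relight(intensities, normals, old_light, new_light):
    """Relight diffuse surface points.

    For each point the albedo is recovered from the observed intensity and
    the shading under the old light, then re-shaded under the new light.
    Points that received no light originally yield None, because their
    albedo cannot be recovered from the image alone.
    """
    def shade(normal, light):
        # Lambertian shading term max(0, n . l)
        return max(0.0, sum(a * b for a, b in zip(normal, light)))

    relit = []
    for intensity, normal in zip(intensities, normals):
        old_shading = shade(normal, old_light)
        if old_shading <= 1e-9:
            relit.append(None)
            continue
        albedo = intensity / old_shading
        relit.append(albedo * shade(normal, new_light))
    return relit


if __name__ == "__main__":
    normals = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
    observed = [0.5, 0.0]  # second point was unlit in the original image
    print(relight(observed, normals, (0.0, 0.0, 1.0), (0.6, 0.0, 0.8)))
```

The None entries mark a practical limitation: regions that were in shadow in the original image carry no information about their material under this simple model, so additional images or assumptions are needed to relight them.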

Knowing the scene parameters also makes it possible to remove objects from the scene or to add new virtual objects to it, e.g., in order to visualise a house to be built in a certain area. Given the 3D model, illumination, and material properties, it is then possible to visualise the scene including the interreflections and shadows caused by the old (real) and the new (virtual) objects. This is called Augmented or Mixed Reality, a mixture of "reality" and "virtual reality", which is also becoming increasingly popular for outdoor applications, e.g., in urban planning.

Another area where robust scene parameter estimation would be the key to a major step forward is computer vision. The majority of current computer vision methods work only in constrained laboratory environments. For example, no existing method is able to detect the kitchen floor in the image above, with its bluish areas, as one single object. Being able to estimate illumination and reflectance would enable robust segmentation. Other computer vision application areas are autonomous robot navigation and driver assistance systems that estimate the surface properties of the road, classify them as wet, dry, or icy, and warn the driver according to the road conditions. The estimates may even be used to change the parameters of the car's ESP (Electronic Stability Program) and ABS (Anti-lock Braking System).

Many approaches to 3D reconstruction and inverse rendering from images of a scene have been proposed in the literature. Although these two areas depend heavily on each other, i.e., inverse rendering requires an accurate 3D model of the scene, and 3D reconstruction could profit from knowledge of the scene parameters, they belong to two different research communities. While inverse rendering research is traditionally done in the computer graphics community, 3D reconstruction is a computer vision field. In both fields the usability of the methods is limited because several assumptions have to be made, e.g., only diffuse surfaces, controllable illumination, no interreflections, or the presence of calibration objects such as a mirrored sphere in the images. Most methods are restricted to static scenes and laboratory conditions - they cannot cope with the high intensity ranges inherent to outdoor scenes. Furthermore, within inverse rendering only a few approaches address both problems, the estimation of illumination and of material properties, at the same time.

The hypothesis of this project is that combining several complementary inverse rendering and 3D reconstruction methods will make it possible to iteratively refine the outputs of these methods, i.e., the estimates of the scene parameters and the scene models. This will allow some of the usual assumptions to be relaxed and will result in higher accuracy and robustness. In particular, we propose to further develop and fuse several inverse rendering methods and to combine them with state-of-the-art 3D reconstruction methods in order to achieve more robust performance in unconstrained outdoor scenes with high intensity ranges.




Maintained by the CVMT webmasters <webmaster@cvmt.dk>
Last updated: Fri Apr 08, 2005
Last modified: Thu Apr 21 16:21:16 MET DST 2005