Imperial College

Research Group

Visual Information Processing in Multi-Camera Systems

We consider a dense multi-camera system in which a large number of cameras monitor a scene from multiple viewpoints. The aim is to fuse the acquired data to perform segmentation, scene interpretation and classification, preferably automatically. Traditional algorithms do not scale with the number of cameras and become impractical when the number of acquired images is large. The data acquired by multiple cameras from multiple viewpoints can be parameterized with a single function called the plenoptic function. The aim therefore is to perform segmentation and scene interpretation directly in the plenoptic domain. Segmentation is achieved using the level set method and, by segmenting the multi-view images jointly, we can deal with occlusions efficiently. The extracted hypervolumes can be used for classification, interpolation or augmented reality, as shown below.

Figure 1: (a) Duck sequence: three images from a set of 32 multi-view images. (b) Extracted hypervolumes: an example of a hypervolume extracted and then interpolated using the level set method. (c) Disocclusion: the duck hypervolume is removed and what is 'behind' it is estimated. (d) Augmented reality: a synthetic object is inserted.
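
The idea of treating all views as one volume, and of removing an object's hypervolume to recover what lies behind it, can be illustrated with a toy example. The sketch below is not the group's method: it assumes cameras evenly spaced along a line, a one-dimensional scene made of fronto-parallel layers with integer disparities, and a segmentation that is already known (no level set evolution is performed). The function names `render_epi` and `remove_layer` are hypothetical. Under these assumptions each layer traces a slanted band in the stacked views, and a removed band can be filled from views in which the background point is not occluded.

```python
import numpy as np

def render_epi(num_views, width, layers):
    """Stack 1D views from cameras evenly spaced on a line (toy assumption).

    Each layer is (start_x, texture, disparity): in view v the layer starts
    at start_x + disparity * v. A larger disparity means a closer layer.
    Returns (epi, labels), both of shape (num_views, width).
    """
    epi = np.zeros((num_views, width))
    labels = np.full((num_views, width), -1, dtype=int)
    # Paint far layers first so that nearer layers occlude them.
    order = sorted(range(len(layers)), key=lambda i: layers[i][2])
    for v in range(num_views):
        for i in order:
            start, texture, disparity = layers[i]
            x0 = start + disparity * v
            lo, hi = max(x0, 0), min(x0 + len(texture), width)
            if lo < hi:
                epi[v, lo:hi] = texture[lo - x0:hi - x0]
                labels[v, lo:hi] = i
    return epi, labels

def remove_layer(epi, labels, target, bg_disparity):
    """Erase layer `target` and fill each hole pixel from another view.

    A background point seen at x in view v appears at
    x + bg_disparity * (v2 - v) in view v2. The first view in which that
    position is not covered by the target layer supplies the value.
    Pixels with no such view keep their original value.
    """
    num_views, width = epi.shape
    out = epi.copy()
    for v, x in zip(*np.nonzero(labels == target)):
        for v2 in range(num_views):
            x2 = x + bg_disparity * (v2 - v)
            if 0 <= x2 < width and labels[v2, x2] != target:
                out[v, x] = epi[v2, x2]
                break
    return out

# Toy scene: a textured background (layer 0) and a small foreground
# object (layer 1) that moves three pixels per view.
background = (-10, np.arange(60, dtype=float), 1)
foreground = (5, np.full(4, 100.0), 3)
epi, labels = render_epi(8, 40, [background, foreground])

# The foreground "hypervolume" is the set of samples labelled 1.
hypervolume_size = int((labels == 1).sum())

# Remove it and estimate what is behind from the other views.
filled = remove_layer(epi, labels, target=1, bg_disparity=1)
```

In this toy scene the foreground band has a steeper slope than the background because it is closer to the cameras, which is what allows every occluded background sample to be found in some other view. With real images the disparities and the segmentation would have to be estimated rather than given.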

To probe further, check out the following videos and Jesse Berent's web page.

Main Publication:

PhD Students:
Jesse Berent and Yizhou (Eagle) Wang.

Collaborations and Interactions: M. Brookes (ICL), M. Vetterli (EPFL).
