Real-time pseudo-3D projections from webcam recordings

In this article I describe a technique for real-time pseudo-3D projections from monoscopic webcam recordings.

Current consumer 3D cameras primarily consist of two low-resolution CCD sensors and lenses in fixed positions. Devices with additional depth sensors are often restricted to certain hardware and too bulky for easy mobility. The sensors of stereoscopic cameras share a single bus, and the raw data is interleaved before transmission. With limited bandwidth and only the parallax effect available for spatial calculations, these devices are unsuitable for many applications.
Fast object movement and the high computational cost of 3D calculations often prevent real-time operation. If the purpose of the recording is only illustrative, an approximation of the depth data should be more useful, and a monoscopic device should be sufficient. Below I describe a method to “fake” the missing depth information by assuming that the captured objects resemble spheres.
[Figure: worldSpace]
With the information from an edge-detection pass, tiles are transposed and mapped onto a sphere surface. The sphere z-coordinate augments the fragment's plane position. Expensive and unreliable screen-space algorithms, such as the determination of light sources and shadow lengths, are thereby avoided.
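A minimal sketch of the sphere assumption (the function, parameter names, and tile radius are my illustrations, not code from the original project):

    #include <math.h>

    /* Sketch: reconstruct a fake depth value for a fragment inside a tile,
     * assuming the captured object is a sphere of radius r centered at
     * (cx, cy). Outside the sphere the fragment stays on the image plane. */
    float sphere_z(float x, float y, float cx, float cy, float r)
    {
        float dx = x - cx;
        float dy = y - cy;
        float d2 = dx * dx + dy * dy;
        if (d2 >= r * r)
            return 0.0f;          /* outside: keep the plane position    */
        return sqrtf(r * r - d2); /* sphere surface augments the plane z */
    }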

Example: Anaglyph

Color filters are used to create different images for the left and the right eye. This method doesn't require any advanced projection; it is sufficient to transpose and colorize the original image:

[Figure: mono3d]

To enhance the effect, it is necessary to move one copy of the image in front of and another behind the virtual focus (the center of the small sphere). Without the original image the result can be viewed here:

It is noteworthy that this method doesn't create any real 3D data; the depth processing is done solely by the brain.
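A minimal sketch of the compositing step, assuming 8-bit RGB buffers (the function name, buffer layout, and shift parameter are my assumptions): one shifted copy contributes the red channel, the other the green and blue channels.

    #include <stdint.h>

    /* Sketch: compose a red/cyan anaglyph from a single monoscopic frame.
     * src and dst are w*h RGB images, 3 bytes per pixel; shift moves one
     * copy in front of and the other behind the virtual focus. */
    static void anaglyph(const uint8_t *src, uint8_t *dst,
                         int w, int h, int shift)
    {
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int xl = x - shift;   /* sample for the left eye (red)   */
                int xr = x + shift;   /* sample for the right eye (cyan) */
                if (xl < 0) xl = 0;
                if (xl >= w) xl = w - 1;
                if (xr < 0) xr = 0;
                if (xr >= w) xr = w - 1;
                uint8_t *out = dst + 3 * (y * w + x);
                out[0] = src[3 * (y * w + xl) + 0]; /* red from one copy    */
                out[1] = src[3 * (y * w + xr) + 1]; /* green from the other */
                out[2] = src[3 * (y * w + xr) + 2]; /* blue from the other  */
            }
        }
    }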

Example: Normal Mapping

As a next step, the edge information can be used to simulate a normal map. In this example a simple Sobel operator is applied to identify the edge direction.

The sphere z-coordinate is used to calculate the slope of the normal.
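A minimal sketch of both steps (the zscale blending and the border clamping are my assumptions): the Sobel operator yields the intensity gradient, and the sphere z-coordinate sets the slope of the resulting normal.

    #include <math.h>

    /* Sketch: derive a surface normal at (x, y) from a w*h grayscale image
     * via a Sobel operator; zscale blends in the sphere z-coordinate. */
    static void sobel_normal(const float *img, int w, int h,
                             int x, int y, float zscale, float n[3])
    {
        /* border-clamped sample at (x+dx, y+dy) */
        #define P(dx, dy) img[ \
            (y+(dy) < 0 ? 0 : y+(dy) >= h ? h-1 : y+(dy)) * w + \
            (x+(dx) < 0 ? 0 : x+(dx) >= w ? w-1 : x+(dx))]
        float gx = P(1,-1) + 2*P(1,0) + P(1,1)
                 - P(-1,-1) - 2*P(-1,0) - P(-1,1);
        float gy = P(-1,1) + 2*P(0,1) + P(1,1)
                 - P(-1,-1) - 2*P(0,-1) - P(1,-1);
        #undef P

        float len = sqrtf(gx*gx + gy*gy + zscale*zscale);
        n[0] = -gx / len; /* the gradient tilts the normal off the edge */
        n[1] = -gy / len;
        n[2] = zscale / len;
    }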

Example: Displacement Mapping

Finally, the data can be applied as a displacement map. For this technique it is crucial that the main direct light source is the monitor screen; otherwise the projected 3D head gets bumpy.

Now it also becomes obvious that stereoscopic information is missing: fragments cannot be located above other fragments, and angles above 90 degrees do not exist. Unless you want to set up a 360-degree camera grid, this limitation is unavoidable.
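A minimal sketch of the displacement step over a regular grid (the grid layout and amplitude parameter are my assumptions). It also shows where the limitation comes from: one height value per grid position cannot represent overhangs.

    /* Sketch: displace a regular w*h grid by a height map with one value
     * in [0,1] per vertex; amp scales the displacement. Vertices are
     * written as (x, y, z) triples into verts. */
    static void displace_grid(const float *height, int w, int h,
                              float amp, float *verts)
    {
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                float *v = verts + 3 * (y * w + x);
                v[0] = (float)x / (w - 1);      /* plane position */
                v[1] = (float)y / (h - 1);
                /* one height per position: overhangs are impossible */
                v[2] = amp * height[y * w + x];
            }
        }
    }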

Bonus Example: Depth Field Projection

Inspired by a WebGL demo by Stephane Cuillerdier, I created a depth-field projection. The scene portal (instead of a deferred step as in the original) is segmented into a physically based shaded part (left side) and an unaltered depth-data part:

[Figure: depthFieldPrj]

The ray-marching algorithm has a very high GPU cost but is otherwise simple to program. As usual the light path is traced in reverse: a ray is followed from the eye position until it collides with the surface reconstructed from the camera image.

The uniform clientCamDispAmp can be used to manually adjust the surface position. entryPos and exitPos are the ray's intersections with the portal. The height data calculated from the image is averaged, meaning that 0.5 corresponds to height zero, and each value is also compared against its neighbors.
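A minimal sketch of the marching loop between the two portal intersections (the step count, vector type, and height lookup are my assumptions; entryPos, exitPos, and clientCamDispAmp are the names from above):

    typedef struct { float x, y, z; } vec3;

    /* Placeholder height lookup; the real version samples the averaged
     * camera-image height, where 0.5 means height zero. */
    static float height(float x, float y) { (void)x; (void)y; return 0.5f; }

    /* Sketch: march a ray from entryPos toward exitPos and stop where it
     * falls below the displaced surface. */
    static vec3 march(vec3 entryPos, vec3 exitPos, float clientCamDispAmp)
    {
        const int STEPS = 128;
        vec3 p = entryPos;
        vec3 d = { (exitPos.x - entryPos.x) / STEPS,
                   (exitPos.y - entryPos.y) / STEPS,
                   (exitPos.z - entryPos.z) / STEPS };
        for (int i = 0; i < STEPS; i++) {
            /* surface height, re-centered so that 0.5 is height zero */
            float s = clientCamDispAmp * (height(p.x, p.y) - 0.5f);
            if (p.z <= s)
                break;            /* hit the reconstructed surface */
            p.x += d.x; p.y += d.y; p.z += d.z;
        }
        return p;                 /* hit point, or exitPos if no hit */
    }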

Further improvements

A better scene prediction could also improve the quality of the approximated 3D space; for example, reliable object/head detection would be desirable. This is typically done by down-sampling the original image, followed by a pattern-recognition step:
[Figure: downScale]
If a sufficient number of model patterns match the image above, a head can be positively identified.
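A minimal sketch of the down-sampling step (a plain 2x2 box filter; the buffer layout is my assumption):

    /* Sketch: 2x2 box-filter down-sampling as preparation for pattern
     * recognition. src is a w*h grayscale image (w and h even); dst must
     * hold (w/2)*(h/2) values. */
    static void downscale2x(const float *src, int w, int h, float *dst)
    {
        for (int y = 0; y < h / 2; y++) {
            for (int x = 0; x < w / 2; x++) {
                const float *p = src + (2 * y) * w + 2 * x;
                dst[y * (w / 2) + x] =
                    0.25f * (p[0] + p[1] + p[w] + p[w + 1]); /* 2x2 mean */
            }
        }
    }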

OpenCL: Sorting Benchmark

Sorting benchmarks for different programming languages.

Here are the first results for:

Sorting 50,000,000 (50 million) random elements.

Implementation               Time (sec.)
Java 7                       55.687
Objective-C (Heap)           12.238
Scala 2.9 (Concurrent)        3.681
OpenCL (Bitonic)              0.372

For OpenCL the “copy to memory” time was included in the result; for Java and Scala the launch time was excluded.
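For reference, a minimal CPU sketch of the compare-exchange network behind the bitonic result (the OpenCL kernel itself is not shown here). Every inner pass is data-independent per element, which is what makes the network map well onto GPU work-items:

    /* Sketch: in-place bitonic sort of n elements, n a power of two. */
    static void bitonic_sort(float *a, int n)
    {
        for (int k = 2; k <= n; k <<= 1) {         /* merge size   */
            for (int j = k >> 1; j > 0; j >>= 1) { /* compare span */
                for (int i = 0; i < n; i++) {      /* one work-item each */
                    int ixj = i ^ j;
                    if (ixj > i) {
                        int up = (i & k) == 0;     /* sort direction */
                        if ((a[i] > a[ixj]) == up) {
                            float t = a[i]; a[i] = a[ixj]; a[ixj] = t;
                        }
                    }
                }
            }
        }
    }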

OpenCL: Base Functions

OpenCL programming is restricted in several ways: the local memory of a kernel is small (commonly 32 KB), OpenCL isn't available on all devices (hosts), and the driver quality differs between vendors.

Therefore I will initially limit the availability of my library to OS X systems.

I also abandoned the idea of embedding OpenCL inside a host language. I have worked with frameworks for embedded DDL/SQL, JavaScript, XML, and many more. These solutions tend to be too extensive, and performance is often lost during the “translation process”.

However, if you are interested in embedded solutions, take a look at:

As a consequence, my library will consist of OpenCL include files. Programs have to be written in the “OpenCL Programming Language”. OpenCL commands such as context creation and program compilation are managed by the framework.
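For illustration, a minimal sketch of the host-side boilerplate such a framework hides (standard OpenCL 1.x C API; error handling omitted for brevity):

    #include <stdio.h>
    #ifdef __APPLE__
    #include <OpenCL/opencl.h>
    #else
    #include <CL/cl.h>
    #endif

    /* Sketch: plain OpenCL context creation and program compilation,
     * the commands the framework is meant to manage. */
    int main(void)
    {
        cl_platform_id platform;
        cl_device_id device;
        cl_int err;

        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

        cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
        cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);

        const char *src = "__kernel void noop(void) {}";
        cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);
        clBuildProgram(prog, 1, &device, NULL, NULL, NULL);

        printf("context and program ready\n");

        clReleaseProgram(prog);
        clReleaseCommandQueue(queue);
        clReleaseContext(ctx);
        return 0;
    }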

What are the base functions?

With OpenGL and scientific applications in mind I started with:

  • Sorting
  • Searching
  • Scheduling
  • Packing
  • LS solving

and

  • N-trees

This library will complete my toolset for developing modern desktop and tablet applications.