In the following sections, we summarize the most important research results within the cluster context. These results form the basis for key parts of the VR demonstrator.
Web technologies provide the basis for distributing digital information worldwide and have changed the way we live and work. A case in point from the recent past is video. Although the technology had long been available, it was little used. This changed dramatically once video became available on the Web: it is now a multi-billion industry. We see a similar situation with interactive 3D: all the required technology is already present in every PC and most mobile phones, and it is waiting to be unleashed by extending the core Web technologies to support interactive 3D content.
Instead of adapting existing graphics technologies to the Web, Karrenberg et al. took a more radical approach when developing XML3D: they start from today's Web technology and try to find the minimum set of extensions that fully supports interactive 3D content as an integral part of mixed 2D/3D Web documents. The design of XML3D is based on modern programmable graphics hardware and reuses much of the other technology developed in RA7. They demonstrate the feasibility of their approach by integrating XML3D support into two major open browser frameworks from Mozilla and WebKit, as well as by providing a portable implementation based on JavaScript and WebGL. XML3D and related technologies are now developed jointly by groups at the MMCI, Intel VCI, and DFKI.
Accurately rendering glossy materials remains a major challenge. In collaboration with Cornell University, Davidovic et al. introduced a Monte Carlo solution that separately solves for the low-rank and high-rank (but sparse, i.e., glossy) components of illumination in a scene. For the low-rank component, they introduce visibility clustering and other approximations. For the high-rank component, they use a local light technique to correct for any missing illumination. Compared to competing techniques, the approach achieves accurate gloss rendering in minutes, making it suitable for previewing applications, such as industrial design, e-commerce, and architecture, where material appearance is critical.
Many lighting simulation techniques, such as Instant Global Illumination, start their computations from the light sources. Their drawback is that most of these computations end up being irrelevant for most views in a large virtual environment. To mitigate this problem, Georgiev et al. have developed a simple and practical algorithm for importance sampling virtual point lights (VPLs). During VPL distribution, a Russian roulette decision accepts each VPL with a probability proportional to its estimated contribution to the final image. As a result, more VPLs are concentrated in areas that illuminate the visible parts of the scene, at the cost of a negligible overhead in the preprocessing phase. Most interestingly, the approach is trivial to parallelize and remains efficient for low sampling rates.
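The Russian roulette decision described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `accept_vpls`, the `(position, power)` layout of a VPL, the `estimate_contribution` callback, and the `target` normalization are all assumptions made for this example. The sketch keeps each candidate with a probability proportional to its estimated contribution (capped at 1) and divides the power of surviving VPLs by that probability, the standard way to keep a Russian roulette estimate unbiased.

```python
import random


def accept_vpls(candidates, estimate_contribution, target, rng=None):
    """Filter candidate VPLs by Russian roulette (illustrative sketch).

    candidates: list of (position, power) tuples (hypothetical layout).
    estimate_contribution: hypothetical callback returning a non-negative
        estimate of a VPL's contribution to the final image.
    target: contribution level at or above which a VPL is always kept
        (an assumed normalization, not taken from the source).
    """
    rng = rng or random.Random(0)
    accepted = []
    for position, power in candidates:
        # Acceptance probability proportional to the estimate, capped at 1.
        p = min(1.0, estimate_contribution(position, power) / target)
        if p > 0.0 and rng.random() < p:
            # Compensate survivors so the expected total power is unchanged.
            accepted.append((position, power / p))
    return accepted
```

Candidates whose estimated contribution is zero are never stored, which is how the VPL budget ends up concentrated on lights that matter for the current view.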
Computing global illumination in complex scenes is a demanding task even with today's computational power. Herzog et al. proposed a novel irradiance caching scheme that combines the advantages of two state-of-the-art algorithms for high-quality global illumination rendering: Lightcuts, an adaptive and hierarchical instant-radiosity based algorithm, and the widely used (ir)radiance caching algorithm for sparse sampling and interpolation of (ir)radiance in object space. They achieve significantly better image quality while also speeding up the computation by one to two orders of magnitude with respect to the well-known photon mapping with (ir)radiance caching procedure.
Recent approaches to global illumination for dynamic scenes achieve interactive frame rates by using coarse approximations to geometry, lighting, or both, which limits scene complexity and rendering quality. Ritschel et al. presented an efficient and scalable method to compute global illumination solutions at interactive rates for complex and dynamic scenes. Their method is based on parallel final gathering running entirely on the GPU. At each final gathering location they perform micro-rendering: they traverse and rasterize a hierarchical point-based scene representation into an importance-warped micro-buffer, which allows for BRDF importance sampling. The approach allows quality to be traded for speed by reducing the sampling rate of the gathering locations in conjunction with bilateral upsampling.
A new approach to screen space ambient occlusion (SSAO) has been presented by Ritschel et al., which adds effects such as directional shadows and indirect color bleeding. The proposed generalization has only a small overhead compared to classic SSAO, approximates direct and one-bounce light transport in screen space, can be combined with other methods that simulate transport for macro structures, and is visually equivalent to SSAO in the worst case without introducing new artifacts. Since their method works in screen space, it does not depend on geometric complexity. Plausible directional occlusion and indirect lighting effects can be displayed for large and fully dynamic scenes at real-time frame rates.
Another approach brings the user into the loop to obtain special effects, such as changing the reflections on an object in ways that contradict physical laws to better match an artistic vision. Ritschel et al. introduce a system that transforms physically correct reflections according to reflection constraints. The system introduces a taxonomy of reflection editing operations and offers an intuitive user interface that works directly on the reflecting surfaces, with real-time visual feedback computed on the GPU. A user study shows that such a system allows users to quickly manipulate reflections according to an art direction task.
Hierarchical Spatial Index Structures
A major factor for the efficiency of ray tracing is the use of good acceleration structures. While the exponential nature of the problem prohibits constructing optimal Bounding Volume Hierarchies (BVHs) for ray tracing for any meaningful scene, Popov et al. found a way to exploit the linearity of the surface area heuristic (SAH) to develop an algorithm that can find optimal partitions in polynomial time. A generalized version of this algorithm can be shown to encompass every SAH-based kd-tree or BVH construction algorithm as a special case. They also observed that enforcing space subdivision helps to improve BVH performance, which finally allowed for developing a simple space partitioning algorithm for building highly efficient BVHs. This work explores the theoretical foundations of this field and nicely complements our and others' previous papers on the topics.
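For readers unfamiliar with the SAH, the following sketch shows the conventional greedy use of the heuristic, not the optimal-partition algorithm of Popov et al. It evaluates the standard cost c_trav + c_isect * (SA(L) * N_L + SA(R) * N_R) / SA(P) for every split of a centroid-sorted list of bounding boxes along one axis. The function names, the box layout `((x0, y0, z0), (x1, y1, z1))`, and the cost constants are assumptions made for this example, and the input is assumed to contain at least two boxes with a non-degenerate common bound.

```python
def surface_area(box):
    """Surface area of an axis-aligned box ((x0, y0, z0), (x1, y1, z1))."""
    (x0, y0, z0), (x1, y1, z1) = box
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def union(a, b):
    """Smallest axis-aligned box enclosing boxes a and b."""
    return (tuple(min(p, q) for p, q in zip(a[0], b[0])),
            tuple(max(p, q) for p, q in zip(a[1], b[1])))


def best_sah_split(boxes, axis=0, c_trav=1.0, c_isect=1.0):
    """Return (cost, k): the first k centroid-sorted boxes go to the left child."""
    order = sorted(boxes, key=lambda b: b[0][axis] + b[1][axis])
    n = len(order)
    # right_bounds[k] encloses order[k:], built by a backward sweep.
    right_bounds = [None] * n
    acc = order[-1]
    for i in range(n - 1, -1, -1):
        acc = union(acc, order[i])
        right_bounds[i] = acc
    parent_area = surface_area(right_bounds[0])
    best = (float("inf"), None)
    left = order[0]
    for k in range(1, n):
        left = union(left, order[k - 1])  # encloses order[:k]
        cost = c_trav + c_isect * (
            surface_area(left) * k + surface_area(right_bounds[k]) * (n - k)
        ) / parent_area
        if cost < best[0]:
            best = (cost, k)
    return best
```

The cost is a sum of one term per child, each depending only on that child's bounds and primitive count, which makes a sweep over candidate splits cheap to evaluate.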
While hierarchical kd-trees and BVHs are known to accelerate ray tracing well, they are still rather slow to build. In parallel work, Kalojanov et al. explored the use of GPUs to more quickly build indices for simple grids and, more recently, also for multi-level grids. A key feature of the approach is that its performance no longer depends on the primitive distribution. Instead, they reduce the problem to sorting pairs of primitives and cell indices. The implementation is able to take full advantage of the parallel architecture of the GPU, and construction speed is significantly improved.
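The reduction to sorting can be sketched sequentially as follows. This is a CPU illustration under stated assumptions, not the GPU implementation of Kalojanov et al.: the function name `build_grid`, the bounding-box input, and the `cell_start`/`prim_ids` output layout are choices made for this example. Every primitive emits one (cell index, primitive id) pair per overlapped cell, the pairs are sorted, and the start of each cell's range is recovered from the sorted list. On a GPU, the sort and the range extraction would be data-parallel operations.

```python
import math


def build_grid(boxes, grid_min, cell_size, resolution):
    """Build a uniform grid by sorting (cell index, primitive id) pairs.

    Returns (cell_start, prim_ids): the primitives overlapping cell c are
    prim_ids[cell_start[c]:cell_start[c + 1]].
    """
    nx, ny, nz = resolution

    def cell_range(lo, hi, axis):
        # Cell interval covered along one axis, clamped to the grid.
        first = int(math.floor((lo - grid_min[axis]) / cell_size))
        last = int(math.floor((hi - grid_min[axis]) / cell_size))
        top = resolution[axis] - 1
        return max(0, min(top, first)), max(0, min(top, last))

    pairs = []
    for prim_id, (lo, hi) in enumerate(boxes):
        x0, x1 = cell_range(lo[0], hi[0], 0)
        y0, y1 = cell_range(lo[1], hi[1], 1)
        z0, z1 = cell_range(lo[2], hi[2], 2)
        for z in range(z0, z1 + 1):
            for y in range(y0, y1 + 1):
                for x in range(x0, x1 + 1):
                    pairs.append(((z * ny + y) * nx + x, prim_id))

    pairs.sort()  # the step a GPU build would hand to a parallel sort

    # Count pairs per cell, then prefix-sum the counts into range starts.
    num_cells = nx * ny * nz
    cell_start = [0] * (num_cells + 1)
    for cell, _ in pairs:
        cell_start[cell + 1] += 1
    for c in range(num_cells):
        cell_start[c + 1] += cell_start[c]
    return cell_start, [prim_id for _, prim_id in pairs]
```

The work is driven by the number of emitted pairs rather than by where the primitives lie, which is consistent with the observation that performance no longer depends on the primitive distribution.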
High-Performance Algorithms
So far, researchers had focused on a few specific and highly optimized combinations of data structures and algorithms when building real-time ray tracers. For the first time, the RTfact system developed by Georgiev et al. combined such high performance with the flexibility usually expected of a software solution. Using template meta-programming, it follows a component-oriented, generic, and portable design approach without sacrificing the performance benefits of hand-tuned single-purpose implementations. This generic design with loosely coupled algorithms and data structures allows for easy integration of new algorithms with maximum runtime performance, while leveraging as much of the existing code base as possible.
Today all rendering systems, including RTfact, still need shading languages to express the complicated features of real materials. Shaders are a key element in rendering models of the real world realistically. However, these "plugins" must operate in the innermost loop of arbitrary renderers and are called tens of millions of times per second, requiring flexible transformations and optimization while being integrated with existing code. In joint work between Philipp Slusallek's group and the SIF group led by Sebastian Hack, Karrenberg et al. developed AnySL, a completely new approach to a flexible but highly efficient shading system. It uses an embedded compiler for flexible, non-standard optimizations, based on "subroutine threaded code" and "type replacement." For some architectures, it performs fully automatic SIMD vectorization of entire functions, improving performance by a factor of 3.9 on average on SSE. In follow-up work (in submission), this has been extended into the first OpenCL compiler that fully exploits the SIMD units of current CPUs. These results are instrumental for XML3D and the use of interactive 3D graphics on the Web, where portable material libraries are an essential ingredient. This work is closely related to joint research on new parallel languages with Reinhard Wilhelm. More recently, we have been merging both lines of research to develop novel programming models and compiler techniques for high-performance rendering, image processing, and other domains.
Hierarchical and Sparse Object Representations
Joint research activities in this area are discussed in more detail under RA2.
Adaptive Computations
Currently 3D animation rendering and video compression are completely independent processes even if frames are compressed right after rendering for streaming purposes. In such a scenario, dynamic adjustment of the rendering quality to the dynamic requirements of a given client population can significantly improve performance. Herzog et al. presented a framework where the renderer and MPEG codec are coupled through a straightforward interface providing precise motion vectors from the rendering side to the codec and perceptual error thresholds for each pixel in the opposite direction. The perceptual error thresholds take into account bandwidth-dependent quantization errors resulting from the lossy compression as well as image content-dependent luminance and spatial contrast masking.
High-refresh-rate displays (e.g., 120Hz and more) have recently become available on the consumer market, reducing the perceived blur created by moving objects tracked by the human eye. Didyk et al. showed how rendered three-dimensional images produced by recent graphics hardware can be up-sampled in time more efficiently, simultaneously resulting in higher quality. The algorithm relies on several perceptual findings and preserves the naturalness of the original sequence. A psychophysical study validated their approach and illustrated that temporally up-sampled video streams are preferred over the standard low-rate input by the majority of users.
High Dynamic Range (HDR) Imaging has been a focus of our research for many years. It includes the handling and compression of HDR images, tone mapping, LDR expansion and many related techniques. A recent State-of-the-Art report summarizes many of these techniques.
Related research has focused on enhancing the contrast of displayed 3D scenes. In a study, Ihrke et al. investigated the perceptual impact of such enhancement techniques in synthesized scenes. Their algorithm extends traditional image-based unsharp masking to a 3D scene, achieving a scene-coherent enhancement. They conducted a standardized perceptual experiment to test the proposition that a 3D unsharp enhanced scene is superior to the original scene in terms of perceived contrast and preference, with very positive results.
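As background, traditional image-based unsharp masking adds a scaled difference between a signal and its blurred copy back to the signal. The sketch below shows this classical operation on a one-dimensional signal only. It does not reproduce the scene-coherent 3D extension, and the box filter, the function names, and the `radius` and `strength` parameters are assumptions made for this example (a Gaussian filter is the more common choice in practice).

```python
def box_blur(signal, radius):
    """Average each sample with its neighbors within `radius` (clamped at the ends)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def unsharp_mask(signal, radius=1, strength=1.0):
    """Classical unsharp masking: signal + strength * (signal - blurred)."""
    blurred = box_blur(signal, radius)
    return [s + strength * (s - b) for s, b in zip(signal, blurred)]
```

Applied to a step edge such as `[0, 0, 0, 1, 1, 1]`, the result undershoots just before the edge and overshoots just after it, while flat regions are left unchanged. This local exaggeration is what raises perceived contrast.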
Visual glare is a consequence of light scattered within the human eye when looking at bright light sources. Even though most, if not all, subjects report perceiving glare as a bright pattern that fluctuates in time, it had only been modeled as a static phenomenon. Ritschel et al. argue that the temporal properties of glare are a strong means to increase perceived brightness and to produce realistic and attractive renderings of bright light sources. This allows an improved depiction of HDR images on LDR media in interactive applications such as games, in feature films, or even in initially static HDR images, to which movement can be added.
Algorithms for Large Data Sets
Large data sets often come from scanning real-world objects. Wand et al. presented a new technique for reconstructing a single shape and its nonrigid motion from 3D scanning data. Its novel representation yields dense correspondences for the whole sequence, as well as a completed 3D shape in every frame. Their reconstruction framework is based on a novel topology-aware adaptive subspace deformation technique that allows long sequences with complex geometry to be handled efficiently.
Voxels are an alternative to the commonly used triangles. Especially for complex entities, triangles have difficulty representing details convincingly, and faithful approximations quickly become costly. Eisemann et al. proposed a new approach to efficiently render large volumetric data sets. The system achieves interactive to real-time rendering performance for several billion voxels. It is based on an adaptive data representation that depends on the current view and occlusion information, coupled with an efficient ray-casting rendering algorithm.
Molecular visualization is one of the cornerstones of structural bioinformatics and related fields. In joint work between the groups of Andreas Hildebrandt and Philipp Slusallek, Marsalek et al. demonstrated how real-time ray tracing integrated into a molecular modelling and visualization tool allows for a better understanding of the complex structural arrangement of biomolecules. Their technique integrates naturally into the full-featured molecular modelling and visualization tool BALLView, seamlessly extending a standard workflow with interactive high-quality rendering.
At the same time, molecular geometric properties, such as volume, exposed surface area, and occurrence of internal cavities, are important inputs for many applications in molecular modeling. In the same joint collaboration, we used high-performance ray casting for the computation of such quantities at interactive speed, even for models of huge molecular arrangements.