VersatileGaussian: Real-time Neural Rendering for Versatile Tasks using Gaussian Splatting

Renjie Li¹, Zhiwen Fan²†, Bohua Wang³, Peihao Wang², Zhangyang Wang², Xi Wu⁴
¹THU, ²UT Austin, ³Baidu, ⁴CUIT
† Project Lead
ECCV 2024, Milan

VersatileGaussian unifies Gaussian Splatting and multi-task learning to represent versatile label maps, and runs in real time (35 FPS) while rendering any viewpoint in a large-scale scene (left). VersatileGaussian achieves optimal performance in multi-task metrics and rendering efficiency (right) when synthesizing various label maps.


Abstract

The acquisition of multi-task (MT) labels in 3D scenes is crucial for a wide range of real-world applications. Traditional methods generally either employ an analysis-by-synthesis approach, generating 2D label maps on novel synthesized views, or utilize a Neural Radiance Field (NeRF) that concurrently represents label maps. Yet these approaches often struggle to balance inference efficiency with MT label quality. Specifically, they face limitations such as (a) rendering speeds constrained by NeRF pipelines, and (b) an implicit representation of MT fields, which can result in continuity artifacts during rendering. Recently, 3D Gaussian Splatting has shown promise in achieving real-time rendering speeds without compromising rendering quality. In this work, we address the challenge of enabling 3D Gaussian Splatting to represent versatile MT labels. Simply attaching MT attributes to explicit Gaussians compromises rendering quality due to the lack of cross-task information flow during optimization. We introduce architectural and rasterizer designs to overcome this issue. Our VersatileGaussian model associates Gaussians with shared MT features and incorporates a feature map rasterizer. The cornerstone of this versatile rasterization is the Task Correlation Attention module, which fosters cross-task correlations through a soft weighting mechanism that disseminates task-specific knowledge. Experiments on the ScanNet and Replica datasets show that VersatileGaussian not only sets a new benchmark in MT accuracy but also maintains real-time rendering speeds (35 FPS). Importantly, this model design facilitates mutual benefits across tasks, leading to improved quality in novel view synthesis.
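To make the idea of a feature map rasterizer concrete, the sketch below applies the standard front-to-back alpha compositing rule of Gaussian Splatting to feature vectors instead of colors, for a single pixel. It is an illustration under stated assumptions, not the released implementation: the function name `composite_features`, the list-based inputs, and the early-termination threshold are all hypothetical, and the real rasterizer would run per tile on the GPU.

```python
def composite_features(features, alphas, min_transmittance=1e-4):
    """Blend per-Gaussian feature vectors for one pixel, front to back.

    features: non-empty list of equal-length feature vectors, sorted near to far.
    alphas:   opacity of each Gaussian at this pixel, each in [0, 1].
    Returns the blended feature vector and the remaining transmittance.
    (Names and threshold are assumptions made for this sketch.)
    """
    dim = len(features[0])
    pixel = [0.0] * dim
    transmittance = 1.0  # fraction of the ray not yet blocked
    for feat, alpha in zip(features, alphas):
        weight = alpha * transmittance  # contribution of this Gaussian
        for k in range(dim):
            pixel[k] += weight * feat[k]
        transmittance *= 1.0 - alpha
        if transmittance < min_transmittance:
            break  # later Gaussians are effectively occluded
    return pixel, transmittance


# Two Gaussians with 2-D features, each half opaque at this pixel.
blended, remaining = composite_features([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
```

Because the blending weights depend only on opacity and depth order, one rasterization pass of this kind could in principle produce a shared feature map from which every task is decoded, rather than one pass per task.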


Overall Pipeline

The pipeline of VersatileGaussian. VersatileGaussian represents versatile labels as view-direction-dependent and view-direction-independent features on the Gaussians. After fast rasterization of the feature maps, a Task Correlation Attention module facilitates cross-task information flow, contributing to better rendering quality.
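The soft weighting idea behind cross-task information flow can be sketched for a single pixel as follows. This is a generic scaled dot-product attention across task features, swapped in for illustration; the page does not specify the actual design of the Task Correlation Attention module, and the names `softmax` and `task_correlation_mix` as well as the dictionary layout are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def task_correlation_mix(task_feats):
    """Mix per-task feature vectors at one pixel with soft attention weights.

    task_feats: dict mapping a task name to its feature vector (equal lengths).
    Returns a dict with the same keys; each output is a convex combination of
    all task features, weighted by similarity to that task's own feature.
    (Hypothetical stand-in, not the authors' module.)
    """
    names = list(task_feats)
    dim = len(task_feats[names[0]])
    scale = math.sqrt(dim)
    mixed = {}
    for query in names:
        # Similarity of the query task to every task, including itself.
        scores = [
            sum(a * b for a, b in zip(task_feats[query], task_feats[key])) / scale
            for key in names
        ]
        weights = softmax(scores)
        mixed[query] = [
            sum(w * task_feats[key][d] for w, key in zip(weights, names))
            for d in range(dim)
        ]
    return mixed


# Toy features for two tasks at one pixel.
mixed = task_correlation_mix({"semantic": [1.0, 0.0], "normal": [0.0, 1.0]})
```

In this toy case each task keeps most of its own feature while borrowing a softly weighted share of the other, which is the kind of knowledge sharing the caption describes. A learned module would typically add trainable projections before the similarity step.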


Rendered Videos

[Videos: four sets of rendered results, each showing RGB, Reshading, Edge, Semantic, Surface Normal, and Keypoint maps.]

BibTeX