Xingyu Miao xingyu.miao@durham.ac.uk
PGR Student Doctor of Philosophy
Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields
Miao, Xingyu; Duan, Haoran; Bai, Yang; Shah, Tejal; Song, Jun; Long, Yang; Ranjan, Rajiv; Shao, Ling
Authors
Haoran Duan
Yang Bai
Tejal Shah
Jun Song
Dr Yang Long yang.long@durham.ac.uk
Associate Professor
Rajiv Ranjan
Ling Shao
Abstract
In this work, we propose a method that leverages CLIP feature distillation to achieve efficient 3D segmentation through language guidance. Unlike previous methods that rely on multi-scale CLIP features and are limited by processing speed and storage requirements, our approach streamlines the workflow by directly and effectively distilling dense CLIP features, thereby achieving precise segmentation of 3D scenes using text. To achieve this, we introduce an adapter module and mitigate the noise issue in the dense CLIP feature distillation process through a self-cross-training strategy. Moreover, to enhance the accuracy of segmentation edges, this work presents a low-rank transient query attention mechanism. To ensure the consistency of segmentation for similar colors under different viewpoints, we convert the segmentation task into a classification task through a label volume, which significantly improves the consistency of segmentation in color-similar areas. We also propose a simplified text augmentation strategy to alleviate the ambiguity in the correspondence between CLIP features and text. Extensive experimental results show that our method surpasses current state-of-the-art methods in both training speed and performance. Our code is available at https://github.com/xingy038/Laser.git.
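The idea of treating text-guided segmentation as classification can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and small hand-written vectors stand in for distilled per-point features and CLIP text embeddings. Each point is simply assigned the text prompt whose embedding has the highest cosine similarity to its feature.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def classify_by_text(features, text_embeddings):
    """Assign each feature vector to its most similar text embedding.

    features:        (num_points, dim) array, stand-in for distilled features
    text_embeddings: (num_classes, dim) array, stand-in for text embeddings
    Returns (labels, similarity): labels has shape (num_points,),
    similarity has shape (num_points, num_classes).
    """
    f = l2_normalize(np.asarray(features, dtype=np.float64))
    t = l2_normalize(np.asarray(text_embeddings, dtype=np.float64))
    similarity = f @ t.T  # cosine similarity of every point to every prompt
    return similarity.argmax(axis=1), similarity


# Toy data (assumed for illustration): two prompts, three points, dim = 3.
text_embeddings = np.array([[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0]])
features = np.array([[0.9, 0.1, 0.0],
                     [0.2, 0.8, 0.1],
                     [2.0, 0.5, 0.0]])

labels, similarity = classify_by_text(features, text_embeddings)
# labels is [0, 1, 0]: points 0 and 2 match prompt 0, point 1 matches prompt 1
```

Normalizing both sides makes the assignment depend only on direction, so the third point is labeled the same way as the first despite its larger magnitude.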
Citation
Miao, X., Duan, H., Bai, Y., Shah, T., Song, J., Long, Y., Ranjan, R., & Shao, L. (online). Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, https://doi.org/10.1109/TPAMI.2025.3535916
Journal Article Type | Article |
---|---|
Acceptance Date | Jan 1, 2025 |
Online Publication Date | Jan 29, 2025 |
Deposit Date | Mar 6, 2025 |
Publicly Available Date | Mar 11, 2025 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Print ISSN | 0162-8828 |
Electronic ISSN | 1939-3539 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
DOI | https://doi.org/10.1109/TPAMI.2025.3535916 |
Public URL | https://durham-repository.worktribe.com/output/3681375 |
Files
Accepted Journal Article
(57.1 Mb)
PDF
Licence
http://creativecommons.org/licenses/by/4.0/
Copyright Statement
This accepted manuscript is licensed under the Creative Commons Attribution 4.0 licence. https://creativecommons.org/licenses/by/4.0/