Xingyu Miao xingyu.miao@durham.ac.uk
PGR Student Doctor of Philosophy
DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume
Miao, Xingyu; Bai, Yang; Duan, Haoran; Huang, Yawen; Wan, Fan; Xu, Xinxing; Long, Yang; Zheng, Yefeng
Authors
Yang Bai yang.bai@durham.ac.uk
PGR Student Doctor of Philosophy
Haoran Duan haoran.duan@durham.ac.uk
PGR Student Doctor of Philosophy
Yawen Huang
Fan Wan fan.wan@durham.ac.uk
PGR Student Doctor of Philosophy
Xinxing Xu
Dr Yang Long yang.long@durham.ac.uk
Associate Professor
Yefeng Zheng
Abstract
Self-supervised monocular depth estimation methods typically rely on the reprojection error to capture geometric relationships between successive frames in static environments. However, this static-scene assumption does not hold in scenes with dynamic objects, leading to errors during the view synthesis stage, such as feature mismatch and occlusion, which can significantly reduce the accuracy of the generated depth maps. To address this problem, we propose a novel dynamic cost volume that exploits residual optical flow to describe moving objects, improving incorrectly occluded regions in the static cost volumes used in previous work. Nevertheless, the dynamic cost volume inevitably generates extra occlusions and noise; we alleviate this by designing a fusion module that makes the static and dynamic cost volumes compensate for each other. In other words, occlusion from the static volume is refined by the dynamic volume, and incorrect information from the dynamic volume is eliminated by the static volume. Furthermore, we propose a pyramid distillation loss to reduce photometric error inaccuracy at low resolutions and an adaptive photometric error loss to alleviate the flow direction of the large gradient in the occlusion regions. We conducted extensive experiments on the KITTI and Cityscapes datasets, and the results demonstrate that our model outperforms previously published baselines for self-supervised monocular depth estimation.
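The idea of a static versus a flow-compensated cost volume can be illustrated with a toy 1D sketch. This is not the authors' implementation: the integer `shifts` stand in for depth hypotheses, `residual_flow` is a hand-written per-pixel object motion rather than an estimated flow field, and the names `cost_volume` and `best_shift` are hypothetical. It only shows why a matching cost built under the static assumption can select the wrong hypothesis on a moving object, and how compensating for residual motion before matching can correct it.

```python
def cost_volume(target, source, shifts, flow=None):
    """Absolute-difference matching cost for each candidate shift.

    cost[i][x] = |target[x] - source[x + shifts[i] + flow[x]]|, with the
    source index clamped to the valid range. Toy stand-in for a cost volume.
    """
    n = len(target)
    if flow is None:
        flow = [0] * n  # static assumption: no per-pixel object motion
    volume = []
    for d in shifts:
        row = []
        for x in range(n):
            xs = min(max(x + d + flow[x], 0), n - 1)  # clamp at the borders
            row.append(abs(target[x] - source[xs]))
        volume.append(row)
    return volume


def best_shift(volume, shifts):
    """Per-pixel argmin over the hypothesis axis (first minimum wins on ties)."""
    n = len(volume[0])
    return [shifts[min(range(len(shifts)), key=lambda i: volume[i][x])]
            for x in range(n)]


# Toy data (assumed for illustration): the whole scene moves by shift 1
# between frames, and the object at target pixels 4 and 5 moves 2 pixels more.
target = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
source = [0, 10, 20, 30, 40, 1, 2, 50, 60, 90]
residual_flow = [0, 0, 0, 0, 2, 2, 0, 0, 0, 0]
shifts = [0, 1, 2, 3]

static_best = best_shift(cost_volume(target, source, shifts), shifts)
dynamic_best = best_shift(
    cost_volume(target, source, shifts, residual_flow), shifts
)

print("static :", static_best)
print("dynamic:", dynamic_best)
```

In this toy setup the static volume selects shift 3 on the object pixels (indices 4 and 5), while the flow-compensated volume recovers the scene shift of 1 there. Pixels 6 and 7 are occluded by the object in the source row and stay mismatched in both volumes, which is the kind of leftover error that a fusion step, such as the one described in the abstract, would have to handle.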
Citation
Miao, X., Bai, Y., Duan, H., Huang, Y., Wan, F., Xu, X., Long, Y., & Zheng, Y. (2024). DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume. IEEE Transactions on Circuits and Systems for Video Technology, 34(4), 2564-2576. https://doi.org/10.1109/tcsvt.2023.3305776
Journal Article Type | Article |
---|---|
Acceptance Date | Aug 15, 2023 |
Online Publication Date | Aug 15, 2023 |
Publication Date | 2024-04 |
Deposit Date | Oct 6, 2023 |
Publicly Available Date | Oct 6, 2023 |
Journal | IEEE Transactions on Circuits and Systems for Video Technology |
Print ISSN | 1051-8215 |
Electronic ISSN | 1558-2205 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 34 |
Issue | 4 |
Pages | 2564-2576 |
DOI | https://doi.org/10.1109/tcsvt.2023.3305776 |
Public URL | https://durham-repository.worktribe.com/output/1758337 |
Files
Accepted Journal Article (PDF, 17.5 Mb)
Copyright Statement
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.