To complete or to estimate, that is the question: A Multi-Task Depth Completion and Monocular Depth Estimation
Authors
Atapour-Abarghouei, Amir; Breckon, Toby P.
Dr Amir Atapour-Abarghouei (Assistant Professor), amir.atapour-abarghouei@durham.ac.uk
Professor Toby Breckon, toby.breckon@durham.ac.uk
Abstract
Robust three-dimensional scene understanding is an ever-growing area of research, highly relevant to many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based model capable of performing two tasks: sparse depth completion (i.e. generating complete dense scene depth given a sparse depth image as input) and monocular depth estimation (i.e. predicting scene depth from a single RGB image) via two sub-networks jointly trained end-to-end using data randomly sampled from a publicly available corpus of synthetic and real-world images. The first sub-network generates a sparse depth image by learning lower-level features from the scene, and the second predicts a full dense depth image of the entire scene, leading to a better geometric and contextual understanding of the scene and, as a result, superior performance of the approach. The entire model can be used to infer complete scene depth from a single RGB image, or the second network can be used alone to perform depth completion given a sparse depth input. Using adversarial training, a robust objective function, a deep architecture relying on skip connections and a blend of synthetic and real-world training data, our approach is capable of producing superior, high-quality scene depth. Extensive experimental evaluation demonstrates the efficacy of our approach compared to contemporary state-of-the-art techniques across both problem domains.
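To illustrate the two-stage pipeline the abstract describes, the following is a minimal, hypothetical PyTorch sketch: one sub-network maps an RGB image to a sparse depth image, and a second skip-connected sub-network completes it into dense depth. The class names (SkipEncoderDecoder, MultiTaskDepth), layer sizes and the tiny encoder-decoder shape are all illustrative assumptions, not the authors' architecture; the adversarial training and robust objective function mentioned in the abstract are omitted.

```python
# Illustrative sketch only: NOT the authors' implementation.
import torch
import torch.nn as nn


class SkipEncoderDecoder(nn.Module):
    """Small encoder-decoder with a skip connection (layer sizes are assumptions)."""

    def __init__(self, in_ch, out_ch, base=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(base * 2, base, 4, stride=2, padding=1), nn.ReLU(inplace=True))
        # After the skip concatenation the decoder sees 2 * base channels.
        self.out = nn.Conv2d(base * 2, out_ch, 3, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)   # full-resolution features
        e2 = self.enc2(e1)  # downsampled features
        d1 = self.dec1(e2)  # upsample back to input resolution
        return self.out(torch.cat([d1, e1], dim=1))  # skip connection


class MultiTaskDepth(nn.Module):
    """Two jointly trained sub-networks, following the abstract:
    net1: RGB (3 ch) -> sparse depth (1 ch); net2: sparse -> dense depth."""

    def __init__(self):
        super().__init__()
        self.net1 = SkipEncoderDecoder(in_ch=3, out_ch=1)  # monocular -> sparse depth
        self.net2 = SkipEncoderDecoder(in_ch=1, out_ch=1)  # sparse -> dense depth

    def forward(self, rgb):
        sparse = self.net1(rgb)    # estimate sparse depth from a single RGB image
        dense = self.net2(sparse)  # complete it into full dense scene depth
        return sparse, dense


if __name__ == "__main__":
    model = MultiTaskDepth()
    rgb = torch.randn(1, 3, 128, 256)  # dummy RGB input
    sparse, dense = model(rgb)
    print(sparse.shape, dense.shape)   # both torch.Size([1, 1, 128, 256])
```

Used this way, the full model performs monocular depth estimation from RGB, while net2 on its own performs depth completion from a sparse depth input, mirroring the two usage modes described in the abstract.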
Citation
Atapour-Abarghouei, A., & Breckon, T. P. (2019, September). To complete or to estimate, that is the question: A Multi-Task Depth Completion and Monocular Depth Estimation. Presented at International Conference on 3D Vision, Quebec
| Presentation Conference Type | Conference Paper (published) |
| --- | --- |
| Conference Name | International Conference on 3D Vision |
| Start Date | Sep 16, 2019 |
| End Date | Sep 19, 2019 |
| Acceptance Date | Jul 30, 2019 |
| Publication Date | Sep 1, 2019 |
| Deposit Date | Aug 14, 2019 |
| Publicly Available Date | Nov 12, 2019 |
| Pages | 183-193 |
| Series ISSN | 2475-7888 |
| Book Title | Proceedings of 2019 International Conference on 3D Vision (3DV) |
| DOI | https://doi.org/10.1109/3dv.2019.00029 |
| Public URL | https://durham-repository.worktribe.com/output/1141730 |
Files
Accepted Conference Proceeding (PDF, 8.8 MB)
Copyright Statement
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.