Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers

Corona-Figueroa, Abril; Bond-Taylor, Sam; Bhowmik, Neelanjan; Gaus, Yona Falinie A.; Breckon, Toby P.; Shum, Hubert P.H.; Willcocks, Chris G.


Authors

Sam Bond-Taylor samuel.e.bond-taylor@durham.ac.uk
PGR Student Doctor of Philosophy



Abstract

Generating 3D images of complex objects conditionally from a few 2D views is a difficult synthesis problem, compounded by issues such as domain gap and geometric misalignment. For instance, unified frameworks such as Generative Adversarial Networks cannot achieve this unless they explicitly define both a domain-invariant and geometric-invariant joint latent distribution, whereas Neural Radiance Fields are generally unable to handle both issues as they optimize at the pixel level. By contrast, we propose a simple and novel 2D to 3D synthesis approach based on conditional diffusion with vector-quantized codes. Operating in an information-rich code space enables high-resolution 3D synthesis via full-coverage attention across the views. Specifically, we generate the 3D codes, e.g. for CT images, conditional on previously generated 3D codes and the entire codebook of two 2D views (e.g. 2D X-rays). Qualitative and quantitative results demonstrate state-of-the-art performance over specialized methods across varied evaluation criteria, including fidelity metrics such as density and coverage and distortion metrics for two datasets of complex volumetric imagery found in real-world scenarios.
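As a rough illustration of what "vector-quantized codes" means in the abstract, the sketch below maps continuous feature vectors to the indices of their nearest codebook entries. It is not the authors' implementation: the function name `quantize`, the toy codebook, and the array shapes are all assumptions chosen for illustration, and the diffusion and Transformer stages described in the paper are not shown.

```python
import numpy as np

def quantize(features, codebook):
    """Return, for each feature vector, the index of its nearest codebook entry.

    features: array of shape (N, D); codebook: array of shape (K, D).
    Both are hypothetical stand-ins for encoder outputs and a learned codebook.
    """
    # Squared Euclidean distance between every feature and every code: shape (N, K)
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # The discrete code is the index of the closest codebook vector
    return dists.argmin(axis=1)

# Toy codebook (assumption): 8 codes of dimension 4, where code k is the constant vector k
codebook = np.arange(8, dtype=float)[:, None] * np.ones((1, 4))

# Toy features: slightly perturbed copies of codes 3, 0 and 5
features = codebook[[3, 0, 5]] + 0.1

codes = quantize(features, codebook)
print(codes)
```

In a setting like the one the abstract describes, a generative model would then operate on such discrete indices rather than on raw pixels or voxels.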

Citation

Corona-Figueroa, A., Bond-Taylor, S., Bhowmik, N., Gaus, Y. F. A., Breckon, T. P., Shum, H. P., & Willcocks, C. G. (2023). Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers. In ICCV '23: Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. https://doi.org/10.1109/ICCV51070.2023.01341

Conference Name ICCV23: 2023 IEEE/CVF International Conference on Computer Vision
Conference Location Paris, France
Start Date Oct 2, 2023
End Date Oct 6, 2023
Acceptance Date Jul 14, 2023
Online Publication Date Jan 15, 2024
Publication Date 2023
Deposit Date Aug 30, 2023
Publicly Available Date Dec 31, 2023
Publisher Institute of Electrical and Electronics Engineers
Series ISSN 1550-5499
Book Title ICCV '23: Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision
ISBN 9798350307191
DOI https://doi.org/10.1109/ICCV51070.2023.01341
Public URL https://durham-repository.worktribe.com/output/1726461
Publisher URL https://ieeexplore.ieee.org/xpl/conhome/1000149/all-proceedings

Files

Accepted Conference Paper (3.8 Mb)
PDF

Copyright Statement
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.



