Abril Corona Figueroa abril.corona-figueroa@durham.ac.uk
PGR Student Doctor of Philosophy
Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers
Corona-Figueroa, Abril; Bond-Taylor, Sam; Bhowmik, Neelanjan; Gaus, Yona Falinie A.; Breckon, Toby P.; Shum, Hubert P.H.; Willcocks, Chris G.
Authors
Samuel Bond-Taylor samuel.e.bond-taylor@durham.ac.uk
PGR Student Doctor of Philosophy
Dr Neelanjan Bhowmik neelanjan.bhowmik@durham.ac.uk
Post Doctoral Research Associate
Yona Binti Abd Gaus yona.f.binti-abd-gaus@durham.ac.uk
Post Doctoral Research Associate
Professor Toby Breckon toby.breckon@durham.ac.uk
Professor
Professor Hubert Shum hubert.shum@durham.ac.uk
Professor
Dr Chris Willcocks christopher.g.willcocks@durham.ac.uk
Associate Professor
Abstract
Generating 3D images of complex objects conditionally from a few 2D views is a difficult synthesis problem, compounded by issues such as domain gap and geometric misalignment. For instance, unified frameworks such as Generative Adversarial Networks cannot achieve this unless they explicitly define both a domain-invariant and geometric-invariant joint latent distribution, whereas Neural Radiance Fields are generally unable to handle both issues as they optimize at the pixel level. By contrast, we propose a simple and novel 2D to 3D synthesis approach based on conditional diffusion with vector-quantized codes. Operating in an information-rich code space enables high-resolution 3D synthesis via full-coverage attention across the views. Specifically, we generate the 3D codes (e.g. for CT images) conditional on previously generated 3D codes and the entire codebook of two 2D views (e.g. 2D X-rays). Qualitative and quantitative results demonstrate state-of-the-art performance over specialized methods across varied evaluation criteria, including fidelity metrics such as density and coverage, and distortion metrics, on two datasets of complex volumetric imagery found in real-world scenarios.
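The generation step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the diffusion process, so a mask-and-unmask (absorbing-state) schedule is assumed here purely for illustration, and the conditional Transformer is replaced by a random stand-in. All names (`quantize`, `toy_denoiser`, `sample_3d_codes`, `MASK`) and all sizes are hypothetical.

```python
import numpy as np

MASK = -1  # hypothetical marker for a 3D code position not yet generated


def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    # Squared Euclidean distance between every feature and every code vector.
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


def toy_denoiser(codes_3d, view_codes, num_codes, rng):
    """Stand-in for a trained conditional Transformer (assumption).

    Returns per-position probabilities over the 3D codebook. The logits are
    random, plus a small toy bias derived from the 2D view codes so that the
    conditioning input is actually used.
    """
    logits = rng.normal(size=(codes_3d.size, num_codes))
    bias = np.bincount(view_codes % num_codes, minlength=num_codes)
    logits = logits + 0.1 * bias[None, :]
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    return probs / probs.sum(axis=1, keepdims=True)


def sample_3d_codes(view_codes, length, num_codes, steps, rng):
    """Start fully masked; at each step fill in a share of the masked
    positions, conditioning on codes generated so far and on the view codes."""
    codes = np.full(length, MASK)
    for step in range(steps):
        masked = np.flatnonzero(codes == MASK)
        if masked.size == 0:
            break
        # Spread the remaining masked positions evenly over the steps left.
        k = int(np.ceil(masked.size / (steps - step)))
        chosen = rng.choice(masked, size=k, replace=False)
        probs = toy_denoiser(codes, view_codes, num_codes, rng)
        for pos in chosen:
            codes[pos] = rng.choice(num_codes, p=probs[pos])
    return codes


rng = np.random.default_rng(0)
codebook_2d = rng.normal(size=(16, 4))           # toy 2D codebook
view_a = quantize(rng.normal(size=(8, 4)), codebook_2d)
view_b = quantize(rng.normal(size=(8, 4)), codebook_2d)
view_codes = np.concatenate([view_a, view_b])    # codes from the two 2D views
codes_3d = sample_3d_codes(view_codes, length=27, num_codes=32, steps=5, rng=rng)
```

In a real system the resulting code indices would be passed to a decoder that maps them back to a 3D volume; that stage is omitted here.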
Citation
Corona-Figueroa, A., Bond-Taylor, S., Bhowmik, N., Gaus, Y. F. A., Breckon, T. P., Shum, H. P., & Willcocks, C. G. (2023, October). Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers. Presented at ICCV23: 2023 IEEE/CVF International Conference on Computer Vision, Paris, France
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | ICCV23: 2023 IEEE/CVF International Conference on Computer Vision |
Start Date | Oct 2, 2023 |
End Date | Oct 6, 2023 |
Acceptance Date | Jul 14, 2023 |
Online Publication Date | Jan 15, 2024 |
Publication Date | 2023 |
Deposit Date | Aug 30, 2023 |
Publicly Available Date | Dec 31, 2023 |
Publisher | Institute of Electrical and Electronics Engineers |
Series ISSN | 1550-5499 |
Book Title | ICCV '23: Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision |
ISBN | 9798350307191 |
DOI | https://doi.org/10.1109/ICCV51070.2023.01341 |
Public URL | https://durham-repository.worktribe.com/output/1726461 |
Publisher URL | https://ieeexplore.ieee.org/xpl/conhome/1000149/all-proceedings |
Files
Accepted Conference Paper (PDF, 3.8 MB)
Copyright Statement
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.