Chris Chen shuang.chen@durham.ac.uk
Postdoctoral Research Associate
Dr Amir Atapour-Abarghouei amir.atapour-abarghouei@durham.ac.uk
Assistant Professor
Haozheng Zhang haozheng.zhang@durham.ac.uk
PGR Student, Doctor of Philosophy
Professor Hubert Shum hubert.shum@durham.ac.uk
Professor
Image inpainting, or image completion, is a crucial task in computer vision that aims to restore missing or damaged regions of images with semantically coherent content. This technique requires a precise balance of local texture replication and global contextual understanding to ensure the restored image integrates seamlessly with its surroundings. Traditional methods using Convolutional Neural Networks (CNNs) are effective at capturing local patterns but often struggle with broader contextual relationships due to their limited receptive fields. Recent advancements have incorporated transformers, leveraging their ability to model global interactions. However, these methods face computational inefficiencies and struggle to maintain fine-grained details. To overcome these challenges, we introduce MxT, composed of the proposed Hybrid Module (HM), which combines Mamba with the transformer in a synergistic manner. Mamba is adept at efficiently processing long sequences with linear computational cost, making it an ideal complement to the transformer for handling long-range data interactions. Our HM facilitates dual-level interaction learning at both pixel and patch levels, greatly enhancing the model's ability to reconstruct images with high quality and contextual accuracy. We evaluate MxT on the widely used CelebA-HQ and Places2-standard datasets, where it consistently outperforms existing state-of-the-art methods. The code will be released: https://github.com/ChrisChen1023/MxT.
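The released MxT implementation is only linked above, so the following is a minimal, hypothetical PyTorch sketch of how a hybrid "Mamba x Transformer" block of the kind the abstract describes could be structured: a long-sequence mixing branch (stood in here by a gated depthwise 1-D convolution, since the actual selective state-space layer is not part of this record) combined with patch-level multi-head self-attention. All names (`HybridBlock`, `seq_mixer`, etc.) and design details are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn


class HybridBlock(nn.Module):
    """Hypothetical hybrid block: sequence mixing + patch-level attention.

    The Mamba branch is approximated by a gated depthwise 1-D convolution
    over the token sequence (linear in sequence length); the transformer
    branch uses standard multi-head self-attention over patch tokens.
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Stand-in for a Mamba/state-space layer: depthwise conv + gating
        self.norm1 = nn.LayerNorm(dim)
        self.seq_mixer = nn.Conv1d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.gate = nn.Linear(dim, dim)
        # Transformer branch: global patch-level self-attention
        self.norm2 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Feed-forward refinement after fusing the two branches
        self.norm3 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, dim)
        # Long-sequence mixing branch (Mamba stand-in), gated residual update
        x = self.norm1(tokens)
        mixed = self.seq_mixer(x.transpose(1, 2)).transpose(1, 2)
        tokens = tokens + torch.sigmoid(self.gate(x)) * mixed
        # Patch-level global interaction via self-attention
        y = self.norm2(tokens)
        attn_out, _ = self.attn(y, y, y)
        tokens = tokens + attn_out
        # Fuse and refine
        return tokens + self.ffn(self.norm3(tokens))


if __name__ == "__main__":
    block = HybridBlock(dim=64)
    patches = torch.randn(2, 256, 64)  # e.g. a 16x16 grid of 64-dim patch tokens
    print(block(patches).shape)  # torch.Size([2, 256, 64])
```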
Chen, S., Atapour-Abarghouei, A., Zhang, H., & Shum, H. P. H. (2024, November). MxT: Mamba x Transformer for Image Inpainting. Presented at BMVC 2024: The 35th British Machine Vision Conference, Glasgow, UK
| Presentation Conference Type | Conference Paper (published) |
|---|---|
| Conference Name | BMVC 2024: The 35th British Machine Vision Conference |
| Start Date | Nov 25, 2024 |
| End Date | Nov 28, 2024 |
| Acceptance Date | Aug 2, 2024 |
| Publication Date | 2024 |
| Deposit Date | Aug 5, 2024 |
| Publicly Available Date | Dec 31, 2024 |
| Peer Reviewed | Peer Reviewed |
| Book Title | Proceedings of the 2024 British Machine Vision Conference |
| Public URL | https://durham-repository.worktribe.com/output/2740846 |
| Publisher URL | https://bmvc2024.org/proceedings/295/ |
Accepted Conference Paper (PDF, 9.5 MB)
One-Index Vector Quantization Based Adversarial Attack on Image Classification (2024), Journal Article
Depth-Aware Endoscopic Video Inpainting (2024), Presentation / Conference Contribution
INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network (2023), Journal Article
A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip (2022), Presentation / Conference Contribution
HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention (2024), Journal Article