Chris Chen shuang.chen@durham.ac.uk
Post Doctoral Research Associate
Haozheng Zhang haozheng.zhang@durham.ac.uk
PGR Student Doctor of Philosophy
Dr Amir Atapour-Abarghouei amir.atapour-abarghouei@durham.ac.uk
Assistant Professor
Professor Hubert Shum hubert.shum@durham.ac.uk
Professor
Image inpainting aims to repair a partially damaged image based on the information from known regions of the image. Achieving semantically plausible inpainting results is particularly challenging because it requires the reconstructed regions to exhibit patterns similar to those of the semantically consistent regions. This requires a model with a strong capacity to capture long-range dependencies (LRDs). Existing models struggle in this regard: Convolutional Neural Network (CNN)-based methods suffer from the slow growth of the receptive field, and Transformer-based methods rely on patch-level interactions, both of which are ineffective for capturing LRDs. Motivated by this, we propose SEM-Net, a novel visual State Space Model (SSM) network that models corrupted images at the pixel level while capturing LRDs in state space, achieving linear computational complexity. To address the inherent lack of spatial awareness in SSMs, we introduce the Snake Mamba Block (SMB) and the Spatially-Enhanced Feed-forward Network. These innovations enable SEM-Net to outperform state-of-the-art inpainting methods on two distinct datasets, showing significant improvements in capturing LRDs and enhanced spatial consistency. Additionally, SEM-Net achieves state-of-the-art performance on motion deblurring, demonstrating its generalizability. Our source code is available at https://github.com/ChrisChenl023/SEM-Net.
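As an illustration of the two ideas the abstract combines (pixel-level sequence modelling with a linear-time state space recurrence, and a scan order that preserves spatial adjacency), the following is a minimal sketch, not the SEM-Net implementation. It assumes that "snake" refers to a boustrophedon traversal (alternating left-to-right and right-to-left rows), which the abstract does not spell out. The function names `snake_order` and `ssm_scan` and the scalar parameters `a`, `b`, `c` are hypothetical and chosen only for this example.

```python
def snake_order(height, width):
    """Return (row, col) pairs in boustrophedon order.

    Even rows are visited left to right, odd rows right to left, so
    consecutive positions in the sequence are always neighbouring pixels.
    """
    order = []
    for r in range(height):
        cols = range(width) if r % 2 == 0 else range(width - 1, -1, -1)
        order.extend((r, c) for c in cols)
    return order


def ssm_scan(sequence, a=0.5, b=1.0, c=1.0):
    """Scalar linear state space recurrence over a 1D sequence.

    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t
    A single pass over the input, so the cost is linear in its length.
    """
    h = 0.0
    outputs = []
    for x in sequence:
        h = a * h + b * x
        outputs.append(c * h)
    return outputs


# Toy 2x3 "image" with a single bright pixel in the top-left corner.
image = [
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
order = snake_order(2, 3)
pixels = [image[r][c] for r, c in order]
outputs = ssm_scan(pixels)

# Every step of the snake traversal moves to a directly adjacent pixel.
adjacent = all(
    abs(r1 - r2) + abs(c1 - c2) == 1
    for (r1, c1), (r2, c2) in zip(order, order[1:])
)
```

In this toy setting the influence of the first pixel decays geometrically along the sequence but never vanishes, which is one simple way to see how a state space recurrence can carry information over long ranges at linear cost. A plain raster scan would instead jump from the end of one row to the start of the next, breaking spatial adjacency at every row boundary.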
Chen, S., Zhang, H., Atapour-Abarghouei, A., & Shum, H. P. H. (2025, February). SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM. Presented at 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, Arizona
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) |
Start Date | Feb 26, 2025 |
End Date | Mar 6, 2025 |
Acceptance Date | Oct 28, 2024 |
Online Publication Date | Apr 8, 2025 |
Publication Date | Apr 8, 2025 |
Deposit Date | Nov 11, 2024 |
Publicly Available Date | Apr 8, 2025 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Pages | 461-471 |
Series ISSN | 2472-6737 |
Book Title | Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision |
DOI | https://doi.org/10.1109/WACV61041.2025.00055 |
Public URL | https://durham-repository.worktribe.com/output/3091371 |
Accepted Conference Paper (PDF, 11 Mb)
One-Index Vector Quantization Based Adversarial Attack on Image Classification
(2024)
Journal Article
Depth-Aware Endoscopic Video Inpainting
(2024)
Presentation / Conference Contribution
INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network
(2023)
Journal Article
A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip
(2022)
Presentation / Conference Contribution
HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention
(2024)
Journal Article