Research Repository

Outputs (430)

Is Unimodal Bias Always Bad for Visual Question Answering? A Medical Domain Study with Dynamic Attention (2022)
Presentation / Conference Contribution

Medical visual question answering (Med-VQA) is to answer medical questions based on clinical images provided. This field is still in its infancy due to the complexity of the trio formed of questions, multimodal features and expert knowledge. In this...

Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes (2022)
Presentation / Conference Contribution

Whilst diffusion probabilistic models can generate high quality image content, key limitations remain in terms of both generating high-resolution imagery and their associated high computational requirements. Recent Vector-Quantized image models have...

STIT: Spatio-Temporal Interaction Transformers for Human-Object Interaction Recognition in Videos (2022)
Presentation / Conference Contribution

Recognizing human-object interactions is challenging due to their spatio-temporal changes. We propose the SpatioTemporal Interaction Transformer-based (STIT) network to reason such changes. Specifically, spatial transformers learn humans and objects...

A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip (2022)
Presentation / Conference Contribution

A cleft lip is a congenital abnormality requiring surgical repair by a specialist. The surgeon must have extensive experience and theoretical knowledge to perform surgery, and an Artificial Intelligence (AI) method has been proposed to guide surgeons in...

Detecting Melanoma Fairly: Skin Tone Detection and Debiasing for Skin Lesion Classification (2022)
Presentation / Conference Contribution

Convolutional Neural Networks have demonstrated human-level performance in the classification of melanoma and other skin lesions, but evident performance disparities between differing skin tones should be addressed before widespread deployment. In th...