Towards Measuring and Mitigating Hallucinations in Generative Image Super-Resolution

Speaker: Raghav Goyal, Senior Researcher, Samsung Research America (SRA), Toronto
Date: 13 August 2025

Generative super-resolution (GSR) currently sets the state of the art in perceptual image quality, overcoming the ‘regression-to-the-mean’ blur of earlier non-generative models. However, as judged by human viewers, such models do not yet strike an optimal balance between quality and fidelity. Instead, they introduce a different class of artifacts, in which generated details fail to perceptually match the low-resolution image (LRI) or the ground-truth image (GTI); this is a critical but under-studied issue in GSR that limits its practical deployment. In this talk, Raghav Goyal focuses on measuring, analysing, and mitigating these artifacts, referred to as ‘hallucinations’. First, hallucinations are analysed and shown to be poorly characterised by existing image metrics and quality models, since they are orthogonal to both exact fidelity and no-reference quality. Second, to measure hallucinations, the talk proposes leveraging a multimodal large language model (MLLM) that assesses hallucinated visual elements and produces a ‘hallucination score’ (HS) that aligns closely with human evaluations. Third, to mitigate hallucinations, certain deep feature distances are found to correlate strongly with HS, and these features are therefore used as differentiable reward functions to align GSR models.
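
To make the measurement and mitigation steps concrete, the two sketches below illustrate them under stated assumptions; they are not the implementation presented in the talk.

The first sketch shows one hypothetical way to obtain a hallucination score from an off-the-shelf MLLM. The model name (gpt-4o), the prompt wording, the 0–100 scale, and the helper names hallucination_score and _b64 are illustrative assumptions; the talk does not specify its querying protocol.

    # Hypothetical HS query against a general-purpose vision-capable MLLM
    # (OpenAI chat API shown only as an example); prompt and scale are assumptions.
    import base64
    from openai import OpenAI

    client = OpenAI()

    def _b64(path: str) -> str:
        # Read an image file and return its base64 encoding for the API payload.
        with open(path, "rb") as f:
            return base64.b64encode(f.read()).decode()

    def hallucination_score(sr_path: str, ref_path: str) -> float:
        prompt = ("Compare the first (super-resolved) image with the second (reference) image. "
                  "Rate hallucinated details from 0 (none) to 100 (severe). Reply with the number only.")
        resp = client.chat.completions.create(
            model="gpt-4o",  # assumed; any vision-capable MLLM could be substituted
            messages=[{"role": "user", "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{_b64(sr_path)}"}},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{_b64(ref_path)}"}},
            ]}],
        )
        return float(resp.choices[0].message.content.strip())

The second sketch outlines the mitigation step: fine-tuning a GSR model with a deep feature distance used as a differentiable reward. Here LPIPS stands in for whichever feature distance correlates with HS, and gsr_model denotes any differentiable SR network; both are assumptions made for illustration.

    # Minimal alignment step (PyTorch): penalise feature-space deviation of the
    # super-resolved output from the reference, treating the (negative) deep
    # feature distance as a differentiable reward signal.
    import lpips  # pip install lpips; expects inputs scaled to [-1, 1]

    feature_distance = lpips.LPIPS(net="vgg")  # stand-in for the HS-correlated feature distance

    def alignment_step(gsr_model, optimizer, lr_img, hr_ref, reward_weight=1.0):
        sr_img = gsr_model(lr_img)                              # generate the high-resolution output
        reward_loss = feature_distance(sr_img, hr_ref).mean()   # smaller distance -> higher reward
        loss = reward_weight * reward_loss                      # in practice combined with the usual GSR losses
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

Because the feature distance is differentiable, it can back-propagate directly through the generator, which the MLLM-based HS itself cannot; this is presumably why such distances are used as the reward rather than HS.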