Image Quality Assessment Based on Multi-Scale Representation and Shifting Transformer

Bibliographic Details
Main Authors: Geng Fu, Ziyu Wang, Cuijuan Zhang, Zerong Qi, Mingzheng Hu, Shujun Fu, Yunfeng Zhang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10845785/
Description
Summary: In automatic control systems, sensors and cameras are often used to capture images of the environment or processes being monitored. The quality of these images is paramount because it directly affects the system’s ability to interpret and respond to the visual information accurately. Image Quality Assessment (IQA) provides a crucial metric for intelligent control systems and computer vision tasks such as surveillance, restoration, and fingerprint identification, and it supports algorithm development in these areas. Recently, transformer-based algorithms have excelled in computer vision, particularly in image classification, where they have surpassed convolutional neural network (CNN) methods. To enhance IQA using transformers, we propose Swin-MIQT, a multi-scale spatial pooling transformer with shifted windows. As a no-reference (NR) IQA method, Swin-MIQT processes images at their original resolution without resizing or cropping, unlike standard vision transformers. Shifted windows reduce the computational load by restricting self-attention to local windows. Additionally, a spatial pyramid pooling layer captures image quality information at multiple scales, improving IQA accuracy for distorted images. Comprehensive experiments show that Swin-MIQT achieves state-of-the-art performance on three synthetic distortion databases (LIVE, LIVE MD, TID2013) and competitive results on three authentic distortion databases (LIVE Challenge, KonIQ-10K, SPAQ). This performance demonstrates that Swin-MIQT has robust learning and generalization capabilities across all six distortion databases.
ISSN: 2169-3536
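
Illustrative note: the summary states that Swin-MIQT handles images at their original resolution and uses a spatial pyramid pooling layer. The sketch below is not the authors' implementation; it only illustrates the general idea of spatial pyramid pooling, in which a feature map of any size is reduced to a vector of fixed length. The function name, the grid levels (1, 2, 4), and the choice of max pooling are assumptions made for illustration and are not taken from the record.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (list of rows) into a fixed-length list.

    `levels` are hypothetical grid sizes: level n splits the map into
    n x n bins, so the output length is sum(n * n for n in levels)
    regardless of the input height and width.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row span of bin i: floor for the start, ceiling for the end,
            # so the bins cover the whole map and are never empty.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(
                    max(feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1))
                )
    return pooled


# Two feature maps of different sizes yield vectors of the same length.
small = [[float(r * 5 + c) for c in range(5)] for r in range(4)]   # 4 x 5
large = [[float(r * 7 + c) for c in range(7)] for r in range(9)]   # 9 x 7
print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))
```

Because the output length depends only on the chosen levels, a fixed-size prediction head can follow the pooling step even when input images differ in resolution, which is consistent with the no-resizing claim in the summary.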