Image Quality Assessment Based on Multi-Scale Representation and Shifting Transformer

In automatic control systems, sensors and cameras are often used to capture images of the environment or processes being monitored. The quality of these images is paramount as it directly affects the system’s ability to accurately interpret and respond to the visual information. Image Qua...


Bibliographic Details
Main Authors: Geng Fu, Ziyu Wang, Cuijuan Zhang, Zerong Qi, Mingzheng Hu, Shujun Fu, Yunfeng Zhang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Image quality assessment; multi-scale; no-reference; spatial pooling; shifted window; transformer
Online Access: https://ieeexplore.ieee.org/document/10845785/
author Geng Fu
Ziyu Wang
Cuijuan Zhang
Zerong Qi
Mingzheng Hu
Shujun Fu
Yunfeng Zhang
collection DOAJ
description In automatic control systems, sensors and cameras are often used to capture images of the environment or processes being monitored. The quality of these images is paramount as it directly affects the system’s ability to accurately interpret and respond to the visual information. Image Quality Assessment (IQA) is a crucial metric for intelligent control systems and computer vision tasks, such as surveillance, restoration, and fingerprint identification, significantly advancing algorithm development in these areas. Recently, transformer-based algorithms have excelled in computer vision, particularly in image classification, surpassing convolutional neural network (CNN) methods. To enhance IQA using transformers, we propose Swin-MIQT, a multi-scale spatial pooling transformer with shifted windows. As a no-reference (NR) IQA method, Swin-MIQT processes images at their original resolution without resizing or cropping, unlike standard vision transformers. By using shifted windows, we reduce computational load through efficient self-attention processing. Additionally, a spatial pyramid pooling layer captures diverse image quality information, improving IQA accuracy for distorted images. Comprehensive experiments show that Swin-MIQT achieves state-of-the-art performance on three synthetic distortion databases (LIVE, LIVE MD, TID2013) and competitive results on three authentic distortion databases (LIVE Challenge, KonIQ-10K, SPAQ). This performance demonstrates that Swin-MIQT possesses robust learning and generalization capabilities across all six distortion databases.
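The description credits a spatial pyramid pooling layer with letting the model accept images at their original resolution. The following is a minimal sketch of that general idea only, not the authors' implementation: the record does not give Swin-MIQT's layer configuration, so the pyramid levels (1, 2, 4), the use of max pooling, the single-channel input, and the function name `spatial_pyramid_pool` are all assumptions made for illustration.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map into a fixed-length vector.

    feature_map: list of rows (each a list of floats), height and width >= 1.
    levels: assumed grid sizes; level n splits the map into n x n bins.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end keep every bin
            # non-empty, even when the map is smaller than the grid.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Two feature maps of different resolution yield vectors of equal length.
small = [[float(r * 5 + c) for c in range(5)] for r in range(3)]
large = [[float(r * 9 + c) for c in range(9)] for r in range(7)]
vec_small = spatial_pyramid_pool(small)
vec_large = spatial_pyramid_pool(large)
print(len(vec_small), len(vec_large))
```

Because the output length depends only on the chosen levels (here 1 + 4 + 16 = 21 values per channel), a fixed-size regression head can follow the pooling step regardless of the input image size.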
format Article
id doaj-art-4bea3fd11064496fb361d77f9e779bd0
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-4bea3fd11064496fb361d77f9e779bd0
record updated: 2025-02-11T00:01:11Z
language: eng
publisher: IEEE
series: IEEE Access (ISSN 2169-3536)
published: 2025-01-01, vol. 13, pp. 24276-24286
DOI: 10.1109/ACCESS.2025.3531416
IEEE document number: 10845785
title: Image Quality Assessment Based on Multi-Scale Representation and Shifting Transformer
authors and affiliations:
Geng Fu (https://orcid.org/0009-0004-8348-1598), School of Computing and Artificial Intelligence, Shandong University of Finance and Economics, Jinan, China
Ziyu Wang, Department of Interventional Therapy, Yidu Central Hospital of Weifang, Qingzhou, China
Cuijuan Zhang, Department of Interventional Therapy, Yidu Central Hospital of Weifang, Qingzhou, China
Zerong Qi, Shandong Chengshi Electronic Technology Company Ltd., Jinan, China
Mingzheng Hu, Shandong Chengshi Electronic Technology Company Ltd., Jinan, China
Shujun Fu, School of Mathematics, Shandong University, Jinan, China
Yunfeng Zhang (https://orcid.org/0000-0002-1237-6035), School of Computing and Artificial Intelligence, Shandong University of Finance and Economics, Jinan, China
title Image Quality Assessment Based on Multi-Scale Representation and Shifting Transformer
topic Image quality assessment
multi-scale
no-reference
spatial pooling
shifted window
transformer
url https://ieeexplore.ieee.org/document/10845785/