Image Quality Assessment Based on Multi-Scale Representation and Shifting Transformer
Main Authors: | Geng Fu, Ziyu Wang, Cuijuan Zhang, Zerong Qi, Mingzheng Hu, Shujun Fu, Yunfeng Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Image quality assessment, multi-scale, no-reference, spatial pooling, shifted window, transformer |
Online Access: | https://ieeexplore.ieee.org/document/10845785/ |
_version_ | 1823859606333423616 |
---|---|
author | Geng Fu, Ziyu Wang, Cuijuan Zhang, Zerong Qi, Mingzheng Hu, Shujun Fu, Yunfeng Zhang |
collection | DOAJ |
description | In automatic control systems, sensors and cameras are often used to capture images of the environment or processes being monitored. The quality of these images is paramount as it directly affects the system’s ability to accurately interpret and respond to the visual information. Image Quality Assessment (IQA) is a crucial metric for intelligent control systems and computer vision tasks, such as surveillance, restoration, and fingerprint identification, significantly advancing algorithm development in these areas. Recently, transformer-based algorithms have excelled in computer vision, particularly in image classification, surpassing convolutional neural network (CNN) methods. To enhance IQA using transformers, we propose Swin-MIQT, a multi-scale spatial pooling transformer with shifted windows. As a no-reference (NR) IQA method, Swin-MIQT processes images at their original resolution without resizing or cropping, unlike standard vision transformers. By using shifted windows, we reduce computational load through efficient self-attention processing. Additionally, a spatial pyramid pooling layer captures diverse image quality information, improving IQA accuracy for distorted images. Comprehensive experiments show that Swin-MIQT achieves state-of-the-art performance on three synthetic distortion databases (LIVE, LIVE MD, TID2013) and competitive results on three authentic distortion databases (LIVE Challenge, KonIQ-10K, SPAQ). The outstanding performance demonstrates that Swin-MIQT possesses robust learning and generalization capabilities across all referenced distorted databases. |
format | Article |
id | doaj-art-4bea3fd11064496fb361d77f9e779bd0 |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-4bea3fd11064496fb361d77f9e779bd0; 2025-02-11T00:01:11Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 24276-24286; DOI 10.1109/ACCESS.2025.3531416; IEEE Xplore document 10845785; https://ieeexplore.ieee.org/document/10845785/. Authors and affiliations, in record order: Geng Fu (https://orcid.org/0009-0004-8348-1598), School of Computing and Artificial Intelligence, Shandong University of Finance and Economics, Jinan, China; Ziyu Wang, Department of Interventional Therapy, Yidu Central Hospital of Weifang, Qingzhou, China; Cuijuan Zhang, Department of Interventional Therapy, Yidu Central Hospital of Weifang, Qingzhou, China; Zerong Qi, Shandong Chengshi Electronic Technology Company Ltd., Jinan, China; Mingzheng Hu, Shandong Chengshi Electronic Technology Company Ltd., Jinan, China; Shujun Fu, School of Mathematics, Shandong University, Jinan, China; Yunfeng Zhang (https://orcid.org/0000-0002-1237-6035), School of Computing and Artificial Intelligence, Shandong University of Finance and Economics, Jinan, China |
title | Image Quality Assessment Based on Multi-Scale Representation and Shifting Transformer |
topic | Image quality assessment; multi-scale; no-reference; spatial pooling; shifted window; transformer |
url | https://ieeexplore.ieee.org/document/10845785/ |