An IoT-enhanced automatic music composition system integrating audio-visual learning with transformer and SketchVAE
With the rapid development of artificial intelligence and the Internet of Things technology, the automatic music composition system has become a hot topic of research. This paper presents the TransVAE-Music composition system to achieve efficient multimodal data perception and fusion. Through the introduction of the Internet of Things technology, the system can collect and process audio, video and other data in real time, and improve the diversity and artistry of music generation. At the same time, a Bayesian optimization mechanism is used to finely tune the system's hyperparameters to further improve model performance. Experimental results show that TransVAE-Music achieves reconstruction errors of 1.10 and 1.12 on the POP909 and FMA datasets, respectively, significantly outperforming other mainstream automatic music generation models. In addition, the model reached 4.8 and 4.9 in perceived quality score (PQS), and 4.4 and 4.5 in user satisfaction score (USS), respectively. These results demonstrate that the proposed system has significant advantages in both the accuracy of music generation and the user experience. This study not only provides an effective method for automatic music generation, but also offers important references for future studies on multimodal data fusion and high-quality music generation.
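The abstract's headline metric is reconstruction error (1.10 on POP909, 1.12 on FMA), but the record does not say how that error is defined. A common choice for VAE-style models is mean squared error between the input features and their reconstruction; the sketch below assumes that definition, and the function name and toy shapes are hypothetical:

```python
def reconstruction_error(x, x_hat):
    """Mean squared error between an input batch and its reconstruction.

    x and x_hat are equal-shaped 2-D lists of floats:
    one row per feature frame (e.g., a spectrogram slice).
    """
    diffs = [(a - b) ** 2
             for row_x, row_h in zip(x, x_hat)
             for a, b in zip(row_x, row_h)]
    return sum(diffs) / len(diffs)

# Toy check: a reconstruction that is off by 1.0 in every dimension
x = [[0.0] * 8 for _ in range(4)]       # 4 frames, 8 features each
x_hat = [[1.0] * 8 for _ in range(4)]
print(reconstruction_error(x, x_hat))   # prints 1.0
```

A lower value means the decoder reproduces the input more faithfully; the paper's reported 1.10/1.12 figures are only comparable across models under whatever feature scaling the authors actually used.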
Main Author: | Yifei Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-02-01 |
Series: | Alexandria Engineering Journal |
Subjects: | Automatic music composition; Music generation; Deep learning; Audio–visual learning; Internet of things (IoT); Multimodal perception |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1110016824012808 |
_version_ | 1825206918313934848 |
---|---|
author | Yifei Zhang |
author_facet | Yifei Zhang |
author_sort | Yifei Zhang |
collection | DOAJ |
description | With the rapid development of artificial intelligence and the Internet of Things technology, the automatic music composition system has become a hot topic of research. This paper presents the TransVAE-Music composition system to achieve efficient multimodal data perception and fusion. Through the introduction of the Internet of Things technology, the system can collect and process audio, video and other data in real time, and improve the diversity and artistry of music generation. At the same time, the Bayesian optimization mechanism is used to finely adjust the hyperparameters in the system to further improve the model performance. Experimental results show that TransVAE-Music has 1.10 and 1.12 reconstruction errors on the POP909 and FMA datasets, respectively, which significantly outperforms other mainstream automatic music generation models. In addition, the model reached 4.8 and 4.9 in perceived quality score (PQS), and 4.4 and 4.5 in user satisfaction score (USS), respectively. These results demonstrate that the proposed system has significant advantages in terms of the accuracy of music generation and the user experience. This study not only provides an effective method for automatic music generation, but also provides important references for future studies on multimodal data fusion and high-quality music generation. |
format | Article |
id | doaj-art-012bd4a29a454aeb82565765c197e67c |
institution | Kabale University |
issn | 1110-0168 |
language | English |
publishDate | 2025-02-01 |
publisher | Elsevier |
record_format | Article |
series | Alexandria Engineering Journal |
spelling | doaj-art-012bd4a29a454aeb82565765c197e67c; 2025-02-07T04:46:57Z; eng; Elsevier; Alexandria Engineering Journal; ISSN 1110-0168; 2025-02-01; Vol. 113, pp. 378–390; "An IoT-enhanced automatic music composition system integrating audio-visual learning with transformer and SketchVAE"; Yifei Zhang, Master’s student, Department of Composition and Conducting, Shanghai Conservatory of Music, 200031, Shanghai, China; abstract, online-access URL, and subject terms as given above. |
spellingShingle | Yifei Zhang; An IoT-enhanced automatic music composition system integrating audio-visual learning with transformer and SketchVAE; Alexandria Engineering Journal; Automatic music composition; Music generation; Deep learning; Audio–visual learning; Internet of things (IoT); Multimodal perception |
title | An IoT-enhanced automatic music composition system integrating audio-visual learning with transformer and SketchVAE |
title_full | An IoT-enhanced automatic music composition system integrating audio-visual learning with transformer and SketchVAE |
title_fullStr | An IoT-enhanced automatic music composition system integrating audio-visual learning with transformer and SketchVAE |
title_full_unstemmed | An IoT-enhanced automatic music composition system integrating audio-visual learning with transformer and SketchVAE |
title_short | An IoT-enhanced automatic music composition system integrating audio-visual learning with transformer and SketchVAE |
title_sort | iot enhanced automatic music composition system integrating audio visual learning with transformer and sketchvae |
topic | Automatic music composition; Music generation; Deep learning; Audio–visual learning; Internet of things (IoT); Multimodal perception |
url | http://www.sciencedirect.com/science/article/pii/S1110016824012808 |
work_keys_str_mv | AT yifeizhang aniotenhancedautomaticmusiccompositionsystemintegratingaudiovisuallearningwithtransformerandsketchvae AT yifeizhang iotenhancedautomaticmusiccompositionsystemintegratingaudiovisuallearningwithtransformerandsketchvae |