A scalable multi-modal learning fruit detection algorithm for dynamic environments
Introduction: To enhance the detection of litchi fruits in natural scenes and address challenges such as dense occlusion and small target identification, this paper proposes a novel multimodal target detection method, denoted as YOLOv5-Litchi. Methods: Initially, the Neck layer network of YOLOv5s is simplif...
Main Authors: | Liang Mao, Zihao Guo, Mingzhe Liu, Yue Li, Linlin Wang, Jie Li |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-02-01 |
Series: | Frontiers in Neurorobotics |
Subjects: | multi-modal learning, machine learning, fruit recognition, deep learning, objective detection |
Online Access: | https://www.frontiersin.org/articles/10.3389/fnbot.2024.1518878/full |
_version_ | 1825206755224715264 |
---|---|
author | Liang Mao, Zihao Guo, Mingzhe Liu, Yue Li, Linlin Wang, Jie Li |
author_sort | Liang Mao |
collection | DOAJ |
description | Introduction: To enhance the detection of litchi fruits in natural scenes and address challenges such as dense occlusion and small target identification, this paper proposes a novel multimodal target detection method, denoted as YOLOv5-Litchi. Methods: First, the Neck layer network of YOLOv5s is simplified by changing its FPN+PAN structure to an FPN structure, and the number of detection heads is increased from 3 to 5. The detection heads with resolutions of 80 × 80 pixels and 160 × 160 pixels are replaced by TSCD detection heads to enhance the model's ability to detect small targets. Next, the positioning loss function is replaced with the EIoU loss function, and the confidence loss is replaced with VFLoss, to further improve the accuracy of the detection bounding box and reduce the missed detection rate for occluded targets. Finally, a sliding slice method is employed to predict image targets, thereby reducing the miss rate of small targets. Results: Experimental results demonstrate that the proposed model improves accuracy, recall, and mean average precision (mAP) by 9.5, 0.9, and 12.3 percentage points, respectively, compared to the original YOLOv5s model. When benchmarked against other models such as YOLOX, YOLOv6, and YOLOv8, the proposed model's AP value is higher by 4.0, 6.3, and 3.7 percentage points, respectively. Discussion: The improved network shows clear gains, chiefly in recall rate and AP value; it misses fewer targets and produces more accurate prediction boxes, indicating its suitability for litchi fruit detection. This method therefore significantly enhances the detection accuracy of mature litchi fruits and effectively addresses the challenges of dense occlusion and small target detection, providing crucial technical support for subsequent litchi yield estimation. |
format | Article |
id | doaj-art-db8d3ef546a349f0a13be41741363a78 |
institution | Kabale University |
issn | 1662-5218 |
language | English |
publishDate | 2025-02-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Neurorobotics |
spelling | doaj-art-db8d3ef546a349f0a13be41741363a78; 2025-02-07T06:49:29Z; eng; Frontiers Media S.A.; Frontiers in Neurorobotics; 1662-5218; 2025-02-01; 18; 10.3389/fnbot.2024.1518878; 1518878; A scalable multi-modal learning fruit detection algorithm for dynamic environments; Liang Mao [0], Liang Mao [1], Zihao Guo [2], Mingzhe Liu [3], Yue Li [4], Linlin Wang [5], Jie Li [6]; [0] Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence Application Technology Research Institute, Shenzhen Polytechnic University, Shenzhen, China; [1] School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China; [2] Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence Application Technology Research Institute, Shenzhen Polytechnic University, Shenzhen, China; [3] School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China; [4] School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China; [5] Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence Application Technology Research Institute, Shenzhen Polytechnic University, Shenzhen, China; [6] Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence Application Technology Research Institute, Shenzhen Polytechnic University, Shenzhen, China; https://www.frontiersin.org/articles/10.3389/fnbot.2024.1518878/full; multi-modal learning; machine learning; fruit recognition; deep learning; objective detection |
title | A scalable multi-modal learning fruit detection algorithm for dynamic environments |
title_sort | scalable multi modal learning fruit detection algorithm for dynamic environments |
topic | multi-modal learning; machine learning; fruit recognition; deep learning; objective detection |
url | https://www.frontiersin.org/articles/10.3389/fnbot.2024.1518878/full |
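
The abstract in this record states that the positioning loss is replaced with the EIoU loss. As a rough illustration only, the sketch below computes the commonly published form of EIoU for a single pair of axis-aligned boxes in plain Python. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for this example; this is not the authors' implementation, which is not included in the record.

```python
def eiou_loss(box, gt, eps=1e-9):
    """Illustrative EIoU loss for one predicted box and one ground-truth box.

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1 (assumed layout).
    Loss = 1 - IoU + center-distance term + width term + height term.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (x1 + x2) / 2 - (gx1 + gx2) / 2
    dy = (y1 + y2) / 2 - (gy1 + gy2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, each normalized by the
    # corresponding side of the enclosing box.
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

Unlike a plain IoU loss, this form still gives a useful signal when the boxes do not overlap, because the center, width, and height terms remain nonzero; that is consistent with the abstract's stated aim of more accurate bounding boxes for occluded targets.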