LLM-Guided Crowdsourced Test Report Clustering
This paper proposes a clustering method for crowdsourced test reports based on a large language model to solve the limitations of existing methods in processing repeated reports and utilizing multi-modal information. Existing crowdsourced test report clustering methods have significant shortcomings...
Saved in:
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
IEEE
2025-01-01
|
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10844085/ |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | This paper proposes a clustering method for crowdsourced test reports based on a large language model to solve the limitations of existing methods in processing repeated reports and utilizing multi-modal information. Existing crowdsourced test report clustering methods have significant shortcomings in handling duplicate reports, ignoring the semantic information of screenshots, and underutilizing the relationship between text and images. The emergence of LLM provides a new way to solve these problems. By integrating the semantic understanding ability of LLM, key information can be extracted from the test report more accurately, and the semantic relationship between screenshots and text descriptions can be used to guide the clustering process, thus improving the accuracy and effectiveness of clustering. The method in this paper uses a pre-trained LLM (such as GPT-4) to encode the text in the test report, and uses a visual model such as CLIP to encode the application screenshots, converting the text descriptions and images into high-dimensional semantic vectors. The cosine similarity is then used to calculate the similarity between the vectors, and semantic binding rules are constructed to guide the clustering process, ensuring that semantically related reports are assigned to the same cluster and semantically different reports are assigned to different clusters. Through experimental verification, this method is significantly superior to traditional methods in several evaluation indicators, demonstrating its great potential in improving the efficiency and quality of crowdsourced test report processing. In the future, this method is expected to be widely used in the process of software testing and maintenance, and further promote technological progress. |
---|---|
ISSN: | 2169-3536 |