Identification of LSA Data Retrieval Method and Temporal Graph for Document Retrieval
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | University North 2025-01-01 |
Series: | Tehnički Glasnik |
Subjects: | |
Online Access: | https://hrcak.srce.hr/file/473450 |
Summary: | Expert finding has attracted a large number of approaches from both universities and industry, drawing on a variety of new techniques from related data fields. This study aims to identify an information retrieval method based on latent semantic analysis (LSA) and a temporal graph for document retrieval. Citation occurrence and author occurrence are the independent variables, and the measures of expert author finding are the dependent variables. The method used to evaluate judgments of document and author relevance in the test set formation phase resembles survey methods. A library method is used to study the theoretical foundations and assess the literature. The study has three populations: a) test set documents; b) people who issue queries and judge the relevance of retrieved documents; c) people who judge the relevance of the retrieved experts. Judgments of document relevance are measured with a method similar to peer tests: repeated results are placed among the retrieved results to determine the accuracy and reliability of each judge. The correlation obtained with this method is very high (0.98), indicating that the results are reliable. The LSA information retrieval model, which was ultimately used to retrieve expert authors, outperformed the baseline model, with precision at the top 5 results (p@5) of 0.895, mean average precision (MAP) of 0.839, and mean reciprocal rank (MRR) of 0.909. The improved retrieval performance can be attributed to the advantage of dimension reduction over keyword matching. |
---|---|
ISSN: | 1846-6168 1848-5588 |
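
The summary reports three ranking metrics: p@5, MAP, and MRR. As a rough illustration of how such figures are commonly computed, the sketch below uses the standard textbook definitions. The record does not say which variants the authors used, so these definitions are an assumption. The function names, query IDs, and document IDs are hypothetical, and the toy data is not taken from the study.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k ranked items that are judged relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant item appears,
    divided by the total number of relevant items (assumed definition)."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: Dict[str, List[str]],
             judgments: Dict[str, Set[str]],
             k: int = 5) -> Dict[str, float]:
    """Average each per-query metric over all queries in `runs`."""
    n = len(runs)
    return {
        "p@k": sum(precision_at_k(runs[q], judgments[q], k) for q in runs) / n,
        "map": sum(average_precision(runs[q], judgments[q]) for q in runs) / n,
        "mrr": sum(reciprocal_rank(runs[q], judgments[q]) for q in runs) / n,
    }


# Toy example: two hypothetical queries with ranked results and relevance judgments.
runs = {
    "q1": ["a", "b", "c", "d", "e"],
    "q2": ["x", "y", "z", "w", "v"],
}
judgments = {
    "q1": {"a", "c"},
    "q2": {"y"},
}
scores = evaluate(runs, judgments, k=5)
print(scores)
```

For the toy data, the first query has relevant items at ranks 1 and 3, and the second has one relevant item at rank 2. The averaged scores are therefore 0.3 for p@5, about 0.667 for MAP, and 0.75 for MRR. Averaging per query, rather than pooling all results, keeps queries with many relevant items from dominating the score.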