Comparative analysis of generative LLMs for labeling entities in clinical notes

Bibliographic Details
Main Authors: Rodrigo del Moral-González, Helena Gómez-Adorno, Orlando Ramos-Flores
Format: Article
Language: English
Published: BioMed Central 2025-02-01
Series: Genomics & Informatics
Online Access: https://doi.org/10.1186/s44342-024-00036-x
collection DOAJ
description Abstract This paper evaluates and compares different fine-tuned variations of generative large language models (LLMs) on the zero-shot named entity recognition (NER) task in the clinical domain. As part of the 8th Biomedical Linked Annotation Hackathon, we examined Llama 2 and Mistral models, including base versions and versions fine-tuned for code, chat, and instruction-following tasks. We assess both the number of correctly identified entities and the models’ ability to return entities in structured formats. For the evaluation, we used a publicly available set of clinical cases labeled with mentions of diseases, symptoms, and medical procedures. Results show that instruction fine-tuned models recognize entities better than chat fine-tuned and base models. Models also perform better when simple output structures are requested.
id doaj-art-5300776c0ed1487bbdc52fac13928cd7
institution Kabale University
issn 2234-0742
spelling doaj-art-5300776c0ed1487bbdc52fac13928cd7
Record timestamp: 2025-02-09T12:09:11Z
Volume 23, issue 1, pages 1-8
Author affiliations:
Rodrigo del Moral-González: Posgrado en Ciencia e Ingeniería de la Computación, Universidad Nacional Autónoma de México
Helena Gómez-Adorno: Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México
Orlando Ramos-Flores: Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México
topic Zero-shot
Named entity recognition
Generative language models
Clinical domain
BLAH8