Decoding substance use disorder severity from clinical notes using a large language model

Abstract Substance use disorder (SUD) poses a major concern due to its detrimental effects on health and society. SUD identification and treatment depend on a variety of factors such as severity, co-determinants (e.g., withdrawal symptoms), and social determinants of health. Existing diagnostic codi...

Bibliographic Details
Main Authors: Maria Mahbub, Gregory M. Dams, Sudarshan Srinivasan, Caitlin Rizy, Ioana Danciu, Jodie Trafton, Kathryn Knight
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: npj Mental Health Research
Online Access: https://doi.org/10.1038/s44184-024-00114-6
author Maria Mahbub
Gregory M. Dams
Sudarshan Srinivasan
Caitlin Rizy
Ioana Danciu
Jodie Trafton
Kathryn Knight
collection DOAJ
description Abstract: Substance use disorder (SUD) poses a major concern due to its detrimental effects on health and society. SUD identification and treatment depend on a variety of factors such as severity, co-determinants (e.g., withdrawal symptoms), and social determinants of health. Existing diagnostic coding systems used by insurance providers, like the International Classification of Diseases (ICD-10), lack granularity for certain diagnoses, but American clinicians add this granularity (such as that found in the Diagnostic and Statistical Manual of Mental Disorders, DSM-5) as supplemental unstructured text in clinical notes. Traditional natural language processing (NLP) methods face limitations in accurately parsing such diverse clinical language. Large language models (LLMs) offer promise in overcoming these challenges by adapting to diverse language patterns. This study investigates the application of LLMs for extracting severity-related information for various SUD diagnoses from clinical notes. We propose a workflow employing zero-shot learning of LLMs with carefully crafted prompts and post-processing techniques. Through experimentation with Flan-T5, an open-source LLM, we demonstrate its superior recall compared to the rule-based approach. Focusing on 11 categories of SUD diagnoses, we show the effectiveness of LLMs in extracting severity information, contributing to improved risk assessment and treatment planning for SUD patients.
format Article
id doaj-art-88ba39d7cb1b482da352aeeac6c8e679
institution Kabale University
issn 2731-4251
language English
publishDate 2025-02-01
publisher Nature Portfolio
record_format Article
series npj Mental Health Research
spelling Decoding substance use disorder severity from clinical notes using a large language model. npj Mental Health Research, Nature Portfolio, ISSN 2731-4251, 2025-02-01. https://doi.org/10.1038/s44184-024-00114-6
Author affiliations:
Maria Mahbub: Oak Ridge National Laboratory
Gregory M. Dams: Program Evaluation and Resource Center, Office of Mental Health and Office of Suicide Prevention, Veterans Health Administration, Department of Veterans Affairs
Sudarshan Srinivasan: Oak Ridge National Laboratory
Caitlin Rizy: Oak Ridge National Laboratory
Ioana Danciu: Oak Ridge National Laboratory
Jodie Trafton: Program Evaluation and Resource Center, Office of Mental Health and Office of Suicide Prevention, Veterans Health Administration, Department of Veterans Affairs
Kathryn Knight: Oak Ridge National Laboratory
title Decoding substance use disorder severity from clinical notes using a large language model
url https://doi.org/10.1038/s44184-024-00114-6