Bio-primed machine learning to enhance discovery of relevant biomarkers

Bibliographic Details
Main Authors: David M. Henke, Alexander Renwick, Joseph R. Zoeller, Jitendra K. Meena, Nicholas J. Neill, Elizabeth A. Bowling, Kristen L. Meerbrey, Thomas F. Westbrook, Lukas M. Simon
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: npj Precision Oncology
Online Access: https://doi.org/10.1038/s41698-025-00825-9
collection DOAJ
description Abstract Precision medicine relies on identifying reliable biomarkers for gene dependencies to tailor individualized therapeutic strategies. The advent of high-throughput technologies presents unprecedented opportunities to explore molecular disease mechanisms but also challenges due to high dimensionality and collinearity among features. Traditional statistical methods often fall short in this context, necessitating novel computational approaches that harness the full potential of big data in bioinformatics. Here, we introduce a novel machine learning approach extending the Least Absolute Shrinkage and Selection Operator (LASSO) regression framework to incorporate biological knowledge, such as protein-protein interaction databases, into the regularization process. This bio-primed approach prioritizes variables that are both statistically significant and biologically relevant. Applying our method to multiple dependency datasets, we identified biomarkers which traditional methods overlooked. Our biologically informed LASSO method effectively identifies relevant biomarkers from high-dimensional collinear data, bridging the gap between statistical rigor and biological insight. This method holds promise for advancing personalized medicine by uncovering novel therapeutic targets and understanding the complex interplay of genetic and molecular factors in disease.
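The idea of folding prior biological knowledge into the LASSO penalty can be illustrated with a per-feature weighted L1 penalty. The sketch below is an assumption for illustration only and is not taken from the article: the objective (1/2n)·||y − Xβ||² + λ·Σ w_j·|β_j|, the helper names `penalty_weights_from_prior` and `weighted_lasso`, the `min_weight` parameter, and the toy prior scores (standing in for something like a protein-protein interaction score in [0, 1]) are all hypothetical. It uses plain coordinate descent in pure Python.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def penalty_weights_from_prior(prior_scores, min_weight=0.2):
    """Map hypothetical prior scores in [0, 1] to penalty weights.

    A score of 0 keeps the full penalty (weight 1.0); a score of 1
    reduces the penalty to min_weight. This mapping is an assumption.
    """
    return [1.0 - (1.0 - min_weight) * s for s in prior_scores]


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot explain anything
            # Correlation of feature j with the partial residual.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy data: features 0 and 1 are perfectly collinear, feature 2 is unrelated.
X = [[-1.0, -1.0, 1.0],
     [-1.0, -1.0, -1.0],
     [1.0, 1.0, 1.0],
     [1.0, 1.0, -1.0]]
y = [-1.0, -1.0, 1.0, 1.0]

# Uniform penalty: the fit has no reason to prefer either collinear feature.
beta_plain = weighted_lasso(X, y, lam=0.1, weights=[1.0, 1.0, 1.0])

# Hypothetical prior: only feature 1 has biological support.
bio_weights = penalty_weights_from_prior([0.0, 1.0, 0.0])
beta_bio = weighted_lasso(X, y, lam=0.1, weights=bio_weights)

print("uniform penalty:   ", beta_plain)
print("bio-primed penalty:", beta_bio)
```

On this toy data the uniform penalty leaves the coefficient on whichever collinear feature the solver visits first, while the weighted penalty moves it to the feature with prior support. A real analysis would more likely use an established solver that accepts per-feature penalty factors; the loop above is kept explicit only to make the role of the weights visible.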
id doaj-art-d436801f29e44bffab4adc679525956f
institution Kabale University
issn 2397-768X
spelling Record ID: doaj-art-d436801f29e44bffab4adc679525956f
Timestamp: 2025-02-09T12:09:28Z
Language: eng
Publisher: Nature Portfolio
Series: npj Precision Oncology
ISSN: 2397-768X
Published: 2025-02-01
Volume 9, Issue 1, Pages 1-10
DOI: 10.1038/s41698-025-00825-9
Title: Bio-primed machine learning to enhance discovery of relevant biomarkers
Authors and affiliations:
David M. Henke: Molecular Virology & Microbiology, Baylor College of Medicine
Alexander Renwick: Department of Statistics, Rice University
Joseph R. Zoeller: Medical Scientist Training Program, Baylor College of Medicine
Jitendra K. Meena: Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine
Nicholas J. Neill: Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine
Elizabeth A. Bowling: Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine
Kristen L. Meerbrey: Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine
Thomas F. Westbrook: Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine
Lukas M. Simon: Verna and Marrs McLean Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine
Online Access: https://doi.org/10.1038/s41698-025-00825-9