Statistical and machine learning based platform-independent key genes identification for hepatocellular carcinoma.

Hepatocellular carcinoma (HCC) is the most prevalent and deadly form of liver cancer, and its mortality rate is gradually increasing worldwide. Existing studies used genetic datasets, taken from various platforms, but focused only on common differentially expressed genes (DEGs) across platforms. Con...

Bibliographic Details
Main Authors: Md Al Mehedi Hasan, Md Maniruzzaman, Jie Huang, Jungpil Shin
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0318215
collection DOAJ
description Hepatocellular carcinoma (HCC) is the most prevalent and deadly form of liver cancer, and its mortality rate is gradually increasing worldwide. Existing studies used genetic datasets taken from various platforms but focused only on the differentially expressed genes (DEGs) common across platforms. Consequently, these studies may have missed some important genes in the investigation of HCC. To solve this problem, we took datasets from multiple platforms and designed a statistical and machine learning-based system to determine platform-independent key genes (KGs) for HCC patients. DEGs were determined from each dataset using limma. Individual combined DEGs (icDEGs) were identified for each platform, and grand combined DEGs (gcDEGs) were then determined from the icDEGs of all platforms. Differentially expressed discriminative genes (DEDGs) were determined based on classification accuracy using a support vector machine. We constructed a protein-protein interaction (PPI) network from the DEDGs and identified hub genes using MCC. We determined the optimal modules using the MCODE scores of the PPI network and selected their gene combinations. We combined all genes obtained from previous studies to form metadata, known as meta-hub genes. Finally, six KGs (CDC20, TOP2A, CENPF, DLGAP5, UBE2C, and RACGAP1) were selected by intersecting the overlapping hub genes, meta-hub genes, and hub module genes. The discriminative power of the six KGs and their prognostic potential were evaluated using AUC and survival analysis.
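The gene-set bookkeeping described in the abstract can be illustrated with a minimal sketch. It assumes that "combined" means a set union (the abstract does not state the combination rule explicitly), and every gene label, platform name, and helper function below is a hypothetical placeholder rather than data or code from the study. Only the final intersection step is stated directly in the abstract.

```python
# Toy illustration of combining DEG sets and intersecting candidate lists.
# All labels ("g1", "platform_A", ...) are hypothetical placeholders.

def combine(gene_sets):
    """Union of several gene sets (ASSUMED combination rule)."""
    return set().union(*gene_sets)

def intersect(gene_sets):
    """Genes present in every one of the given sets."""
    return set.intersection(*map(set, gene_sets))

# DEG sets per dataset, grouped by platform (made-up values).
platform_degs = {
    "platform_A": [{"g1", "g2"}, {"g2", "g3"}],
    "platform_B": [{"g3", "g4"}],
}

# icDEGs: one combined set per platform.
ic_degs = {name: combine(sets) for name, sets in platform_degs.items()}

# gcDEGs: combined across the icDEGs of all platforms.
gc_degs = combine(ic_degs.values())

# Final step: intersect hub, meta-hub, and hub-module gene lists.
hub_genes = {"g2", "g3", "g4"}
meta_hub_genes = {"g3", "g4", "g9"}
hub_module_genes = {"g1", "g3", "g4"}
key_genes = intersect([hub_genes, meta_hub_genes, hub_module_genes])

print(sorted(gc_degs))
print(sorted(key_genes))
```

Using a union for the combination steps keeps genes that appear on only one platform, which matches the stated motivation of not restricting the analysis to DEGs common across platforms. The classification, PPI network, and module-scoring steps are not modelled here.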
id doaj-art-a72da38189c842058fe7ed62536fcd6e
institution Kabale University
issn 1932-6203