mbctools: A User-Friendly Metabarcoding and Cross-Platform Pipeline for Analyzing Multiple Amplicon Sequencing Data across a Large Diversity of Organisms

We developed a Python package called mbctools, designed to offer a cross-platform tool for processing amplicon data from various organisms in the context of metabarcoding studies. It can handle the most common tasks in metabarcoding pipelines such as paired-end merging, primer trimming, quality filt...

Bibliographic Details
Main Authors: Barnabé, Christian; Sempéré, Guilhem; Manzanilla, Vincent; Millan, Joel Moo; Amblard-Rambert, Antoine; Waleckx, Etienne
Format: Article
Language: English
Published: Peer Community In 2024-12-01
Series: Peer Community Journal
Subjects: metabarcoding; amplicon; reproducibility; VSEARCH; pipeline
Online Access: https://peercommunityjournal.org/articles/10.24072/pcjournal.501/
author Barnabé, Christian
Sempéré, Guilhem
Manzanilla, Vincent
Millan, Joel Moo
Amblard-Rambert, Antoine
Waleckx, Etienne
author_sort Barnabé, Christian
collection DOAJ
description We developed a Python package called mbctools, a cross-platform tool for processing amplicon data from various organisms in metabarcoding studies. It handles the most common tasks in metabarcoding pipelines, such as paired-end merging, primer trimming, quality filtering, sequence denoising, and zero-radius operational taxonomic unit (ZOTU) filtering, and can process multiple genetic markers simultaneously. mbctools is a menu-driven program that removes the need for command-line expertise and documents each analysis for reproducibility. Designed to run in a console, the software offers an interactive experience guided by keyboard inputs; it assists users throughout data processing and hides the complexity of command lines, letting them concentrate on selecting the parameters to apply at each step. In our workflow, VSEARCH is used to process FASTQ files derived from amplicon-based Next-Generation Sequencing data. VSEARCH is a versatile open-source tool for processing amplicon sequences, offering high speed, efficient memory usage, and the ability to handle large datasets. It provides functions for tasks such as dereplication, clustering, chimera detection, and taxonomic assignment, and is thus very efficient in retrieving the overall diversity of a sample. To adapt to the diversity of metabarcoding projects, we facilitate the reprocessing of datasets with adjusted parameters. mbctools can also be launched in a headless mode, making it suited for integration into pipelines running in High-Performance Computing environments. mbctools is available at https://github.com/GuilhemSempere/mbctools and https://pypi.org/project/mbctools/.
format Article
id doaj-art-99ead1c3c12541d280097dfee32a0d42
institution Kabale University
issn 2804-3871
language English
publishDate 2024-12-01
publisher Peer Community In
record_format Article
series Peer Community Journal
spelling doaj-art-99ead1c3c12541d280097dfee32a0d42
Record timestamp: 2025-02-07T10:17:17Z
Language: eng
Publisher: Peer Community In
Journal: Peer Community Journal
ISSN: 2804-3871
Published: 2024-12-01
Volume: 4
DOI: 10.24072/pcjournal.501
Title: mbctools: A User-Friendly Metabarcoding and Cross-Platform Pipeline for Analyzing Multiple Amplicon Sequencing Data across a Large Diversity of Organisms
Authors and affiliations:
Barnabé, Christian (https://orcid.org/0000-0003-1857-0741): Institut de Recherche pour le Développement, UMR INTERTRYP IRD, CIRAD, Université de Montpellier, Montpellier, France
Sempéré, Guilhem (https://orcid.org/0000-0001-7429-2091): Centre de Coopération Internationale en Recherche Agronomique pour le Développement, UMR INTERTRYP IRD, CIRAD, Université de Montpellier, Montpellier, France; South Green Bioinformatics Platform, Biodiversity, Montpellier, France
Manzanilla, Vincent (https://orcid.org/0000-0002-0816-163X): Institut de Recherche pour le Développement, UMR INTERTRYP IRD, CIRAD, Université de Montpellier, Montpellier, France
Millan, Joel Moo (https://orcid.org/0000-0002-5901-7689): Laboratorio de Parasitología, Centro de Investigaciones Regionales “Dr Hideyo Noguchi”, Universidad Autónoma de Yucatán, Mérida, Yucatán, México
Amblard-Rambert, Antoine: Institut de Recherche pour le Développement, UMR INTERTRYP IRD, CIRAD, Université de Montpellier, Montpellier, France; Laboratorio de Parasitología, Centro de Investigaciones Regionales “Dr Hideyo Noguchi”, Universidad Autónoma de Yucatán, Mérida, Yucatán, México
Waleckx, Etienne (https://orcid.org/0000-0002-3270-6476): Institut de Recherche pour le Développement, UMR INTERTRYP IRD, CIRAD, Université de Montpellier, Montpellier, France; Laboratorio de Parasitología, Centro de Investigaciones Regionales “Dr Hideyo Noguchi”, Universidad Autónoma de Yucatán, Mérida, Yucatán, México
URL: https://peercommunityjournal.org/articles/10.24072/pcjournal.501/
Keywords: metabarcoding; amplicon; reproducibility; VSEARCH; pipeline
title mbctools: A User-Friendly Metabarcoding and Cross-Platform Pipeline for Analyzing Multiple Amplicon Sequencing Data across a Large Diversity of Organisms
topic metabarcoding
amplicon
reproducibility
VSEARCH
pipeline
url https://peercommunityjournal.org/articles/10.24072/pcjournal.501/
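
Illustrative sketch (not part of the catalog record): the abstract describes a VSEARCH-based workflow covering paired-end merging, quality filtering, dereplication, denoising into ZOTUs, and chimera detection. The Python snippet below only shows how such a chain of VSEARCH calls might be assembled. It is an assumption-laden sketch, not the commands mbctools actually issues: the function name `vsearch_commands`, the output file names, and the default thresholds are hypothetical, and primer trimming and taxonomic assignment are left out. The commands are built as argument lists and printed, not executed.

```python
import shlex


def vsearch_commands(r1, r2, outdir="out", max_ee=1.0, min_size=8):
    """Build (but do not run) VSEARCH command lines for one paired-end sample.

    All file names and default thresholds here are hypothetical choices
    made for illustration; they are not taken from mbctools.
    """
    merged = f"{outdir}/merged.fastq"
    filtered = f"{outdir}/filtered.fasta"
    derep = f"{outdir}/derep.fasta"
    zotus = f"{outdir}/zotus.fasta"
    clean = f"{outdir}/zotus_nochim.fasta"
    return [
        # 1. Merge forward and reverse reads.
        ["vsearch", "--fastq_mergepairs", r1, "--reverse", r2,
         "--fastqout", merged],
        # 2. Discard reads whose expected error count exceeds max_ee.
        ["vsearch", "--fastq_filter", merged,
         "--fastq_maxee", str(max_ee), "--fastaout", filtered],
        # 3. Collapse identical sequences, recording abundances.
        ["vsearch", "--derep_fulllength", filtered,
         "--sizeout", "--output", derep],
        # 4. Denoise into ZOTUs, ignoring sequences rarer than min_size.
        ["vsearch", "--cluster_unoise", derep,
         "--minsize", str(min_size), "--centroids", zotus],
        # 5. Remove chimeric ZOTUs.
        ["vsearch", "--uchime3_denovo", zotus, "--nonchimeras", clean],
    ]


if __name__ == "__main__":
    # Print shell-quoted commands for a hypothetical sample.
    for cmd in vsearch_commands("sample_R1.fastq", "sample_R2.fastq"):
        print(shlex.join(cmd))
```

Keeping each command as a list of arguments rather than a single string makes it straightforward to log every step for reproducibility, or to pass it to `subprocess.run` in a non-interactive run.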