Automated identification of bulk structures, two-dimensional materials, and interfaces using symmetry-based clustering

Abstract With the rapidly increasing amount of materials data being generated in a variety of projects, efficient and accurate classification of atomistic structures is essential. A current barrier to effective database queries lies in the often ambiguous, inconsistent, or completely missing classif...

Bibliographic Details
Main Authors: Thea Denell, Lauri Himanen, Markus Scheidgen, Claudia Draxl
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: npj Computational Materials
Online Access: https://doi.org/10.1038/s41524-024-01498-x
author Thea Denell
Lauri Himanen
Markus Scheidgen
Claudia Draxl
collection DOAJ
description Abstract With the rapidly increasing amount of materials data being generated in a variety of projects, efficient and accurate classification of atomistic structures is essential. A current barrier to effective database queries lies in the often ambiguous, inconsistent, or completely missing classification of existing data, highlighting the need for standardized, automated, and verifiable classification methods. This work proposes a robust solution for identifying and classifying a wide spectrum of materials through an iterative technique called symmetry-based clustering (SBC). Because SBC is not a machine learning-based method, it requires no prior training. Instead, it identifies clusters in atomistic systems by automatically recognizing common unit cells. We demonstrate the potential of SBC to provide automated, reliable classification and to reveal well-known symmetry properties of various materials. Even noisy systems are shown to be classifiable, demonstrating the suitability of our algorithm for real-world data applications. The software implementation is provided in the open-source Python package MatID, exploiting synergies with popular atomic-structure manipulation libraries and extending the accessibility of those libraries through the NOMAD platform.
format Article
id doaj-art-4786cf33cd1346fcb17dfaa4ad093e73
institution Kabale University
issn 2057-3960
language English
publishDate 2025-02-01
publisher Nature Portfolio
record_format Article
series npj Computational Materials
spelling doaj-art-4786cf33cd1346fcb17dfaa4ad093e73
2025-02-09T12:46:40Z
eng
Nature Portfolio
npj Computational Materials
2057-3960
2025-02-01
11119
10.1038/s41524-024-01498-x
Automated identification of bulk structures, two-dimensional materials, and interfaces using symmetry-based clustering
Thea Denell (0), Physics Department and CSMB, Humboldt-Universität zu Berlin
Lauri Himanen (1), Physics Department and CSMB, Humboldt-Universität zu Berlin
Markus Scheidgen (2), Physics Department and CSMB, Humboldt-Universität zu Berlin
Claudia Draxl (3), Physics Department and CSMB, Humboldt-Universität zu Berlin
https://doi.org/10.1038/s41524-024-01498-x
title Automated identification of bulk structures, two-dimensional materials, and interfaces using symmetry-based clustering
url https://doi.org/10.1038/s41524-024-01498-x