Entropy-extreme concept of data gaps filling in a small-sized collection
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-03-01 |
Series: | Egyptian Informatics Journal |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1110866525000143 |
Summary: | The article investigates filling data gaps in a small-sized collection that generalizes information about periodic measurements of the input and output parameters of a target object. To fill the gaps, a concept is proposed based on generating a committee of entropy-optimal trajectories by sampling the probability density functions of parameters from a stochastic parameterized model trained on relevant data. The concept is generalized to gaps in output data, in input data, and in both data spaces. Gaps in output data are filled using entropy-extreme estimation of the probability density functions of the model parameters and of the measurement errors. Missing values in input data are interpreted as the results of transforming a sequence of independent stochastic vectors fed into a model structurally identical to the one formalized for filling gaps in output data. The proposed concept thus inherits the benefits of both parametric estimation using a trained model of the target process and non-parametric estimation of the undefined characteristics that distort the data. The concept was tested on filling gaps in a collection of 35 tuples containing measurement results for three attributes. The imperfection of the measurement procedure was assumed to cause variability in the obtained data of 15% of their absolute value. Less than 20% of the data in the collection was used to train the corresponding entropy-extreme model. The relative error of the filled missing data was 0.21. |
---|---|
ISSN: | 1110-8665 |
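
The committee idea described in the summary can be illustrated with a minimal sketch. It is not the authors' entropy-extreme estimator: as a labeled stand-in, the parameter density is approximated by a Gaussian around a least-squares fit, the model is a hypothetical linear one, and the measurement variability is drawn as uniform multiplicative noise of up to 15%. The function names, the synthetic data, and the choice of training indices are all assumptions made for illustration; only the collection size (35 tuples), the training share (under 20%), and the 15% variability level come from the summary.

```python
import numpy as np


def fill_output_gaps(x, y, train_idx, n_members=500, noise_level=0.15, seed=0):
    """Fill NaN entries of y with the mean of a committee of sampled trajectories."""
    rng = np.random.default_rng(seed)
    # Design matrix of a hypothetical linear model y = a * x + b.
    A = np.column_stack([x, np.ones_like(x)])
    A_train, y_train = A[train_idx], y[train_idx]
    # Stand-in for entropy-extreme estimation: least-squares point estimate...
    theta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    # ...and a Gaussian parameter density built from the residual variance.
    resid = y_train - A_train @ theta
    dof = max(len(train_idx) - A.shape[1], 1)
    cov = (resid @ resid / dof) * np.linalg.inv(A_train.T @ A_train)
    # Sample one parameter vector per committee member.
    thetas = rng.multivariate_normal(theta, cov, size=n_members)
    # Each row is one trajectory over all x, distorted by multiplicative noise.
    trajectories = thetas @ A.T
    noise = rng.uniform(-noise_level, noise_level, size=trajectories.shape)
    trajectories = trajectories * (1.0 + noise)
    # Fill only the missing positions with the committee mean.
    gaps = np.isnan(y)
    filled = y.copy()
    filled[gaps] = trajectories.mean(axis=0)[gaps]
    return filled


def relative_error(y_true, y_filled, gaps):
    """Norm of the error on the filled positions relative to the true values."""
    return np.linalg.norm(y_filled[gaps] - y_true[gaps]) / np.linalg.norm(y_true[gaps])


# Synthetic collection of 35 tuples (assumed data, not from the article).
x = np.linspace(1.0, 10.0, 35)
y_true = 2.0 * x + 1.0
gaps = np.zeros(35, dtype=bool)
gaps[[5, 12, 20, 28]] = True
y = y_true.copy()
y[gaps] = np.nan

# Six of 35 tuples (about 17%) are used for training; none of them is a gap.
train_idx = np.array([0, 7, 15, 23, 30, 34])
filled = fill_output_gaps(x, y, train_idx)
err = relative_error(y_true, filled, gaps)
print(f"relative error on filled gaps: {err:.4f}")
```

Averaging over the committee is one simple way to collapse the sampled trajectories into a single filled value; a median or an interval derived from the committee spread would be equally reasonable choices under the same assumptions.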