TY - JOUR
T1 - Multivariate analytics of chromatographic data
T2 - Visual computing based on moving window factor models
AU - Steinwandter, Valentin
AU - Šišmiš, Michal
AU - Sagmeister, Patrick
AU - Bodenhofer, Ulrich
AU - Herwig, Christoph
N1 - Publisher Copyright: © 2018 Elsevier B.V.
PY - 2018/8/15
Y1 - 2018/8/15
N2 - Chromatography is one of the most versatile unit operations in the biotechnological industry. Regulatory initiatives like Process Analytical Technology and Quality by Design led to the implementation of new chromatographic devices. These represent an almost inexhaustible source of data. However, the analysis of large datasets is complicated, and significant amounts of information remain hidden in big data. Here we present a new, top-down approach for the systematic analysis of chromatographic datasets. The goal of this approach is to analyze the dataset as a whole, starting with the most important, global information. The workflow should highlight interesting regions (outliers, drifts, data inconsistencies) and help to localize those regions within a multidimensional space in a straightforward way. Moving window factor models were used to extract the most important information, focusing on the differences between samples. The prototype was implemented as an interactive visualization tool for the exploratory analysis of complex datasets. We found that the tool makes it convenient to localize variance in a multidimensional dataset and allows explainable variance to be distinguished from unexplainable variance. Starting with one global difference descriptor per sample, the analysis ends with highly resolved, time-dependent difference descriptor values, intended as a starting point for the detailed analysis of the underlying raw data.
AB - Chromatography is one of the most versatile unit operations in the biotechnological industry. Regulatory initiatives like Process Analytical Technology and Quality by Design led to the implementation of new chromatographic devices. These represent an almost inexhaustible source of data. However, the analysis of large datasets is complicated, and significant amounts of information remain hidden in big data. Here we present a new, top-down approach for the systematic analysis of chromatographic datasets. The goal of this approach is to analyze the dataset as a whole, starting with the most important, global information. The workflow should highlight interesting regions (outliers, drifts, data inconsistencies) and help to localize those regions within a multidimensional space in a straightforward way. Moving window factor models were used to extract the most important information, focusing on the differences between samples. The prototype was implemented as an interactive visualization tool for the exploratory analysis of complex datasets. We found that the tool makes it convenient to localize variance in a multidimensional dataset and allows explainable variance to be distinguished from unexplainable variance. Starting with one global difference descriptor per sample, the analysis ends with highly resolved, time-dependent difference descriptor values, intended as a starting point for the detailed analysis of the underlying raw data.
KW - Chromatography
KW - Data analysis
KW - Outlier detection
KW - Visualization
KW - Multivariate Analysis
KW - Data Interpretation, Statistical
KW - Algorithms
KW - Databases, Factual
UR - http://www.scopus.com/inward/record.url?scp=85048298939&partnerID=8YFLogxK
U2 - 10.1016/j.jchromb.2018.06.010
DO - 10.1016/j.jchromb.2018.06.010
M3 - Article
C2 - 29906679
AN - SCOPUS:85048298939
SN - 1570-0232
VL - 1092
SP - 179
EP - 190
JO - Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences
JF - Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences
ER -