Comprehensive modeling of cell culture profile using Raman spectroscopy and machine learning

In this study, we developed a Python program that automates the optimization of the spectral region and the number of principal components used in PLS regression for a variety of target compounds. We used PLS regression as the example for model building. These conditions determine whether model accuracy increases or decreases, and Bayesian optimization proves to be a powerful optimization technique with low computational cost. Automation and computational speed are particularly important when optimizing models for multiple targets, which makes the approach suitable for building models for many components. Furthermore, we applied the same method to linear regression, ridge regression, XGBoost, and a neural network, demonstrating that the optimization developed for PLS regression modeling is versatile. This showcases the usefulness of Python programming in selecting the optimal model-building conditions.

In this study, we used the Raman sampling module of the small-scale culture vessel Ambr 250 to acquire Raman spectral data. Constructing a Raman model requires a large number of data points. Conventionally, a Raman spectrometer probe is inserted into a glass vessel to acquire data, but this method is time-consuming, because each sensor can acquire data from only one culture. In contrast, our approach allows the simultaneous acquisition of Raman spectral data from multiple cultures, enabling the construction of Raman models using data from a single culture run. This method is a simple and efficient way to construct Raman models, fits Raman modeling into the drug development timeline, and serves as a valuable monitoring method in process development and GMP manufacturing. In this study, we constructed models using Raman spectra measured in a microfluidic channel.
It has been reported that comparable models can be created using this measurement technique and by measuring directly with sensors inside the bioreactor [25]. However, it is important to consider the potential for heterogeneity within the vessel, for example in dissolved oxygen, when scaling up from fluid analysis [37]. This heterogeneity must be considered even when Raman sensors are used to measure compound concentrations. For model validation, we used five reactors for model building and the remaining reactor as a test dataset to evaluate predictive accuracy. This was done so that the time-course changes in the predicted data of the test dataset could be observed clearly, as shown in Fig. 2F. However, when actually constructing models with this method, even more accurate models may be obtained by using cross-validation, in which test data are randomly sampled from the data of all reactors.

PLS regression is commonly used for constructing Raman models. Datasets of Raman spectra contain numerous explanatory variables, making them suitable for regression methods that involve dimensionality reduction, such as PLS regression. PLS regression has the advantages of high speed and comprehensive model construction for each measured object. In this study, we also tested linear regression, ridge regression, XGBoost, and a neural network. Some machine-learning methods exhibited modeling performance surpassing that of PLS regression. Interestingly, the improvement in model accuracy differs depending on the category of the compound. Our results suggest that selecting the machine-learning method improved model accuracy more for metabolites and vitamins than for amino acids.
These compounds belonged to a group with relatively low accuracy in PLS regression, and it is possible that the gain in accuracy is greater for compounds that PLS regression models poorly. Methods other than PLS regression, such as XGBoost, can improve modeling accuracy for certain targets, as shown by the BiP levels in Fig. 4. PLS regression is a linear regression method that selects principal components to capture a linear relationship between predictor and response variables. It reduces multicollinearity and enables accurate models for multivariate data. XGBoost, on the other hand, is a non-linear algorithm that combines decision trees to capture complex patterns. It evaluates feature importance and employs ensemble learning for more accurate predictions. For some categories of compound, XGBoost may outperform PLS regression because of its ability to capture non-linear relationships, select more appropriate features, and reduce bias and variance through ensemble learning. Hyperparameter tuning was not performed for XGBoost and neural networks in this study, but performing hyperparameter tuning in advanced computational environments may lead to models that outperform PLS regression, linear regression, and ridge regression.

Through the comprehensive analysis of Raman models of various compounds, it was found that the model accuracy for amino acids was generally high, while the accuracy for vitamins was lower. This discrepancy can be explained by two possible factors. First, the Raman spectra may not detect changes when the compound concentration is too low. Second, the accuracy of the offline measurement, which determines the compound concentration using LC–MS, may have been compromised at low concentrations, leading to lower accuracy in the Raman measurement model.
Improving the accuracy of the Raman measurement model therefore requires more accurate offline measurements. To further improve model accuracy, it is worth considering incorporating information other than Raman spectra into the model. Previous studies have proposed models that incorporate computational fluid dynamics models [38] or that include process-related impurities and the kinetics of each cultivation [39], suggesting that combining this information with Raman spectral data may lead to even more accurate models. Additionally, improving the model construction methods is expected to further enhance model accuracy. Narayanan et al. proposed a model construction method that incorporates a Kalman filter [40], while Poth et al. comprehensively validated algorithms other than those used in this study [41]. Extending the model construction methods as reported in these studies might further improve the accuracy of models for compounds that are currently modeled less accurately. Furthermore, using various variable selection methods in addition to Bayesian optimization, as mentioned in the Introduction, may also improve accuracy. In this study, the specificity of the measurement was confirmed for glucose by observing the concentration increase upon the addition of a glucose solution. Ideally, an addition experiment should be performed for each compound to confirm specificity. However, specificity can also be demonstrated by constructing a model from data of multiple cultures with different profiles. In this study, data from six culture vessels with distinct profiles were used, and the measurement models of each culture vessel were constructed, which suggests that the measurement results are specific.

Raman spectra primarily detect covalent bonds of compounds in solution, which in theory prevents the detection of metal ions, among others.
Interestingly, through exhaustive model construction, models were constructed for compounds that theoretically could not be detected in Raman spectra, such as hydrogen ions, oxygen, carbon dioxide, and metal ions. Additionally, models were constructed for variables that are not physical substances, such as cell viability. Some compound levels correlated with the values to be measured, which indirectly allowed the construction of measurement models (Fig. 4E). For instance, cell viability is known to correlate with LDH [42], suggesting that cell viability could be measured indirectly by using the LDH level determined with a Raman spectrometer as a proxy. This allows the measurement of not only the concentration of a specific compound but also, through certain calculations, all variables that characterize a cell culture. It is also possible to estimate the levels of compounds based on the spectral region and model coefficients used for model construction, which contributes to the identification of metabolites that correlate with specific parameters.

This study demonstrates the comprehensive construction of highly precise Raman models for measuring the concentrations of various compounds. This allows continuous acquisition of a wide range of culture data using a Raman spectrometer, enabling real-time monitoring and feedback control of culture conditions. While previous Raman measurements and feedback controls focused on glucose and amino acid concentrations, the exhaustive model construction approach may facilitate faster medium development by continuously optimizing a wider range of components.

This approach can easily be extended to model factors such as omics data. By applying the method used in this study, modeling can be performed for various parameters beyond medium components and metabolites.
We successfully constructed a predictive model for BiP, an endoplasmic reticulum stress-related factor, with good precision. Additionally, we constructed a model for oxidized glutathione, an oxidative stress-related factor, suggesting the potential for monitoring not only compound concentrations but also various stress markers. Raman modeling can be considered a feature extraction technique for quantifying culture characteristics, and it is highly compatible with AI-related technologies, which have advanced remarkably in recent years. Previous studies predicted transcriptome data from Raman spectra [43], providing a foundation for predicting multivariate or numerical values. With these technologies, we can develop more comprehensive and accurate models for a broad range of parameters.

https://www.nature.com/articles/s41598-023-49257-0
