Osteoporosis, usually termed the “silent disease”, is a prevalent situation that reduces bone density, predisposing people to elevated fracture risk. Notably, the absence of overt signs till a fracture happens underscores the urgency for early detection and preventive strategies19. Fractures, significantly hip fractures, related to osteoporosis, usually lead to substantial morbidity, elevated mortality, and vital health-care costs20. The societal and financial implications of osteoporosis-related fractures make predicting the disease an crucial not simply from a scientific perspective but additionally from public well being and financial viewpoints21.Early prediction and identification of osteoporosis can pave the manner for well timed interventions, probably decelerating and even reversing bone loss. This not solely diminishes fracture risk but additionally bolsters the high quality of life for the aged inhabitants, making certain higher independence and diminished healthcare expenditure22. Interventions, which vary from life-style modifications to pharmaceutical therapies, have proven to be significantly more practical when osteoporosis is recognized at nascent stages23.The present analysis serves as a testomony to the potential of machine learning in advancing osteoporosis prediction, highlighting a novel strategy that melds the energy of numerous predictive algorithms24. Existing methodologies primarily rely on bone mineral density (BMD) assessments utilizing DXA scans25. Although efficient, these assessments aren’t ubiquitously accessible, might be cost-prohibitive, and infrequently are performed when scientific signs manifest, probably delaying well timed intervention.In many sensible eventualities, particularly in resource-limited settings like communities, it is likely to be difficult or cost-prohibitive to acquire complete life-style data, laboratory check, or superior imaging outcomes. Thus, constructing predictive fashions utilizing data that may be extracted from major healthcare information or neighborhood surveys provides a promising strategy for early screening and detection of osteoporosis in these settings. Cheng Li26 and colleagues efficiently predicted the risk of rotator cuff tears in hospital outpatients utilizing easy questionnaire data and bodily examination findings with machine learning techniques. Similarly, Limin Wang et al.27 used well being questionnaire indicators and regression algorithms to make correct predictions for symptomatic knee osteoarthritis. By figuring out high-risk sufferers by means of easy indicators and recommending additional exact medical examinations for them, this strategy will help scale back pointless medical assessments and save on healthcare prices.This examine, by capitalizing on nationwide major healthcare data from Germany, provides a non-invasive and environment friendly means to predict osteoporosis risk based on well being indicators and chronic circumstances. The broad inclusion of sufferers spanning numerous well being backgrounds ensures the mannequin’s generalizability and applicability in real-world settings. Our intention is to develop a preclinical mannequin that would contribute to early warning and early detection and prognosis for high-risk populations. In this examine, we didn’t embrace medical laboratory check indicators and omics data as predictive components. While this will scale back the mannequin’s efficiency, it additionally has the benefit of decreasing the complexity of the mannequin and enhancing its practicality. Innovation in the subject of machine learning doesn’t all the time imply utilizing the most superior algorithms or complicated characteristic engineering28. Sometimes, simplifying the growth of fashions to enhance their universality and value represents a major type of innovation. Simple fashions are simpler for different researchers to replicate and validate and are extra possible to implement in real-world settings. Our examine outcomes present that the mannequin we developed has an AUC of 0.76, indicating good predictive efficiency.The selection of algorithms in the current examine was pivotal in making certain sturdy prediction efficiency. The preliminary choice included a spread of algorithms, out of which LR, ADA, and GBC emerged as the front-runners in phrases of the AUC metric. Previous analysis in medical diagnostics has emphasised the significance of the AUC as an indicative measure of the mannequin’s functionality to discriminate between constructive and unfavorable instances29. Although the analysis by Meng, Y., et al.30 means that sequential fashions, akin to GRU or LSTM, might outperform non-sequential fashions like LR or XGB, the benefits of these fashions might not be totally leveraged in the context of cross-sectional data alone. In this examine, contemplating the traits of the dataset used for modeling, non-sequential fashions had been adopted as the remaining predictive fashions, which additionally achieved good predictive efficiency. Our findings are congruent with latest literature suggesting the promise of these algorithms in health-related prediction tasks31,32,33.Ensemble strategies have constantly demonstrated their mettle in bettering prediction accuracy by combining the strengths of a number of fashions and ameliorating particular person mannequin limitations34. The use of a stacked ensemble in our examine—a mannequin synergizing the robustness of LR, ADA, and GBC—considerably augmented the AUC throughout inner validation. This strategy capitalizes on the distinct resolution boundaries supplied by every algorithm, thus offering a holistic, complete prediction. This strategy provides larger predictive accuracy over particular person fashions, a discovering in alignment with up to date research on ensemble methods35.The optimum threshold chance of 0.52 derived from the Youden index underscores the balanced consideration of each sensitivity (true constructive fee) and specificity (true unfavorable fee) in the examine. This ensures not solely the appropriate identification of precise osteoporosis instances but additionally minimizes false alarms, which might be crucial in scientific functions to keep away from overdiagnosis and pointless interventions. The achieved carry worth of 1.9 for the stacker mannequin accentuates its capability to successfully determine osteoporosis instances in contrast to random choice, validating its scientific utility.Furthermore, the complete characteristic choice course of and rigorous validation reaffirm the mannequin’s robustness and reliability. The utility of SHAP values for characteristic significance not solely fosters transparency in machine learning predictions but additionally provides scientific insights, serving to healthcare practitioners perceive and prioritize risk factors36.The SHAP values, a complicated device for mannequin interpretability, had been instrumental in figuring out the salience of every characteristic inside our predictive framework. Age and gender emerged as the most paramount components, a discovering that resonates with the broader osteoporosis literature. The long-established relationship between advancing age and decreased bone density makes age a pivotal predictor for osteoporosis risk37. Gender-specific variations, particularly post-menopausal modifications in girls, exacerbate the risk of osteoporosis, emphasizing its significance in our model38.The significance of lipid metabolism issues in predicting osteoporosis in our mannequin offered intriguing insights. Recent research have begun to determine a possible affiliation between dyslipidemia and bone mineral density (BMD) alterations39,40. Lipids play a job in bone metabolism, and aberrations in lipid profiles might adversely have an effect on bone well being. Our mannequin’s emphasis on most cancers as a risk issue underscores the multifaceted relationship between most cancers and osteoporosis. Some therapies for most cancers, particularly these involving hormone therapies, can speed up bone loss, making sufferers extra prone to osteoporosis41,42. COPD has additionally been linked to low bone mineral density and a better risk of fractures. Pulmonary dysfunction and decreased BMD share underlying inflammatory pathways. The chronic inflammatory state in COPD can disrupt bone metabolism, main to elevated osteoporosis risk43. Hypertension has been related to an elevated risk of osteoporosis, probably due to alterations in calcium homeostasis, in addition to the results of antihypertensive medications44. Stroke sufferers additionally face an elevated risk of osteoporosis and fractures, possible due to immobilization and neuronal harm affecting bone metabolism. Similarly, coronary heart failure, CHD, and chronic kidney disease have all been related to an elevated risk of osteoporosis and fractures45,46,47.The crucial of early osteoporosis prediction has by no means been clearer. As populations age globally, the public well being burden of osteoporotic fractures is poised to rise. Against this backdrop, our examine stands as a significant stride in direction of enhancing osteoporosis predictive modalities. Utilizing the open-source major healthcare dataset from IMS HEALTH, which included information from a big quantity of sufferers, we endeavored to develop a machine learning-based predictive mannequin. With additional analysis and validation, we hope the mannequin will help neighborhood healthcare employees in screening sufferers at excessive risk of osteoporosis throughout well being follow-ups utilizing easy indicators. Personalized well being recommendation is given to these high-risk sufferers, and additional medical assessments akin to laboratory assessments or radiology are really useful to make clear the prognosis. This might assist to scale back pointless medical assessments and save healthcare prices whereas making certain that the advantages to osteoporosis sufferers.Several algorithms had been assessed in our endeavor, with the stacked ensemble strategy of combining Logistic Regression (LR), Ada Boost Classifier (ADA), and Gradient Boosted Classifier (GBC) rising as significantly promising. The superiority of this ensemble mannequin underscores the inherent complexities of osteoporosis prediction. It emphasizes that the disease’s multifaceted nature could also be greatest captured by drawing from the strengths of a number of algorithms.
https://www.nature.com/articles/s41598-024-56114-1