02983nas a2200337   4500000000100000008004100001260003200042653002100074653001700095653001000112100001000122700001000132700001100142700001100153700000900164700001200173700001100185700001100196700000900207700000900216700001000225700001100235700001000246700000900256245010800265856009500373300001100468490000700479520214500486022001402631       2026                            d  c05/2026bFrontiers Media SA10aMachine learning10aMisdiagnosis10aChina1 aGuo Y1 aYin L1 aYang H1 aYang X1 aYu X1 aZhang C1 aZhou L1 aZhao F1 aLu S1 aHe Q1 aHan L1 aWang W1 aLiu Y1 aLi Y00aMachine learning methods to predict leprosy misdiagnosis in Yunnan Province, People's Republic of China  uhttps://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2026.1785606/pdf  a1 - 110 v143 a<p><strong>Background and objective</strong></p>

<p>Leprosy, a chronic infectious disease caused by Mycobacterium leprae, is often misdiagnosed. This study aimed to identify factors associated with leprosy misdiagnosis and to develop and compare machine learning (ML) models that predict the risk of misdiagnosis.</p>

<p><strong>Methods</strong></p>

<p>A retrospective analysis was conducted on clinical and epidemiological data from 486 patients diagnosed with leprosy. The outcome was a binary variable indicating whether a patient had experienced a prior misdiagnosis. Features analyzed included sociodemographic factors, clinical characteristics, and epidemiological exposures. Feature selection was performed using LASSO regression. Class imbalance was handled using the synthetic minority oversampling technique (SMOTE). Nine ML models were trained and validated with an 80–20 data split. Model performance was evaluated based on AUC-ROC, sensitivity, and specificity. Important features were interpreted using the SHapley Additive exPlanation (SHAP) technique.</p>
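
<p>The split and oversampling steps can be illustrated with a minimal sketch in plain Python. It is not the authors' code. The function names <code>stratified_split</code> and <code>smote_oversample</code>, the stratification by outcome, the neighbour count <code>k</code>, and the seeds are all assumptions made for illustration; the abstract does not state whether the split was stratified or whether oversampling was applied before or after splitting.</p>

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class.

    Stratifying by class is an assumption; the abstract only states 80-20.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows in the style of SMOTE.

    Each synthetic row lies on the segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    Requires at least two minority rows with numeric features.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Outcome counts reported in the abstract: 159 misdiagnosed out of 486.
labels = [1] * 159 + [0] * 327
train_idx, test_idx = stratified_split(labels)
```

<p>In practice, oversampling is usually applied to the training portion only, so that the held-out data keep the original class balance; whether the study did so is not stated in the abstract.</p>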

<p><strong>Results</strong></p>

<p>Among 486 leprosy patients, 159 (32.7%) had experienced a misdiagnosis. Nineteen features were selected for model development. The best-performing model was a neural network, which demonstrated the most balanced performance (AUC: 0.79 and 0.68, sensitivity: 0.93 and 0.78, specificity: 0.68 and 0.57 in the training and test sets, respectively). The SHAP analysis identified key predictors of leprosy misdiagnosis, including mode of detection, aspartate aminotransferase level, gender, and the presence of skin lesions. In addition, ethnicity, education, leprosy reaction, household contact with an active case, and source of infection also contributed to the prediction of leprosy misdiagnosis.</p>
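
<p>For readers less familiar with the reported metrics, the following sketch shows how sensitivity, specificity, and AUC-ROC are computed from true labels and predicted scores. The toy labels, scores, and the 0.5 decision threshold are invented for illustration and are not study data; the function names are hypothetical.</p>

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (not study data): 1 = previously misdiagnosed, 0 = not.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auc_roc(y_true, scores)

# The proportion reported in the abstract: 159 of 486 patients.
misdiagnosis_rate = 159 / 486
```

<p>Sensitivity and specificity depend on the chosen threshold, whereas AUC-ROC summarises ranking quality across all thresholds, which is why the two kinds of metric are reported side by side.</p>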

<p><strong>Conclusion</strong></p>

<p>Applying ML to clinical data can effectively identify leprosy patients at high risk of being misdiagnosed using clinical, social, and epidemiological characteristics. An ML-based support tool could help frontline healthcare providers avoid overlooking leprosy.</p>
  a2296-2565