Machine learning methods to predict leprosy misdiagnosis in Yunnan Province, People's Republic of China

Authors: Guo Y, Yin L, Yang H, Yang X, Yu X, Zhang C, Zhou L, Zhao F, Lu S, He Q, Han L, Wang W, Liu Y, Li Y
Published: 05/2026, Frontiers Media SA
Volume: 14
Pages: 1-11
Keywords: Machine learning; Misdiagnosis; China
URL: https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2026.1785606/pdf

Background and objective

Leprosy is a chronic infectious disease caused by Mycobacterium leprae that is often misdiagnosed. This study aimed to identify factors associated with leprosy misdiagnosis and develop and compare machine learning (ML) models to predict the risk of misdiagnosis.

Methods

A retrospective analysis was conducted on clinical and epidemiological data from 486 diagnosed leprosy patients. The outcome was a binary variable indicating whether a patient had experienced a prior misdiagnosis. Features analyzed included sociodemographic factors, clinical characteristics, and epidemiological exposures. Feature selection was performed using LASSO regression. Class imbalance was handled using the synthetic minority oversampling technique (SMOTE). Nine ML models were trained and validated with an 80–20 data split. Model performance was evaluated based on AUC-ROC, sensitivity, and specificity. Important features were interpreted using the SHapley Additive exPlanation (SHAP) technique.
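To illustrate the oversampling step, the following is a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library only. It is not the authors' implementation: the function name `smote_sketch`, the parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    randomly chosen minority row and one of its k nearest minority neighbours.

    Hypothetical helper for illustration; not the study's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature toy data, not values from the study.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_rows = smote_sketch(toy_minority, n_new=6, k=2)
```

Each synthetic row lies on the line segment between two real minority rows, so it stays within the range of the observed minority data. A common practice is to oversample only the training portion of a split so that the held-out portion keeps the original class balance; the abstract does not state which order was used here.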

Results

Among 486 leprosy patients, 159 (32.7%) had been misdiagnosed. Nineteen features were selected for model development. The best-performing model was the neural network, which showed the most balanced performance (AUC: 0.79 and 0.68, sensitivity: 0.93 and 0.78, and specificity: 0.68 and 0.57 in the training and test sets, respectively). The SHAP analysis identified key predictors of leprosy misdiagnosis, including mode of detection, aspartate aminotransferase level, gender, and the presence of skin lesions. In addition, ethnicity, education, leprosy reaction, household contact with an active case, and source of infection also contributed to the prediction of leprosy misdiagnosis.
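For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC-ROC can be computed from labels and predicted scores, and checks the misdiagnosis proportion given above. The function names, the 0.5 threshold, and the toy labels and scores are assumptions for illustration; only the counts 159 and 486 come from the abstract.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1/0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Proportion reported in the abstract: 159 misdiagnosed of 486 patients.
misdiagnosis_rate = 159 / 486

# Hypothetical labels and model scores, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auc_roc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the three are usually reported together.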

Conclusion

Applying ML to clinical data can effectively identify leprosy patients at high risk of being misdiagnosed using clinical, social, and epidemiological characteristics. An ML-based support tool could help frontline healthcare providers avoid overlooking leprosy.

ISSN: 2296-2565