Comparative evaluation of logistic regression and decision tree in predicting multidrug-resistant tuberculosis recurrence
https://doi.org/10.31549/2542-1174-2022-6-4-99-111
Abstract
Introduction. The recurrence rate of respiratory tuberculosis is one of the indicators of the effectiveness of anti-tuberculosis measures, and timely detection of patients with relapses is a priority in addressing the problem. With the introduction of big data processing technologies, in particular artificial intelligence, various classifiers are now used that take into account the full set of signs identified in patients. The decision tree algorithm is widely used in medical analytics. By analyzing patient data with such classifiers, it is possible to classify the state of health and identify the first signs of tuberculosis recurrence.
Aim. To develop and evaluate models for predicting recurrence in patients with pulmonary tuberculosis caused by a multidrug-resistant pathogen (MDR-TB), using logistic regression and a decision tree.
Materials and methods. The study included clinical and epidemiological, age, sex, social, medical and biological data of 346 patients with MDR-TB who successfully completed chemotherapy. Two observation groups were formed according to whether the disease recurred during a follow-up period of at least five years. The first group consisted of 35 patients with relapse, and the second of 311 patients with no relapse. Statistical data processing for logistic regression was performed in IBM SPSS Statistics 23.0; the decision tree classifier was built with the Scikit-learn 0.24.2 library in the Google Colaboratory interactive cloud environment, using stratified K-fold cross-validation. The prediction results were interpreted quantitatively from ROC (receiver operating characteristic) curves, with assessment of the AUC (the area under the ROC curve).
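The validation and scoring steps named above can be illustrated with a minimal sketch. It is not the authors' code and does not use Scikit-learn: it only shows, in plain Python, how stratified K-fold assignment keeps the relapse/no-relapse ratio equal across folds and how the AUC can be computed as the probability that a relapse case receives a higher score than a non-relapse case. The function names, the random seed, and the choice of k = 5 are assumptions; the abstract does not state the number of folds.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so each fold keeps the class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Group sizes from the study: 1 = relapse (35 patients), 0 = no relapse (311 patients).
labels = [1] * 35 + [0] * 311
folds = stratified_folds(labels, k=5)  # k = 5 is an assumption, not from the source
relapses_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(relapses_per_fold)
```

With 35 relapse cases and five folds, each fold holds the same number of relapse cases, so every held-out fold contains examples of the rare class.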
Results. For predicting tuberculosis recurrence, the logistic regression model had a sensitivity of 98.7 % and a specificity of 88.6 %; the decision tree classifier had a sensitivity of 74.0 % and a specificity of 97.0 %.
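For reference, sensitivity and specificity are derived from the confusion matrix as TP / (TP + FN) and TN / (TN + FP). The sketch below shows that arithmetic in plain Python. The function name and the toy labels are hypothetical and are not the study data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy example: 3 positive and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```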
Conclusion. The models developed can serve as a tool for predicting the recurrence of MDR-TB in cured patients.
About the Authors
A. S. Alliluev
Russian Federation
Alexander S. Alliluyev, Cand. Sci. (Med.), Deputy Chief Physician for Organizational and Methodological Work
634009
17, Rozy Luxemburg str.
Tomsk
O. V. Filinyuk
Russian Federation
Olga V. Filinyuk, Dr. Sci. (Med.), Professor, Head
Department of Phthisiology and Pulmonology
Tomsk
S. V. Aksenov
Russian Federation
Sergey V. Aksenov, Cand. Sci. (Tech.), Associate Professor
Department of Medical and Biological Cybernetics
Department of Information Technology, Engineering School of Information Technology and Robotics
Department of Theoretical Foundations of Informatics
Tomsk
E. E. Shnayder
Russian Federation
Ekaterina E. Schnaider, Medical Statistician
Organizational and Methodological Department
Tomsk
D. E. Filinyuk
Russian Federation
Daniil E. Filinyuk, Student
Engineering School of Information Technology and Robotics
Tomsk
Yu. A. Loginova
Russian Federation
Yulia A. Loginova, Medical Resident
Department of Phthisiology and Pulmonology
Tomsk
For citations:
Alliluev A.S., Filinyuk O.V., Aksenov S.V., Shnayder E.E., Filinyuk D.E., Loginova Yu.A. Comparative evaluation of logistic regression and decision tree in predicting multidrug-resistant tuberculosis recurrence. Journal of Siberian Medical Sciences. 2022;(4):99-111. https://doi.org/10.31549/2542-1174-2022-6-4-99-111