In Colombia, as in the rest of the world, chronic kidney disease (CKD) is one of the main causes of morbidity and mortality, and it was the first disease to be classified as a high-cost disease. Despite multiple efforts, the incidence of renal failure and the need for replacement therapy continue to increase and, worryingly, do so in ever younger people. Clinical practice and decision making are difficult because the health system is highly fragmented and primary care and prevention programs lack comprehensiveness. As a result, there is still great uncertainty as to which patients will remain stable and which will progress to renal failure. To date, a question of great clinical impact remains unanswered: what annual loss of renal function can predict the development of renal failure before it occurs?
This is a longitudinal study based on primary and secondary sources of biomarker measurements in patients with chronic kidney disease, followed for 7 years between 2013 and 2020 in Medellín and its metropolitan area, Colombia. It is embedded in the doctoral thesis project “Prediction of progression to end-stage chronic kidney disease (ESRD) in patients with stage 1 to 4 chronic kidney disease in Medellín: design and external validation of a model with a potential clinical utility”, funded by a doctoral grant from the Ministry of Sciences of Colombia.
Repeated measurements of multiple potentially predictive factors were collected, and two outcomes were modeled: progression to renal failure, and the annual change (delta) in glomerular filtration rate that increased the likelihood of developing renal failure.
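The abstract does not state how the annual change in glomerular filtration rate was computed. As a minimal sketch, assuming it is summarized as a per-patient least-squares slope over the repeated measurements, the calculation could look like the following. The function name `annual_egfr_slopes`, the tuple layout and the example values are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def annual_egfr_slopes(measurements):
    """Return the least-squares eGFR slope per patient, in mL/min/1.73 m^2 per year.

    `measurements` is an iterable of (patient_id, years_since_baseline, egfr)
    tuples. This layout is an assumption made for illustration only.
    Patients without at least two distinct time points are skipped,
    because a slope cannot be estimated for them.
    """
    by_patient = defaultdict(list)
    for patient_id, years, egfr in measurements:
        by_patient[patient_id].append((float(years), float(egfr)))

    slopes = {}
    for patient_id, points in by_patient.items():
        n = len(points)
        mean_t = sum(t for t, _ in points) / n
        mean_y = sum(y for _, y in points) / n
        # Sum of squared deviations of time; zero means a single time point.
        sxx = sum((t - mean_t) ** 2 for t, _ in points)
        if sxx == 0:
            continue
        sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
        slopes[patient_id] = sxy / sxx
    return slopes


# Made-up illustrative values, not study data.
example = [
    ("A", 0.0, 60.0), ("A", 1.0, 57.0), ("A", 2.0, 54.0),
    ("B", 0.0, 75.0), ("B", 2.0, 74.0),
    ("C", 0.0, 45.0),
]
slopes = annual_egfr_slopes(example)
```

A negative slope indicates loss of function per year. In this sketch, a patient with a single measurement (patient "C") is left out rather than assigned a slope of zero, so that missing follow-up is not mistaken for stability.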
Classical statistical models were run first; 3 machine learning models were then fitted and compared in terms of predictive ability, accuracy, and applicability in today's complex and challenging renal care practice.
During follow-up, data were collected from 1650 patients who met the eligibility criteria, with an average of 3 measurements per variable per patient, for a total of almost 198,000 repeated measurements. The renal failure rate was almost 4% and, strikingly, economic income, socioeconomic conditions, type of health system affiliation and educational level significantly increased the probability of developing renal failure. In the search for a potentially useful predictive model, the classical models (logistic regression, Poisson regression, linear regression and a joint model) failed to predict the outcomes, and no significant results were obtained. Taking advantage of the large dataset, we split it into 2 subcohorts, training and test, and ran 3 machine learning models: random forest (classification and regression), Spark logistic regression and XGBoost. The last of these was the most accurate and had the best predictive performance. The selected model was validated in an independent cohort, in which its predictive capacity was maintained, showing that it can be useful and applicable in the clinical setting, especially when the evolution of patients is highly complex.
Faced with heterogeneous, unstructured health data, unbalanced by the way care occurs in the real world, classical analytical models failed to capture the complexity of the data and were not useful for understanding or decision making as CKD advances.
Several social determinants, such as income, type of affiliation to the health system and educational level, were associated with an increased risk of developing kidney failure, showing how inequities in people's living conditions translate into poorer health outcomes.
Thanks to machine learning techniques, we were able to find patterns of progression to kidney failure that were previously almost undetectable, given the complexity and limitations of the data. This made it possible to answer the study question: a minimal loss of kidney function, almost unnoticed even by clinical experts, increases the risk of kidney failure when it is associated with certain factors, including social determinants.