Machine Learning Based Prediction of Mortality in Patients Undergoing Cardiac Resynchronization Therapy


The SEMMELWEIS-CRT (perSonalizEd assessMent of estiMatEd risk of mortaLity With machinE learnIng in patientS undergoing CRT implantation) Score was created to predict the risk of all-cause mortality in patients undergoing cardiac resynchronization therapy (CRT) implantation. It is a random forest-based risk stratification tool that provides a personalized prediction of all-cause mortality by capturing high-dimensional, non-linear interactions among a multitude of predictors.

The Score was developed from a retrospective database of 1510 patients who underwent successful CRT implantation at the Heart and Vascular Center of Semmelweis University, Budapest, Hungary. It uses 33 pre-implant clinical variables, most of which are routinely assessed during the management of heart failure and are therefore readily available from electronic medical records. Moreover, the SEMMELWEIS-CRT Score was designed to tolerate a moderate number of missing parameters. However, a high percentage of missing values may reduce the reliability of the prediction, especially among the most important features (age at CRT implantation, gender, QRS morphology, New York Heart Association functional class, left ventricular ejection fraction, height, weight, type of atrial fibrillation, glomerular filtration rate, serum creatinine, serum sodium, hemoglobin concentration).

We used the patients’ follow-up data to generate 6 classes of possible outcomes: death during the first (class 1), second (class 2), third (class 3), fourth (class 4), or fifth (class 5) year after CRT implantation, and no death during the first 5 years following the implantation (class 6). The task of the evaluated machine learning algorithms was to predict each patient's probability distribution over these classes (i.e., the class membership probabilities) based on the pre-implant clinical features.

We evaluated numerous classifiers (logistic regression, ridge regression, support vector machines, k-nearest neighbors classifier, gradient boosting classifier, random forest, conditional inference random forest and multi-layer perceptron) using stratified 10-fold cross-validation over a wide hyperparameter space. The random forest classifier performed best among the evaluated algorithms and is therefore used in the SEMMELWEIS-CRT Score.

The output of the random forest model is a series of 6 values representing the previously defined class membership probabilities. The cumulative probability of death at a given year of follow-up is calculated by summing these values up to that year. The computed cumulative probabilities are then calibrated using Platt scaling, and the survival curve can be plotted for each patient (displayed in the ‘Results’ section of the WebCalculator). Expected survival is also calculated for each patient from the annual values of the calibrated cumulative probabilities according to the following formula:

Expected Survival = `\sum_{i=0}^5 (P_{i+1} - P_i) \times i, P_0=0, P_6=1`

where `P_i` is the calibrated cumulative probability of all-cause mortality at year i.
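The following minimal Python sketch illustrates the two arithmetic steps described above: summing the class membership probabilities into cumulative mortality, and evaluating the expected survival formula. The function names and the example probabilities are hypothetical and chosen only for illustration; they are not taken from the WebCalculator. The calibration step is omitted here, so the example treats the raw cumulative values as if they were already calibrated.

```python
from itertools import accumulate


def cumulative_mortality(class_probs):
    """Return cumulative mortality for years 1..5 from 6 class probabilities.

    class_probs[0..4] are the probabilities of death in years 1..5;
    class_probs[5] is the probability of no death within 5 years.
    """
    if len(class_probs) != 6:
        raise ValueError("expected 6 class membership probabilities")
    return list(accumulate(class_probs[:5]))


def expected_survival(cumulative):
    """Evaluate sum_{i=0}^{5} (P_{i+1} - P_i) * i with P_0 = 0 and P_6 = 1.

    `cumulative` holds P_1..P_5 (assumed to be calibrated already).
    """
    if len(cumulative) != 5:
        raise ValueError("expected 5 annual cumulative probabilities")
    p = [0.0] + list(cumulative) + [1.0]  # P_0 .. P_6
    return sum((p[i + 1] - p[i]) * i for i in range(6))


# Hypothetical model output for one patient (illustrative values only).
example_probs = [0.10, 0.10, 0.10, 0.10, 0.10, 0.50]
example_cumulative = cumulative_mortality(example_probs)
example_years = expected_survival(example_cumulative)
```

Because `P_0 = 0` and `P_6 = 1`, the sum telescopes to `5 - (P_1 + P_2 + P_3 + P_4 + P_5)`, so the result is bounded by the 5-year follow-up window. For the illustrative values above this gives 5 - 1.5 = 3.5 years.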

With an average area under the receiver-operating characteristic curve greater than 0.700 (average Brier score <0.20), the SEMMELWEIS-CRT Score effectively predicted all-cause mortality in our training database during the 10-fold cross-validation. To determine whether the model preserves its accuracy on new data, we tested it on an independent cohort of 158 CRT patients, in which the Score exhibited discriminative capabilities similar to those observed during the cross-validation procedure.
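For readers unfamiliar with the two metrics quoted above, the sketch below shows how the Brier score and the area under the receiver-operating characteristic curve can be computed for a single binary outcome (for example, death by a given year of follow-up). The function names and the toy data are hypothetical, and the text does not state how the reported averages were formed, so this illustrates the metric definitions only, not the study's evaluation pipeline.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive case is ranked above a random negative case.

    Ties count as half a win, which matches the usual definition of the AUC.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos
        for b in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): 1 = died by the given year, 0 = survived.
toy_outcomes = [1, 0, 1, 0, 0]
toy_predictions = [0.8, 0.3, 0.4, 0.5, 0.1]
toy_brier = brier_score(toy_outcomes, toy_predictions)
toy_auc = roc_auc(toy_outcomes, toy_predictions)
```

A lower Brier score indicates better-calibrated and more accurate probabilities, while an AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking. On the toy data above, the Brier score is 0.15 and the AUC is 5/6.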

The observed high efficacy of our random forest model suggests that machine learning should be integrated into the individual risk assessment of patients undergoing CRT implantation. We foresee that machine learning-based prognostic risk scores will become increasingly relevant in the near future, and that structured, dense databases in combination with state-of-the-art analytic approaches will pave the way to precision cardiovascular medicine. To achieve this goal, we are looking for clinical and industry partnerships aimed at external validation, further improvement in accuracy, and international utilization of the SEMMELWEIS-CRT Score.

For further details and information about the development of the SEMMELWEIS-CRT Score, do not hesitate to contact us.

The SEMMELWEIS-CRT WebCalculator supports the latest versions of the Chrome, Firefox, and Safari browsers on computers and mobile devices.