Artificial Intelligence to Overcome Natural Stupidity

#NephJC Chat

Tuesday Sept 24, 2019 at 9 pm Eastern Daylight Time

Wednesday Sept 25, 2019 at 9 pm Indian Standard Time

Wednesday Sept 25, 2019 at 9 pm British Summer Time

 

Nature. 2019 Aug;572(7767):116-119. doi: 10.1038/s41586-019-1390-1. Epub 2019 Jul 31.

A clinically applicable approach to continuous prediction of future acute kidney injury.

Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, Mottram A, Meyer C, Ravuri S, Protsyuk I, Connell A, Hughes CO, Karthikesalingam A, Cornebise J, Montgomery H, Rees G, Laing C, Baker CR, Peterson K, Reeves R, Hassabis D, King D, Suleyman M, Back T, Nielson C, Ledsam JR, Mohamed S.

PMID: 31367026

A post by Frank Harrell on Data Methods

Deep Mind Blog post

Editorial by Eric Topol

Editorial in Nature Reviews Nephrology by John Kellum and Azra Bihorac

Michael Joyner on the promise/hype of AI in Stat News

Introduction

Acute kidney injury (AKI) affects one in five inpatient admissions in the United States. Even a modest change in serum creatinine has been associated with increased cost, length of stay and mortality, so it is imperative to combat the development of AKI in patients. In certain circumstances, AKI can be prevented with early treatment; however, identifying which patients are at high risk can pose a challenge for practitioners. The current diagnostic pathway for detecting AKI depends on a rise in serum creatinine level, but that rise lags behind renal injury. Biomarkers are in the works (see our discussion of Nephrocheck) but are not universally used. Recent work using sequential AKI predictive models has not been clinically applicable and has focused on predictions across a short time horizon. This study used an artificial intelligence (AI) method known as deep learning to assess the likelihood that patients would develop AKI within the next 48 hours. Diagnosing AKI before it even occurs. Like precrime. How would this work?

DeepMind, the brain behind this, was founded in London in 2010, and was taken over by Google in 2014. One of their famous success stories has been the development of AlphaGo, an AI that defeated professional Go players, including the top ranked human, memorialized in this Netflix docudrama. DeepMind Health has been working with the UK National Health Service for a while, with a ‘Streams’ app and other ventures. In 2017, there was an outcry about data handling and privacy, and perhaps for that reason, the company went to a different source in the current project.

The Study

Methods

Study Design

  • A multi-site, deidentified retrospective dataset

  • US Department of Veterans Affairs Healthcare system across 1,243 health facilities including inpatient and outpatient sites

 Inclusion criteria

  • Age 18-90

  • Admitted for secondary care to medical or surgical services from the beginning of October 2011 to the end of September 2015

  • At least one year of health record data before admission

 Exclusion criteria

  • Patients with conditions that were considered sensitive: HIV/AIDS, sexually transmitted diseases, substance abuse, or admission to mental health services

  • Four sites excluded because they had fewer than 250 admissions during a 5-year time period

 Ground-truth label (reference) for Acute Kidney Injury

  • Based on the three KDIGO accepted AKI definitions

    • an increase in serum creatinine of ≥0.3 mg/dl (≥26.5 μmol/l) within 48 h

    • an increase in serum creatinine to ≥1.5× the baseline creatinine level of a patient, known or presumed to have occurred within the previous 7 days

    • a urine output of <0.5 ml/kg/h over 6 h. As it happens, this information did not make it into the EHRs so this criterion wasn’t really used.
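The two creatinine-based criteria above can be sketched in a few lines of Python. This is only an illustration, not the authors' labelling code: the function name, the input format and the way the baseline is supplied are all assumptions, and the "known or presumed within 7 days" part of the second criterion is simplified to a direct comparison with a caller-supplied baseline.

```python
from datetime import datetime, timedelta

def kdigo_creatinine_aki(measurements, baseline_mg_dl):
    """Check the latest creatinine against the two creatinine-based criteria.

    measurements: list of (timestamp, serum creatinine in mg/dl), sorted by time.
    baseline_mg_dl: baseline creatinine; how it is derived is left to the caller
    (an assumption of this sketch).
    """
    latest_time, latest_value = measurements[-1]

    # Criterion 1: a rise of at least 0.3 mg/dl within the previous 48 hours.
    window_start = latest_time - timedelta(hours=48)
    recent = [v for t, v in measurements[:-1] if t >= window_start]
    # A tiny tolerance avoids floating-point surprises at exactly 0.3.
    if recent and latest_value - min(recent) >= 0.3 - 1e-9:
        return True

    # Criterion 2 (simplified): the latest value is at least 1.5x baseline.
    return latest_value >= 1.5 * baseline_mg_dl

# Hypothetical patient: creatinine rises from 1.0 to 1.4 mg/dl in 24 hours.
t0 = datetime(2019, 1, 1, 8, 0)
print(kdigo_creatinine_aki([(t0, 1.0), (t0 + timedelta(hours=24), 1.4)], 1.0))
```

The urine output criterion is left out here for the same reason it was left out of the study: the data were not available.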

Evaluation of dataset

A total of 703,782 adult patients were eligible. The dataset was randomly divided into 4 sets: training (80%), validation (5%), calibration (5%), and testing sets (10%). The investigators were blinded to the group assignments. But this training and validation differs from the usual validation of a prediction score that one might have in mind. The validation set was used to iteratively improve the models by selecting the best model architectures and hyperparameters. Then the models selected on the validation set were recalibrated on the calibration set in order to further improve the quality of the risk predictions.
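To make the four-way split concrete, here is a minimal sketch of a random 80/5/5/10 partition. It assumes the split is made at the level of patient identifiers and uses a hypothetical `split_patients` helper; the paper's actual procedure may differ in its details.

```python
import random

def split_patients(patient_ids, seed=0):
    """Randomly partition patient IDs into the four sets described above."""
    fractions = [("training", 0.80), ("validation", 0.05), ("calibration", 0.05)]
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # fixed seed so the split is reproducible

    splits, start = {}, 0
    for name, fraction in fractions:
        end = start + round(len(ids) * fraction)
        splits[name] = ids[start:end]
        start = end
    splits["testing"] = ids[start:]  # whatever remains, roughly 10%
    return splits

# Toy example with 1,000 made-up patient IDs instead of 703,782.
splits = split_patients(range(1000), seed=1)
print({name: len(members) for name, members in splits.items()})
```

The key point is that the testing set is never touched while architectures and hyperparameters are chosen (validation set) or while the risk scores are recalibrated (calibration set).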

Models for predicting AKI

The authors used a specific deep learning technique known as a recurrent neural network, which is suited to sequential data obtained over time. The electronic health record data are processed one step at a time, building an internal memory that keeps track of relevant information seen up to that point. When the predicted probability exceeds a specified operating-point threshold, the prediction is considered positive. The methods describe the steps as

  • embedding module (containing easy to understand language like ‘The embedding layers transform the high-dimensional and sparse input features into a lower-dimensional continuous representation that makes subsequent prediction easier. We use a deep multilayer perceptron with residual connections and rectified-linear activations. We use L1 regularization on the embedding parameters to prevent overfitting and to ensure that our model focuses on the most-salient features.’)

  • recurrent neural network core

  • prediction target and training objectives

  • training and hyperparameters
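The idea of "processing one step at a time with an internal memory, then thresholding" can be shown with a toy example. This is emphatically not the model in the paper: it is a single-number recurrent cell with hand-picked (not learned) weights, no embedding module, and a made-up input sequence, meant only to show how the memory carries information forward and how the operating-point threshold turns a probability into an alert.

```python
import math

def run_recurrent_alerts(inputs, w_in, w_rec, w_out, threshold):
    """Toy scalar recurrent cell: returns (probability, alert) per time step."""
    hidden = 0.0  # internal memory, empty before the first time step
    results = []
    for x in inputs:
        # New memory depends on the current input and on the previous memory.
        hidden = math.tanh(w_in * x + w_rec * hidden)
        # Squash the memory into a probability between 0 and 1.
        probability = 1.0 / (1.0 + math.exp(-w_out * hidden))
        # A prediction is positive when it reaches the operating-point threshold.
        results.append((probability, probability >= threshold))
    return results

# Hypothetical sequence: nothing abnormal, one abnormal signal, then normal again.
for probability, alert in run_recurrent_alerts(
    [0.0, 2.0, 0.0], w_in=1.0, w_rec=1.0, w_out=2.0, threshold=0.6
):
    print(f"{probability:.2f}", alert)
```

Note that the alert persists at the third step even though the input has returned to zero, because the memory still holds the earlier abnormal signal.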

Extended Data Figure 2 from Tomasev et al, Nature 2019


What kind of data was entered into the model, and who decided that? Dive into the supplementary data (E, F). They did a systematic review of existing models, used clinician input, and added other data to end up with 315 base features of demographics, admission information, vital sign measurements, select laboratory tests and medications, and diagnoses of chronic conditions that are directly associated with an increased risk of AKI, see below.

Extended Data Figure 1 from Tomasev et al, Nature 2019


Results

The characteristics of the training set show 6.9% female, 18.9% black, and 10% with diabetes. The incidence of KDIGO AKI was 13.4% of admissions.

Extended Data Table 6 from Tomasev et al, Nature 2019


Prediction of AKI:

At the operating point with 33% precision, the model predicted 55.8% of inpatient AKI events up to 48 hours in advance, at a ratio of 2 false predictions for every true positive.
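Where does the 2-to-1 ratio come from? Precision is the share of positive predictions that are correct, so the number of false alerts per true alert is (1 − precision) / precision. A quick check, using a hypothetical helper name:

```python
def false_alerts_per_true_alert(precision):
    """False positives per true positive implied by a given precision."""
    return (1.0 - precision) / precision

ratio = false_alerts_per_true_alert(0.33)
print(round(ratio, 2))  # 2.03, i.e. about 2 false alerts for every true one
```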

Extended Data Table 4a from Tomasev et al, Nature 2019


The model performed better at predicting severe AKI requiring dialysis, providing correct predictions of 84.3% of episodes in which dialysis was required within 30 days of onset of AKI, and 90.2% of episodes in which dialysis was scheduled within 90 days of onset of AKI.

ROC curves and Precision Recall Curves

ROC Curves summarize the trade-off between the true positive rate and false positive rate for a predictive model using different probability thresholds.

Precision-Recall curves summarize the trade-off between the true positive rate and the positive predictive value for a predictive model using different probability thresholds.
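As a small illustration of the quantities behind both curves, the sketch below computes the true positive rate, false positive rate and positive predictive value at a single threshold. The labels and scores are invented for the example, and `rates_at_threshold` is a hypothetical helper; sweeping the threshold from 0 to 1 and plotting the resulting points gives the ROC curve (true positive rate against false positive rate) and the precision-recall curve (positive predictive value against true positive rate).

```python
def rates_at_threshold(labels, scores, threshold):
    """Return TPR (recall), FPR and PPV (precision) at one probability threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    return {
        "tpr": tp / (tp + fn) if tp + fn else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
        # With no positive predictions, precision is undefined; 1.0 by convention here.
        "ppv": tp / (tp + fp) if tp + fp else 1.0,
    }

# Made-up data: 1 = AKI occurred, 0 = no AKI; scores are predicted probabilities.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.1]
print(rates_at_threshold(labels, scores, threshold=0.5))
```

When the outcome is uncommon, as AKI is here, the false positive rate can look reassuringly low while the positive predictive value is still modest, which is why the precision-recall view is the more sobering of the two.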

Figure 2 from Tomasev et al, Nature 2019


Analysis of false positive prediction

Of all the false-positive alerts, 25% were positive predictions made even earlier than the 48-hour window in patients who subsequently developed AKI. Among these earlier alerts, 57% occurred in patients with pre-existing chronic kidney disease. The remaining false-positive alerts comprised 24% that occurred only after the AKI event and up to 51% that occurred among patients who did not experience AKI.

Extended Data Figure 4 from Tomasev et al, Nature 2019


Figure 3 from Tomasev et al, Nature 2019


Subgroup Analysis

Supplementary data from Tomasev et al, Nature 2019


The supplementary data contains some interesting individual examples of early detection (successful) and some fails (AKI predicted which did not occur). One also gets a sense of what the probabilities of AKI (see y-axis) actually mean. A couple of examples below (more in the supplement).


Discussion

The risk prediction model can identify 55.8% of AKI episodes 48 hours in advance, and it performs better still for severe AKI, predicting up to 90% of episodes that require dialysis. However, it is important to note that the false positive rate is high. There are several limitations to this study.

External Validity

  • First, the demographic in the Veterans Affairs healthcare system, where females comprised only 6.38% of the patients in this dataset, is not representative of the global population. Thus the proposed system would still need to be validated on additional, more representative datasets.

  • Additionally, it might be more accurate in hospital settings and for more severe forms of AKI, such as those requiring dialysis

Internal Validity

  • AI models based on retrospective data have often not worked in prospective studies, so this should be studied prospectively

  • Many false positives particularly for patients with CKD

Does it Matter?

How will you use this information? Diagnosis is only the first step. Do we have an early intervention in AKI that works? An intervention that reduces the risk of the impending AKI, or changes its course to reduce the morbidity? Perhaps even decreases the risk of dialysis or mortality - aka outcomes that count? That is the real test. Moreover, from a different AKI alert trial, we have seen that many times these alerts did not result in any change in management. And any intervention that resulted from these warnings would have to address the high false positive rate. Giving IV fluids to someone who was not going to get AKI is not innocuous.

Despite its limitations, this study is unique in that it predicts the development of AKI more accurately than other models and is focused on prevention rather than on the downstream outcomes of injury, as done previously. With early notification and early clinical intervention, deep learning is a potentially promising way of alerting clinicians to organ compromise.

Summary by Jia Hwei Ng

Nephrologist, New York

NSMC Intern, Class of 2019