Out-of-hospital cardiac arrest (OHCA) remains a major health problem and, despite advances in resuscitation care, long-term prognosis remains poor, with the global survival rate estimated at 10%.1
Many scoring systems have been developed to predict outcome in individuals with OHCA, in order to guide intervention and resource management, as well as to guide prognosis for healthcare teams and family and friends. Scoring systems typically incorporate multiple factors independently associated with OHCA outcomes, including cardiac arrest characteristics such as whether the arrest was witnessed, whether there was bystander cardiopulmonary resuscitation (CPR) and whether the initial rhythm was shockable.2 However, scores incorporate different variables and measure different outcomes, and there is no guide for clinicians to summarise the scores and enable tailoring of these to the relevant cohort.
The aim of this review was to identify, review and compare risk scores to predict outcomes in patients with OHCA.
We performed a MEDLINE database literature search from inception (January 1996) to December 2021, using a combination of the following keywords and medical subject heading terms: ‘out-of-hospital cardiac arrest’, ‘cardiac arrest’, ‘sudden cardiac death’, ‘score’ and ‘scoring system’.
Authors (RN, IM) independently analysed abstracts from the search results. Articles were assessed using a full text review, and duplicate results were excluded. Disagreements were resolved through consensus between authors.
We included studies that met the following eligibility criteria: (i) inclusion of a scoring system predicting outcome(s) after OHCA, (ii) description of a new score or validation of a previous score and (iii) inclusion of a clinically relevant outcome of interest.
We identified 137 studies of which 57 were duplicate entries and 60 were excluded due to irrelevance (Figure 1). A total of 17 scoring systems and five validation or comparative studies were included (Supplementary Table 1). Of these, one predicted the probability of return of spontaneous circulation (ROSC), six predicted in-hospital mortality, and 10 predicted neurological outcome at or after discharge (Figure 2).3–17
Scores to Predict Return of Spontaneous Circulation
The ROSC After Cardiac Arrest (RACA) score was developed from a multicentre retrospective cohort of 5,471 individuals with OHCA.3 Unlike other scores in this review, it aims to predict the probability of ROSC based on the following variables: sex, age ≥80 years, aetiology, witness of arrest, location of arrest, initial ECG rhythm, bystander CPR and emergency services arrival time. Internal validation in an independent retrospective cohort of 2,218 patients confirmed that the predicted ROSC rate (43.7%) reflected the observed ROSC rate (43.8%), with a c-statistic of 0.731 (95% CI [0.710–0.751]). External validation in a retrospective cohort of 2,041 patients produced a c-statistic of 0.76 (95% CI [0.74–0.78]).18 The RACA score is the only score externally validated to predict ROSC. Its main disadvantage is the large number of variables, which makes it complicated to calculate and apply in a scenario in which ROSC probability may guide immediate management of an arrest.
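The c-statistic quoted for this and the following scores is the probability that a randomly chosen patient who had the outcome received a higher score than a randomly chosen patient who did not. The sketch below illustrates that definition by direct pairwise comparison. The function name and the toy data are hypothetical and are not taken from any of the reviewed studies.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Concordance (c-statistic) by pairwise comparison; ties count as 0.5.

    scores   -- risk score per patient
    outcomes -- 1 if the patient had the outcome of interest, else 0
    """
    events = [s for s, o in zip(scores, outcomes) if o == 1]
    non_events = [s for s, o in zip(scores, outcomes) if o == 0]
    if not events or not non_events:
        raise ValueError("need at least one patient with and one without the outcome")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0   # patient with the outcome scored higher
        elif e == n:
            concordant += 0.5   # tie contributes half a concordant pair
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data, for illustration only.
toy_scores = [2, 5, 3, 6, 3]
toy_outcomes = [0, 1, 0, 1, 1]
toy_c = c_statistic(toy_scores, toy_outcomes)
```

A value of 0.5 corresponds to chance discrimination and 1.0 to perfect separation of patients with and without the outcome.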
Scores to Predict Survival to Hospital Discharge
The NULL-PLEASE score is named after the variables used in the score: non-shockable rhythm, unwitnessed arrest, long no-flow period (no bystander CPR prior to arrival of emergency medical services), long low-flow period (defined as >30 minutes of CPR before ROSC), pH <7.2, lactate >7 mmol/l, end-stage renal failure on dialysis, age >85 years, still ongoing CPR upon hospital arrival and extra-cardiac cause.5 Non-shockable rhythm, unwitnessed arrest, and long no-flow and low-flow times were each assigned a weight of 2, and all other variables a weight of 1, to give a maximum total score of 14. The score was used to predict in-hospital mortality: an increasing score from 0 to 6 predicted increasing risk of non-survival to hospital discharge, and patients with a score of 6 or more were predicted to have a 100% risk of in-hospital death. Internal validation in the same retrospective cohort of 56 patients admitted to the intensive care unit (ICU) following OHCA that was used to develop the score showed that the score independently predicted mortality with an OR of 1.68 (95% CI [1.03–2.72]). The score has been externally validated in a UK multicentre cohort of 700 patients (300 retrospective and 400 prospective), demonstrating a c-statistic of 0.874 (95% CI [0.848–0.899]).19 It has been further externally validated in a recent retrospective cohort of 189 patients, with a c-statistic of 0.874 (95% CI [0.807–0.942]) for prediction of in-hospital mortality.20 A key disadvantage of the score is the reliance on pH and lactate, which may not always be available, and the use of other variables such as no-flow and low-flow times, which are often inaccurately recorded. A modified NULL-PLEASE score was subsequently developed, using the same variables but excluding pH and lactate due to lack of data in many patients.6 Internal validation in a new retrospective cohort of 547 patients had a c-statistic of 0.658 (95% CI [0.613–0.704]; p<0.001) to predict survival from arrest to hospital admission.
For this cohort, patients with a modified NULL-PLEASE score of ≥5 had 3.3-fold greater odds of a fatal outcome than those with a score of 0–4 (OR 3.34; 95% CI [2.29–4.89]).
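The weighting described above can be sketched as a short calculation. This is a minimal illustration that assumes each variable has already been judged present or absent by the clinician. The class and field names are hypothetical, and the code is not a validated clinical tool.

```python
from dataclasses import dataclass, fields


@dataclass
class NullPleaseInputs:
    # Two-point variables (N, U, L, L)
    non_shockable_rhythm: bool = False
    unwitnessed_arrest: bool = False
    long_no_flow: bool = False        # no bystander CPR before EMS arrival
    long_low_flow: bool = False       # >30 minutes of CPR before ROSC
    # One-point variables (P, L, E, A, S, E)
    ph_below_7_2: bool = False
    lactate_above_7: bool = False
    end_stage_renal_failure: bool = False
    age_over_85: bool = False
    still_ongoing_cpr: bool = False   # CPR ongoing on hospital arrival
    extra_cardiac_cause: bool = False


TWO_POINT_FIELDS = {
    "non_shockable_rhythm",
    "unwitnessed_arrest",
    "long_no_flow",
    "long_low_flow",
}


def null_please_score(inputs: NullPleaseInputs) -> int:
    """Sum the weighted variables; the maximum is 4 * 2 + 6 * 1 = 14."""
    total = 0
    for f in fields(inputs):
        if getattr(inputs, f.name):
            total += 2 if f.name in TWO_POINT_FIELDS else 1
    return total


# Example: unwitnessed, non-shockable arrest with pH <7.2 scores 2 + 2 + 1 = 5.
example = NullPleaseInputs(
    non_shockable_rhythm=True, unwitnessed_arrest=True, ph_below_7_2=True
)
example_score = null_please_score(example)
```

The modified score described above could be sketched in the same way by leaving the pH and lactate fields unset.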
Pittsburgh Cardiac Arrest Category Score
The Pittsburgh Cardiac Arrest Category (PCAC) score was developed to predict outcomes for patients with in-hospital cardiac arrest (IHCA) and OHCA, from a retrospective cohort of 457 patients.7 The score was developed using variables from two existing scores, the Sequential Organ Failure Assessment (SOFA) and the Full Outline of UnResponsiveness (FOUR) scores.21,22 Variables from the SOFA score include blood pressure and the partial pressure of arterial oxygen/fraction of inspired oxygen ratio, and those from FOUR include neurological motor and brainstem responses. The PCAC score was used to predict in-hospital mortality, development of multiorgan failure and ‘good outcome’, defined as hospital discharge. The score divides patients into four categories: awake (category I), moderate coma without cardiorespiratory failure (category II), moderate coma with cardiorespiratory failure (category III) and severe coma (category IV). Category I patients are predicted to have an 80% chance of survival to hospital discharge, and this probability is 60%, 40% and 10% for categories II, III and IV, respectively. External validation of the score in a retrospective cohort of 607 patients showed significant predictive value for in-hospital mortality with a c-statistic of 0.82 and adjusted OR of 0.31 (95% CI [0.22–0.44]).23 This score is easy to calculate, using data readily available following ROSC. A disadvantage is the derivation from a heterogeneous patient cohort including IHCA and OHCA patients.
The CREST (coronary artery disease, initial heart rhythm, low ejection fraction, shock at the time of admission, and ischaemic time >25 minutes) score was developed from a retrospective cohort of 638 patients admitted to ICU following OHCA attributed to non-ST-segment elevation MI (NSTEMI).4 The score was used to predict ‘circulatory aetiology death’ (CED), defined as death from repeat arrest, progressive refractory shock, refractory arrhythmia, lactic acidosis and multiorgan failure. Variables associated with CED included non-shockable rhythm, ischaemic time (low-flow) >25 minutes, known coronary disease, left ventricular ejection fraction <30% and shock on admission. Each was equally weighted to give a maximum score of 5. Validation in an independent retrospective cohort of 318 patients showed good correlation between predicted versus observed probability of CED for all scores between 0 and 5 at 7.1/10.2%, 9.5/11%, 22.5/19.6%, 32.4/29.6%, 38.5/30% and 55.7/50%, respectively, with a c-statistic of 0.73 in the development and 0.68 in the validation cohorts. Application of this score to predict mortality is limited to those with NSTEMI who have survived to reach the ICU, who by definition are a small, selected cohort.
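The equal weighting and the predicted versus observed figures quoted above can be laid out as a small lookup. This is a sketch only: the function and variable names are hypothetical, and each variable is assumed to have been classified as present or absent beforehand.

```python
# Predicted and observed probability of CED (%) for CREST scores 0-5,
# taken from the predicted/observed pairs quoted in the text.
CREST_CED_PERCENT = {
    0: (7.1, 10.2),
    1: (9.5, 11.0),
    2: (22.5, 19.6),
    3: (32.4, 29.6),
    4: (38.5, 30.0),
    5: (55.7, 50.0),
}


def crest_score(coronary_artery_disease, non_shockable_rhythm,
                low_ejection_fraction, shock_on_admission,
                ischaemic_time_over_25_min):
    """One point per variable present, giving a score from 0 to 5."""
    return sum(bool(v) for v in (
        coronary_artery_disease,
        non_shockable_rhythm,
        low_ejection_fraction,
        shock_on_admission,
        ischaemic_time_over_25_min,
    ))


def largest_calibration_gap(table=CREST_CED_PERCENT):
    """Return (score, gap) where |predicted - observed| is largest."""
    score = max(table, key=lambda s: abs(table[s][0] - table[s][1]))
    predicted, observed = table[score]
    return score, abs(predicted - observed)


gap_score, gap_points = largest_calibration_gap()
```

On the quoted figures, the largest absolute difference between predicted and observed probability is at a score of 4 (38.5% versus 30%).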
The PEA score was developed from a cohort of patients with OHCA attributed to NSTEMI, to predict in-hospital mortality.8 Each of the following three variables is assigned equal weighting: pulseless (non-ventricular fibrillation) arrest, elderly (age >85 years) and acidosis (pH range not specified). The score had a c-statistic of 0.61 (95% CI [0.60–0.62]; p<0.001). Only an abstract was available for this study, and no external validation has been performed.
Glasgow Coma Scale
The Glasgow Coma Scale (GCS), a widely applied and recognised scoring system in clinical practice, was validated by Nadolny et al. in a prospective cohort of 218 OHCA patients.9 Although not originally designed for prediction of OHCA outcome, the authors carried out multivariate logistic regression of the GCS with traditional variables of eye, verbal and motor response, to assess correlation with in-hospital mortality. In the validation cohort, GCS predicted in-hospital mortality with a c-statistic of 0.735 (95% CI [0.655–0.816]) and OR of 6.4 (95% CI [2.0–20.3]). Advantages of this score include its familiarity to clinicians and therefore its accuracy and applicability in clinical use to predict OHCA outcome. Disadvantages include a lack of external validation in larger cohorts.
Scores to Predict Neurological Outcome
Cardiac Arrest Hospital Prognosis Score
The Cardiac Arrest Hospital Prognosis (CAHP) score was developed in a prospective cohort of 819 patients admitted to ICU following OHCA, to predict poor neurological outcome, defined as cerebral performance category (CPC) 3–5 at hospital discharge.11 Variables associated with poor outcome were non-shockable rhythm, arterial pH, age, arrest setting, no-flow time, low-flow time and dose of adrenaline given during the arrest. Points for each variable were added together to give a score out of 350. Based on the score, patients were divided into three groups: low (<150), medium (150–200) and high (>200) risk score, which were associated with a 29.6%, 86.3% and 99% chance of CPC 3–5 at discharge, respectively. Similar figures of 33.3%, 80.5% and 98%, respectively, were observed in the validation cohort, which comprised the development cohort together with an additional new retrospective cohort of 367 patients. External prospective validation in 412 OHCA patients, in which no-flow time was omitted due to inaccurate records, showed a c-statistic of 0.82 (95% CI [0.77–0.86]; p=0.19) for favourable neurological outcome.17 This is the only score that has been prospectively externally validated for predicting neurological outcome, but variables such as no-flow time and low-flow time are often inaccurately recorded retrospectively.
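The three CAHP risk groups can be expressed as a simple banding function. This is a sketch under stated assumptions: it takes an already calculated CAHP total as input (the per-variable point allocation is not given in this review), and the handling of totals of exactly 150 or 200, which are placed in the medium band here, is an assumption. The function name is hypothetical.

```python
# Reported chance (%) of CPC 3-5 at discharge per risk group,
# in the development and validation cohorts respectively.
CAHP_POOR_OUTCOME_PERCENT = {
    "low": (29.6, 33.3),
    "medium": (86.3, 80.5),
    "high": (99.0, 98.0),
}


def cahp_risk_group(total_score: float) -> str:
    """Band a pre-calculated CAHP total into low, medium or high risk.

    Assumption: totals of exactly 150 or 200 fall in the medium band.
    """
    if total_score < 0 or total_score > 350:
        raise ValueError("CAHP total is expected to lie between 0 and 350")
    if total_score < 150:
        return "low"
    if total_score <= 200:
        return "medium"
    return "high"


group = cahp_risk_group(175)
development_pct, validation_pct = CAHP_POOR_OUTCOME_PERCENT[group]
```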
The C-GRApH score was developed from a retrospective cohort of 122 OHCA patients admitted to the ICU and treated with therapeutic hypothermia, to predict CPC at hospital discharge.24 The following variables were assigned equal weighting to give a total maximum score of 5: known coronary disease, glucose ≥11.1 mmol/l on admission, non-shockable rhythm, age >45 years and arterial pH ≤7.0. A score of 0–1, 2–3 and 4–5 was associated with a 70%, 22% and 0% chance of favourable neurological outcome, respectively, with an overall c-statistic of 0.82 (95% CI [0.74–0.90]; p<0.001). The positive predictive value of a low score (0–1) was 70% for a favourable neurological outcome, and for a high score (4–5) it was 100% for poor neurological outcome. Internal validation in an independent retrospective cohort of 344 OHCA patients admitted to the ICU and treated with therapeutic hypothermia, showed favourable neurological outcome in 70%, 19% and 2% of patients with scores in the range 0–1, 2–3 and 4–5, respectively, with a c-statistic of 0.81 (95% CI [0.76–0.87]; p<0.001). The positive predictive value of a low score (0–1) was 70% for a favourable neurological outcome, and for a high score (4–5) it was 98% for a poor neurological outcome. The simplicity of using only five variables makes this score quick and easy to calculate, but it has been used only in those who have already survived to reach the ICU, thereby limiting its use in other patients.
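Because C-GRApH uses five equally weighted variables with explicit cut-offs, it translates directly into a few comparisons. The sketch below uses the thresholds quoted above and the favourable-outcome proportions reported for the validation cohort. The function and parameter names are hypothetical, and the code is illustrative rather than a clinical tool.

```python
# Proportion (%) of patients with favourable neurological outcome by score
# band, as reported for the independent validation cohort.
CGRAPH_FAVOURABLE_PERCENT = {"0-1": 70, "2-3": 19, "4-5": 2}


def cgraph_score(known_coronary_disease: bool, glucose_mmol_l: float,
                 non_shockable_rhythm: bool, age_years: float,
                 arterial_ph: float) -> int:
    """One point per criterion met, giving a score from 0 to 5."""
    return sum((
        bool(known_coronary_disease),
        glucose_mmol_l >= 11.1,   # admission glucose
        bool(non_shockable_rhythm),
        age_years > 45,
        arterial_ph <= 7.0,
    ))


def cgraph_band(score: int) -> str:
    """Map a score to the bands used when reporting outcomes."""
    if score <= 1:
        return "0-1"
    if score <= 3:
        return "2-3"
    return "4-5"


score = cgraph_score(known_coronary_disease=False, glucose_mmol_l=12.4,
                     non_shockable_rhythm=True, age_years=62, arterial_ph=7.21)
band = cgraph_band(score)
favourable_pct = CGRAPH_FAVOURABLE_PERCENT[band]
```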
The OHCA score was developed from a very small retrospective cohort of 96 patients with OHCA, to predict CPC at hospital discharge.13 The following variables associated with a favourable CPC score were each equally weighted to give a maximum score of 3: initial shockable arrest rhythm, ROSC ≤20 minutes and a brainstem reflex score ≥3 within 24 hours. Internal validation was performed in the same development cohort of 96 patients. Patients with a score of 1, 2 or 3 had a 12%, 64% and 86% likelihood of favourable neurological outcome, respectively, with a c-statistic of 0.84 (95% CI [0.75–0.93]) in the development cohort and 0.92 (95% CI [0.87–0.98]) in the validation cohort. In both cohorts, sensitivity for predicting good neurological outcome with a score ≥2 was 79%, and for a score ≥1 it was 100%. This score is simple to calculate within 24 hours to predict long-term neurological outcome, but it has been assessed only in small cohorts, limiting the conclusions that can be drawn about usefulness.
The SALTED (shockable rhythm, age, lactate, time elapsed until ROSC and diabetes) score was developed in a retrospective cohort of 153 patients with OHCA treated with therapeutic hypothermia, to determine mortality or a CPC score of 3–5 at 6 months.12 The score had a sensitivity of 79.6% and a specificity of 84.6% for 6-month mortality in the development cohort. On internal validation in an independent retrospective cohort of 91 patients, the score had a c-statistic of 0.82 (95% CI [0.73–0.91]), sensitivity of 73.5% and specificity of 78.6% in predicting poor neurological outcome at 6 months but was not useful for predicting in-hospital mortality. An advantage of this score is its availability as a smartphone application, but it has been assessed only in small cohorts, and only in those who have survived to reach the ICU.
The Cardiac Arrest Survival Score (CRASS) was developed from a retrospective cohort of 7,985 patients and uses 12 variables to predict hospital discharge with good neurological outcome (CPC category 1 or 2, or a modified Rankin scale score of 0, 1 or 2).14 These are age, rhythm, aetiology, support, adrenaline dose, pre-emergency disease status, location of arrest, amiodarone use, blood pressure on admission, witnessed arrest, duration of CPR and down-time. The model predicted good neurological outcome with a c-statistic of 0.88 (95% CI [0.87–0.89]). Internal validation in an independent retrospective cohort of 1,806 patients showed that the model predicted good neurological outcome with a c-statistic of 0.88 (95% CI [0.86–0.90]).
The Brain Death After Cardiac Arrest (BDCA) score was developed from a retrospective cohort of 569 patients to predict brain death on admission.10 Brain death was diagnosed clinically with the help of objective measures such as computed tomography and EEG. The following variables, assigned equal weight, were used: gender, non-shockable rhythm, cardiac aetiology, neurological aetiology, sodium level (mmol/l) at 24 hours, and any use of vasoactive drugs at admission and 24 hours. The score predicted brain death with a c-statistic of 0.817 (95% CI [0.768–0.861]) in the development cohort and 0.805 (95% CI [0.755–0.855]) in an independent prospective internal validation cohort of 487 patients. The Hosmer–Lemeshow test indicated good calibration in the development cohort (χ2 10.2; d.f. 8; p=0.25) and in the internal validation cohort (χ2 9.8; d.f. 8; p=0.28). An advantage of this scoring system is the ease of availability of the variables that comprise the score, as well as the fact that the variables are less prone to subjective and inaccurate recording. Disadvantages of this score are that the development cohort was drawn from prior randomised controlled trials assessing the efficacy of cyclosporine and erythropoietin in post-arrest outcome, which is a potential confounder, and that differentiating between cardiac and neurological aetiology is often difficult in clinical practice.
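The Hosmer–Lemeshow p-values quoted above follow from the χ2 statistics and degrees of freedom. The sketch below shows that link using the closed-form upper-tail probability of the χ2 distribution, which is valid for even degrees of freedom. The function name is hypothetical, and only the Python standard library is used.

```python
import math


def chi2_upper_tail_even_df(chi2: float, df: int) -> float:
    """P(X >= chi2) for a chi-square variable with an even number of d.f.

    For df = 2k the upper tail equals exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = chi2 / 2.0
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


p_development = chi2_upper_tail_even_df(10.2, 8)
p_validation = chi2_upper_tail_even_df(9.8, 8)
```

Rounded to two decimal places, these reproduce the reported p=0.25 and p=0.28. In this test a non-significant p-value is taken as no evidence of poor calibration.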
The post-cardiac arrest syndrome for therapeutic hypothermia (CAST) score was developed from a retrospective multicentre cohort of 77 therapeutic hypothermia post-arrest patients.15 Outcome of interest was CPC at 30 days. Equally weighted variables were initial rhythm, witness/ROSC time, pH, lactate, motor GCS score, grey–white matter differentiation ratio, albumin and haemoglobin. The score predicted CPC outcome at 30 days in the developmental cohort with a sensitivity of 0.85, specificity of 0.84 and percentage correct classification of 0.85. A formal c-statistic was not available in this letter to the editor. In an independent internal validation retrospective cohort of 74 patients, the predictive accuracies included a sensitivity of 0.95, specificity of 0.90, percentage correct classification of 0.93 and a c-statistic of 0.97 (95% CI not stated). The CAST score was externally validated in a retrospective cohort of 189 patients with a c-statistic of 0.860 (95% CI [0.777–0.944]).20 Advantages of the score include integration into a smartphone application. Disadvantages were the inclusion in the literature as a letter to the editor, meaning that further statistical analysis was not available, as well as the use of small cohort sizes in the original study due to the applicability only to therapeutic hypothermia patients. The authors subsequently developed the revised CAST (rCAST) score, removing albumin and haemoglobin as variables and assigning double weighting to the remaining variables from the original CAST score.16 The rCAST was subsequently validated in a larger retrospective cohort of 460 patients, with a sensitivity and specificity of 0.95 (95% CI [0.92–0.98]) and 0.47 (95% CI [0.40–0.55]) for a low rCAST (score ≤5.5), 0.62 (95% CI [0.56–0.68]) and 0.48 (95% CI [0.40–0.55]) for a moderate rCAST score (6.0–14.0), and 0.57 (95% CI [0.51–0.63]) and 0.95 (95% CI [0.91–0.98]) for a high rCAST score (≥14.5). 
The c-statistic for predicting CPC at 30 and 90 days was 0.892 (95% CI not stated) and 0.895 (95% CI not stated), respectively. The rCAST score was externally validated by the same authors and in the same cohort as the original CAST score, with a c-statistic of 0.770 (95% CI [0.659–0.880]).
The SLANT score was developed in a retrospective cohort of 305 patients treated with post-arrest therapeutic hypothermia.17 The score was designed to predict CPC at discharge, using the following variables assigned equal weighting: non-shockable rhythm, leucocyte count, total adrenaline dose, presence of onlooker CPR and total resuscitation duration.
The score predicted CPC at discharge with a c-statistic of 0.852 (95% CI [0.800–0.903]) in the developmental cohort. A score ≥6.5 predicted poor neurological outcome at discharge with a sensitivity of 84.1% and specificity of 70.9%. Subsequent internal validation was performed in an independent retrospective cohort of 60 patients, demonstrating a c-statistic of 0.917 (95% CI [0.844–0.989]). Disadvantages of the SLANT score include the small validation cohort size and the lack of external validation.
The MIRACLE2 score was developed in a retrospective cohort of 373 OHCA patients.25 The score was designed to predict poor neurological outcome, defined as a CPC score of 3–5 at 6-month follow-up. The following variables were used: unwitnessed arrest, non-shockable rhythm, non-reactive pupils, age, changing rhythm (any two of ventricular fibrillation, pulseless electrical activity and asystole), pH <7.2 and use of adrenaline. Use of adrenaline scored 2 points, age 60–80 years scored 1, age >80 years scored 3 and all other variables scored 1 point, giving a total score out of 10. The score predicted poor neurological outcome in the developmental cohort with a c-statistic of 0.9 (95% CI [0.865–0.928]). Internal validation was performed in two new independent retrospective cohorts of 325 and 148 patients, achieving a c-statistic of 0.84 (95% CI [0.829–0.846]) and 0.91 (95% CI not stated), respectively, as well as a calibration slope of 0.744 and 0.834, respectively.
In this paper we review and compare the characteristics of the 17 available scoring systems that guide prognosis in individuals with OHCA. Compared with IHCA cohorts, OHCA cohorts are more heterogeneous due to variability in management by emergency medical services.
There are many potential benefits in using prediction scores at various points during a patient’s post-OHCA journey. Scores that predict hospital survival can identify patients who respond to intervention, as well as help identify goals of care or ceilings of care for patients who survive the initial arrest but remain critically ill. They also help the healthcare team to set expectations for family, friends and carers. Prediction of survival and neurological outcome also helps with post-discharge care and rehabilitation.
All prediction scores, however, have limitations and drawbacks. No score is 100% sensitive or specific and thus there remain concerns over false reassurance or, even worse, the potential risk of withholding or denying treatment to patients who are deemed to have a poor prognosis based on the score. Additionally, there is likely to be controversy over what risk of mortality or poor neurological recovery is sufficient for treatment to be considered futile, and this threshold will vary not only among healthcare professionals but also among relatives and carers.
Current knowledge and decision-making most often do not involve risk scoring, however, which means that decision-making and guidance of relatives is more ad hoc, subjective and emotive. Use of a risk score may help to improve on that and iron out potential inconsistencies and subjectivity between healthcare teams or between healthcare professionals in the information given to relatives and friends.
The ideal OHCA risk scoring system should (i) use a clinically relevant outcome that informs management decisions, (ii) use variables that are easily and routinely available early after OHCA, (iii) be easy to calculate and use, (iv) have high predictive value, that is, high sensitivity and specificity, and high positive predictive value for death with a low false-positive rate, (v) be specific to the population of interest and (vi) be prospectively externally validated. We discuss each of these aspects below.
The outcome measure for a score must be one that can usefully inform clinical decision-making. However, determining the ideal outcome measure that all future scores are recommended to follow is difficult, given that different clinicians and patients may vary in what they consider to be important. The two outcomes most useful in the scores are survival and neurological outcome. Importantly, it may be imperative to be able to adequately predict both of these outcome measures at different points in the patient pathway. Early on, immediately following OHCA, survival may be the most important guide for next of kin. For predicting neurological outcome the ideal score would, at an early stage, identify patients who are likely to have a poor outcome such that escalation of care and/or attempts at resuscitation would not be in the patients’ best interest. The objective data output from the score would serve as an adjunct to the overall clinical decision-making process.
Use of appropriate variables is a fundamental step in the development of a score, and this includes variable selection, determining appropriate cut-offs for each variable and assigning weighting to the variables. The variables most consistently used in scores predicting survival include non-shockable rhythm and a long low-flow period. The variables most consistently predicting neurological outcome included non-shockable rhythm, age, long no-flow and low-flow periods and pH. A major limitation of using scores is the availability and quality of data at the time of cardiac arrest. For example, the no-flow and low-flow periods are often not available or are inaccurately recorded, even in witnessed cardiac arrests. The arterial blood gas is often not recorded, limiting the availability and use of pH and lactate in risk scores. Aetiology is often not known and has the potential to be recorded inaccurately. Inaccurate data limits the use of the score, even if the variable has high predictive power. The variables in a score should be easily available at the time of presentation to the emergency services or to hospital. Ideally, these should be objective and less prone to error in calculation or documentation. Reducing the inconsistencies in the same variable in different sites facilitates the process of external validation and helps with the creation of a universal score. Use of electronic patient records, ideally shared across various services such as ambulance crew, and secondary and primary care, could offer a solution to ensure data quality.
Difficulty in using a score is another major obstacle to usefulness. Ideally, scores should be easy for healthcare personnel to calculate by hand, or with an accessible calculator such as MDCalc, a smartphone app or a tool integrated into major electronic medical record systems. The SALTED score addresses this issue through its availability as a smartphone application. Difficulty in score calculation can significantly limit clinical use; scoring systems should therefore be designed to be as easy to use as possible.
The targeting of a score to a well-defined population is a key consideration. First, some scores include both IHCA and OHCA, but the aetiologies can differ. Second, it is important to consider the cause of OHCA. Although scores may be applicable to the most common presentation of OHCA, namely secondary to a cardiac event, they may not be applicable to OHCA due to non-cardiac causes such as trauma. A total of 10 studies addressed this, either by using aetiology as a variable, measuring the cause of OHCA in the cohort or excluding patients with traumatic cause.3–5,9–14,17 Third, the post-arrest management of the cohort must be considered. Over the time period in which the score has been developed, best practice may have changed. For example, urgent revascularisation is now recommended only in the setting of ST-elevation. In addition, recent evidence suggests no benefit from therapeutic hypothermia, which is likely to change management away from therapeutic hypothermia, but patients may continue to receive targeted temperature management.26 As a result, different cohorts may have had different management, which may influence the predictive value of a score and its applicability to external cohorts.
The validation of the score is important. Ideally, scores should be prospectively externally validated in a large cohort. The issue with retrospective validation remains that the population is biased towards survivors, particularly for scores involving neurological outcome after discharge. Of the available scores, only five have been externally validated, with only two validated prospectively.3,7,9,11,15,16,27 Two of the internally validated scores included participants from the development cohort in the internal validation cohort; ideally, internal validation should be undertaken in a cohort separate from the development cohort.5,11 For predicting mortality, our recommended score would be the NULL-PLEASE score: it has the strongest external validation, in a large cohort, and the variables used in the score are readily available at the time of admission and have high predictive value for mortality in logistic regression analysis. For predicting neurological outcome we would recommend the CAST score. This score has been externally validated and is easy to use given that it is available as a smartphone application.
Finally, it is worth noting that all of the prediction scores discussed in this review were developed using a logistic regression model, which assumes a binary outcome. However, the clinical outcome (especially survival) is a time-dependent one, and the possibility of using alternative models that incorporate outcomes over time, such as a Cox regression, should be explored.
This paper, to our knowledge, is the only one that compares OHCA scores; this initial work has been limited to a literature review. A meta-analysis of the scoring systems would be valuable, but significant differences between the scores make this currently unfeasible. Further, some of the scores reviewed (PEA, NULL-PLEASE) are available only in abstracts, and it is possible that some data on these scores are incomplete. Additionally, there is difficulty in comparing small cohorts, which further limits our study.
A number of prediction scores have been developed, using different variables, in varying cohorts. We have proposed a set of criteria for an ideal scoring system, although additional work is needed to define the ideal variables to use. The NULL-PLEASE and CAST scores have been validated for mortality and neurological outcome, respectively, but further prospective work is needed before these can be incorporated into everyday practice. Future scores should be prospectively externally validated. Appropriate selection of the score to use, based on the development cohort, is important to optimally guide prognosis and to provide useful information for healthcare professionals and family and friends of the affected individual. Further research is required to establish whether use of these scores can lead to improvement in service delivery or more cost-effective healthcare.