Vitamin K antagonists (VKAs) were introduced into clinical practice in the 1940s, and have been used as anticoagulants for over 80 years.1 The advantages of VKAs include their oral administration and demonstrated efficacy across a broad spectrum of thrombotic disorders, including thrombosis associated with AF.2,3 However, a significant limitation of VKAs is their narrow therapeutic range, which is influenced by various factors, including dietary habits, concomitant medications and individual metabolic variations.2 This complexity renders the clinical use of VKAs challenging, particularly in maintaining the international normalised ratio (INR) within the target range of 2–3, thereby hindering the optimisation of their therapeutic benefits.4
A new era in stroke prevention for patients with AF has emerged following the introduction of direct oral anticoagulants (DOACs) in AF management guidelines in 2010.5 Currently, DOACs are widely recognised as the mainstay for stroke prevention in patients with AF, except for those with moderate to severe mitral stenosis or mechanical prosthetic valves.6,7 Not only do DOACs address the drawbacks associated with VKAs, but they also have well-established advantages in terms of clinical profile compared with VKAs.8 In addition, the benefits of DOACs regarding non-thrombotic events, such as renal complications and dementia, are gradually being explored.9,10 Newer anticoagulants, Factor XIa inhibitors, are under investigation as potential alternatives to DOACs, with the aim of further reducing the risk of bleeding. However, the initial published results from clinical trials suggest that these newer anticoagulants may be less effective than DOACs, indicating that DOACs are likely to remain the standard of care for AF in the foreseeable future.11
Despite the widespread use of DOACs in clinical practice, the absence of head-to-head randomised controlled trials (RCTs) creates significant uncertainty regarding their comparative efficacy and safety. Each agent has unique pharmacological properties and safety profiles, yet, without direct comparisons, clinicians face challenges in selecting the most appropriate agent for individual patients. Several expert opinions have been formulated to address this clinical gap.12–14 However, relying on real-world evidence and subgroup analyses for recommendations has inherent limitations. Therefore, obtaining high-quality evidence comparing DOACs not only addresses the existing data gap but also provides critical information to help clinicians make informed clinical decisions.
The primary objective of this systematic review and network meta-analysis is to address the existing data gap concerning the comparative efficacy and safety of oral anticoagulants, including DOACs and VKAs, by synthesising data from RCTs to provide comprehensive insights into their relative efficacy and safety profiles. Furthermore, the findings will facilitate informed clinical decision-making by offering a clearer understanding of how these agents perform in various patient populations. Ultimately, this review seeks to enhance the evidence base for anticoagulation therapy, guiding clinicians in selecting the most appropriate treatment options for patients requiring anticoagulation.
Methods
Protocol and Registration
This systematic review and network meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and the PRISMA extension specifically for network meta-analyses (PRISMA-NMA).15,16 The study protocol was registered with PROSPERO under registration number CRD42024588472.
Search Strategy and Selection Criteria
The search for relevant literature was conducted across three databases, namely PubMed, Embase and Web of Science, encompassing records from their inception until 31 August 2024. The search used relevant keywords and medical subject headings (MeSH) terms, including ‘apixaban’, ‘dabigatran’, ‘edoxaban’, ‘rivaroxaban’ and ‘atrial fibrillation’. A detailed description of the search strategy is provided in Supplementary Table 1.
Studies were included in this systematic review and network meta-analysis if they were RCTs conducted on patients diagnosed with AF aged ≥18 years and involved the use of oral anticoagulants for the prevention of stroke or systemic embolism. The intervention duration had to be longer than 6 months, and studies were required to be published in English. In addition, included studies needed to report on the specified outcomes of interest.
Studies involving patients with AF and moderate to severe mitral stenosis, mechanical prosthetic valves or rheumatic AF were excluded from this analysis, as were studies that combined multiple antithrombotic agents with DOACs without a consistent strategy between intervention and control arms. Phase II dose-finding studies, preclinical studies, phase I pharmacokinetic studies, RCTs with an intervention duration shorter than 6 months and studies published in languages other than English were also excluded.
Outcomes
The primary study endpoints were major ischaemic events, comprising stroke or systemic embolism (SSE) and MI; major bleeding (MB) as classified by the International Society on Thrombosis and Haemostasis (ISTH)17; and all-cause mortality (ACM). The composite outcome of efficacy was defined as a combination of major ischaemic events (SSE and MI) and ACM. In addition, the net clinical benefit (NCB) was defined as the composite of SSE, MI, ACM and MB, providing a comprehensive evaluation of the overall effectiveness and safety of the interventions under investigation.
Data Extraction and Statistical Analysis
Two reviewers (TDM, MCT) independently conducted the review and data extraction using Covidence. In the event of a disagreement between the two reviewers, a third reviewer (HMP) was consulted, with the final decision based on a consensus among the three reviewers following a discussion.
Data for the network meta-analysis were collected at the arm level. In studies reporting multiple datasets, data from the safety-as-treated population, defined as individuals who received at least one dose of the study drug, were prioritised. Data synthesis was performed using RR with accompanying 95% CIs. A frequentist random effects model was employed to account for variability among studies, using the ‘netmeta’ package in R, which is specifically designed for network meta-analysis. Heterogeneity across studies was assessed using the I² statistic, with an I² value ≥50% indicating considerable variability among included studies. To rank the interventions, the surface under the cumulative ranking curve (SUCRA) was calculated for each treatment, reflecting the likelihood of each intervention being the optimal option. In addition, a league table was constructed to present direct and indirect comparisons among interventions.
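The quantities named above can be illustrated with a minimal sketch. It assumes a single pairwise comparison (one DOAC versus VKA), a Wald CI on the log scale and the DerSimonian-Laird estimator of τ²; the source does not state which τ² estimator was used, and the analysis itself was run with ‘netmeta’ in R, so this is not the authors' code. The trial counts are hypothetical.

```python
import math

def risk_ratio(events_t, n_t, events_c, n_c, z=1.96):
    """Risk ratio from arm-level counts, with a Wald CI built on the log scale."""
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR) for two independent binomial arms
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper, se

def dersimonian_laird(log_effects, ses):
    """Random effects pooling of log effects (DerSimonian-Laird, an assumption here).

    Returns the pooled log effect, its standard error, tau^2 and I^2 (in %).
    """
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)
    # Cochran's Q measures dispersion around the common effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    return pooled, math.sqrt(1 / sum(w_re)), tau2, i2

# Hypothetical arm-level data: (events DOAC, n DOAC, events VKA, n VKA)
trials = [(30, 1000, 40, 1000), (50, 2000, 60, 2000), (12, 500, 20, 500)]
results = [risk_ratio(*t) for t in trials]
pooled, se, tau2, i2 = dersimonian_laird(
    [math.log(r[0]) for r in results], [r[3] for r in results]
)
print(f"Pooled RR {math.exp(pooled):.2f}, tau^2 {tau2:.4f}, I^2 {i2:.1f}%")
```

A full network meta-analysis additionally combines such comparisons across treatments; the sketch only shows how RR, τ² and I² relate within one comparison.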
Quality Assessment
Two reviewers (TDM and MCT) independently conducted a quality assessment of the included studies. In the event of a disagreement between the two reviewers, a third reviewer (HMP) was consulted, with the final decision based on a consensus among the three reviewers following a discussion. The Risk of Bias 2 (RoB2) tool, developed by the Cochrane Collaboration, was used to evaluate the methodological quality of the studies included in the analysis.18 The RoB2 tool assesses potential biases across five major domains: the process of randomisation; deviations from intended interventions; handling of missing outcome data; accuracy in outcome measurement; and selection of reported results. Each domain is classified as having either a low risk, some concerns or a high risk of bias, thereby facilitating a comprehensive assessment of the overall risk of bias for each study.
Results
Included Studies
In all, 23,152 records published before 31 August 2024 were identified from the literature database search. After excluding 14,022 records due to duplicates and non-RCTs that were automatically filtered out by the system, 9,130 records remained for title and abstract screening. After screening, an additional 9,065 records were excluded, resulting in 65 records selected for full-text review. Ultimately, 11 RCTs that met the inclusion criteria were chosen for analysis.19–29 All included studies compared DOACs with VKAs, but no head-to-head comparisons were conducted between the different DOACs. A diagram of the selection process is shown in Figure 1.
There were 76,128 patients in the 11 included studies. The pooled prevalence of comorbidities was 47.1% for heart failure, 87.3% for hypertension, 31.0% for diabetes and 29.4% for a history of stroke. The pooled prevalence of patients aged >75 years was 38.7%, and 19.6% of the pooled population had chronic kidney disease or was on haemodialysis. The details of the included studies are presented in Table 1.
Risk of Bias
Based on the evaluation of all included studies using the designed questionnaire to implement the RoB2 tool, nine of 11 studies were assessed to have a low risk of bias. One study was classified as having a high risk of bias due to missing data.29 Another study was deemed high risk due to issues in the measurement of outcomes.28 The results of RoB2 assessment of the risk of bias are shown in Figure 2. Publication bias was assessed using funnel plots for various outcomes, with results shown in Supplementary Figures 1–6.

Stroke or Systemic Embolism
SSE events were reported in 11 studies involving 76,003 patients. Rivaroxaban significantly reduced SSE compared with VKAs (RR 0.63; 95% CI [0.46–0.86]; p=0.0034). However, the efficacy of the remaining DOACs in preventing SSE compared with VKAs, as well as comparisons among the DOACs, was not statistically significant. There was no significant heterogeneity between studies: τ²=0.0278; τ=0.1668; I²=20%; 95% CI [0.0–62.5%]. The treatment ranking indicated that rivaroxaban had the highest probability of being the most effective treatment, with a SUCRA of 0.90, whereas VKAs were the least effective treatment for preventing SSE, with a SUCRA of 0.12. A net graph, forest plot and league table for SSE are shown in Figure 3. The ranking of probability for SSE is shown in Supplementary Figure 7.
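As a reading aid, the relationship between a reported RR, its 95% CI and the p-value can be sketched as follows. The sketch assumes a normal approximation on the log scale and works from the rounded figures printed above, so it only approximates the reported p=0.0034 (it gives a value of about 0.004); the function name is illustrative and not taken from the source.

```python
import math

def p_from_ci(rr, lower, upper, z_crit=1.96):
    """Approximate two-sided p-value from an RR and its 95% CI (log-normal assumption)."""
    # The CI spans 2 * z_crit standard errors on the log scale
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(rr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Rivaroxaban versus VKAs for SSE, using the rounded values reported in the text
se, z, p = p_from_ci(0.63, 0.46, 0.86)
print(f"SE(log RR) {se:.3f}, z {z:.2f}, p {p:.4f}")
```

Small differences from the published p-value are expected because the inputs are rounded to two decimal places.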
Major Bleeding
MB events were reported in 11 studies, involving 75,960 patients. Ten of the 11 studies selected for analysis reported MB according to ISTH criteria. One study did not specify ISTH MB, but the definition of MB in that study was consistent with ISTH criteria and it was therefore included in the analysis.29 Pairwise comparison results indicated that the risk of MB did not differ statistically between the oral anticoagulants. There was significant heterogeneity between studies: τ²=0.1827, τ=0.4274, I²=79.8%; 95% CI [60.6–89.6%]. In terms of treatment ranking, there was no major difference in SUCRA between interventions. Rivaroxaban had the highest SUCRA of 0.67, followed by apixaban (0.55), dabigatran (0.52), edoxaban (0.43) and VKA (0.33). A net graph, forest plot and league table for MB are shown in Figure 4. The ranking of probability for MB is shown in Supplementary Figure 8.
MI
MI events were reported in seven studies involving 74,359 patients. Dabigatran was associated with an increased risk of MI compared with VKA (RR 1.38; 95% CI [1.04–1.84]; p=0.0263), apixaban (RR 1.59; 95% CI [1.06–2.36]; p=0.0238) and rivaroxaban (RR 1.69; 95% CI [1.15–2.49]; p=0.0075). In terms of treatment ranking, rivaroxaban and apixaban were the most likely to be the best treatments, with SUCRA values of 0.87 and 0.78, respectively, whereas dabigatran was associated with the lowest SUCRA (0.03). There was no significant heterogeneity among the studies: τ²=0; τ=0; I²=0%; 95% CI [0.0–84.7%]. A net graph, forest plot and league table for MI are shown in Figure 5. The ranking of probability for MI is shown in Supplementary Figure 9.
All-cause Mortality
ACM events were reported in 10 studies involving 75,650 patients. Rivaroxaban (RR 0.84; 95% CI [0.72–0.99]; p=0.0349) and edoxaban (RR 0.90; 95% CI [0.83–0.97]; p=0.0075) were associated with a reduced risk of ACM compared with VKA. Apixaban (RR 0.91; 95% CI [0.82–1.01]; p=0.0729) and dabigatran (RR 0.90; 95% CI [0.81–1.01]; p=0.0627) also tended to reduce ACM compared with VKAs, although the differences were not statistically significant. The risk of ACM among the DOACs did not differ statistically. There was no significant heterogeneity among the studies, with τ²=0, τ=0 and I²=0%; 95% CI [0.0–70.8%]. In terms of treatment ranking, rivaroxaban was the most likely to be the best treatment, with a SUCRA of 0.83, followed by the other three DOACs, which showed no major differences in SUCRA (edoxaban, 0.57; dabigatran, 0.55; apixaban, 0.55). VKAs were associated with the lowest SUCRA of 0.03. A net graph, forest plot and league table for ACM are shown in Figure 6. The ranking of probability for ACM is shown in Supplementary Figure 10.
Composite Outcome of Efficacy
The composite outcome of efficacy was summarised across 11 studies involving a total of 76,100 patients. All four DOACs demonstrated superiority over VKAs in reducing major ischaemic events and mortality. Rivaroxaban was associated with a reduced risk of the composite outcome of efficacy compared with dabigatran (RR 0.85; 95% CI [0.75–0.98]; p=0.02) and edoxaban (RR 0.84; 95% CI [0.75–0.95]; p=0.0051). However, the difference between rivaroxaban and apixaban did not reach statistical significance (RR 0.89; 95% CI [0.89–1.02]; p=0.087). Pairwise comparisons among the three DOACs (apixaban, edoxaban and dabigatran) did not reach statistical significance. There was no significant heterogeneity between studies: τ²=0, τ=0 and I²=0%; 95% CI [0.0–67.6%]. In terms of treatment ranking, rivaroxaban was the most likely to be the best treatment, with a SUCRA of 0.99, followed by apixaban (0.66), dabigatran (0.46) and edoxaban (0.38). VKAs were associated with the lowest treatment ranking, with a SUCRA of 0.01. A net graph, forest plot and league table for the composite outcome of efficacy are shown in Figure 7. The ranking of probability for the composite outcome of efficacy is shown in Supplementary Figure 11.
Net Clinical Benefit
Composite outcomes of NCB were summarised in 11 studies involving a total of 76,100 patients. Rivaroxaban significantly reduced composite outcomes of NCB compared with VKAs (RR 0.75; 95% CI [0.59–0.94]; p=0.0133). However, the comparative impacts of the remaining DOACs on composite outcomes of NCB compared with VKA, as well as comparisons among the DOACs, were not statistically significant. There was significant heterogeneity between studies: τ²=0.0399, τ=0.1998 and I²=75.7%; 95% CI [51.2–87.9%]. In terms of treatment ranking, rivaroxaban was the most likely to be the best treatment, with a SUCRA of 0.89, followed by the other three DOACs, which showed no major differences in SUCRA (dabigatran, 0.51; apixaban, 0.47; edoxaban, 0.39). VKAs were associated with the lowest SUCRA of 0.24. A net graph, forest plot and league table for NCB are shown in Figure 8. The ranking of probability for NCB is shown in Supplementary Figure 12.
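The SUCRA values reported in the Results can be read with the help of a short sketch. It assumes the standard definition of SUCRA as the average of the cumulative rank probabilities over the first a−1 ranks (a being the number of treatments), and uses a hypothetical three-treatment rank-probability table, because the source does not report rank probabilities. One consequence of the definition is that SUCRA values within a network average 0.5, which gives a simple consistency check on the values reported for MB, ACM, the composite outcome of efficacy and NCB.

```python
def sucra(rank_probs):
    """SUCRA for one treatment; rank_probs[k] is the probability of rank k + 1 (1 = best)."""
    a = len(rank_probs)
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities over the first a - 1 ranks
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (a - 1)

# Hypothetical rank probabilities (rows and columns each sum to 1)
hypothetical = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.2, 0.5, 0.3],
    "C": [0.1, 0.3, 0.6],
}
scores = {name: sucra(probs) for name, probs in hypothetical.items()}
print(scores, sum(scores.values()) / len(scores))

# SUCRA values as reported in the text (five treatments per outcome)
reported = {
    "MB": [0.67, 0.55, 0.52, 0.43, 0.33],
    "ACM": [0.83, 0.57, 0.55, 0.55, 0.03],
    "Composite efficacy": [0.99, 0.66, 0.46, 0.38, 0.01],
    "NCB": [0.89, 0.51, 0.47, 0.39, 0.24],
}
for outcome, values in reported.items():
    print(outcome, round(sum(values) / len(values), 3))
```

Because the mean is fixed at 0.5, a SUCRA value is informative only relative to the other treatments in the same network and says nothing about the size of the difference between them.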
Transitivity and Consistency Assumptions
We conducted a network meta-analysis based on the transitivity assumption, which posits that the distribution of effect modifiers is highly similar among the included studies. This assumption is supported by the fact that all included studies share similar designs because they are all RCTs. The study population specifically focused on patients with AF who had indications for oral anticoagulation. All DOACs were compared against the same comparator, a VKA. Notably, the defined outcomes were highly consistent across the studies. Therefore, the indirect comparison is valid for assessing the efficacy and safety among the DOACs. However, because all included studies compared DOACs to VKAs, the data available for assessing consistency are limited.
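To illustrate how an indirect comparison through the common VKA comparator works, the following minimal sketch applies the adjusted indirect comparison (Bucher) method to the ACM results reported above for rivaroxaban and edoxaban. It assumes independence of the two DOAC versus VKA estimates and a normal approximation on the log scale, and it starts from rounded published values; it is not the authors' ‘netmeta’ model, and the function names are illustrative.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(RR) recovered from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def bucher(rr_a, ci_a, rr_b, ci_b, z=1.96):
    """Indirect RR of A versus B, given A versus C and B versus C (common comparator C)."""
    # On the log scale the indirect effect is the difference of the direct effects
    log_rr = math.log(rr_a) - math.log(rr_b)
    # Variances add because the two direct estimates come from different trials
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    return math.exp(log_rr), math.exp(log_rr - z * se), math.exp(log_rr + z * se)

# ACM versus VKA as reported: rivaroxaban RR 0.84 [0.72-0.99], edoxaban RR 0.90 [0.83-0.97]
rr, lower, upper = bucher(0.84, (0.72, 0.99), 0.90, (0.83, 0.97))
print(f"Rivaroxaban vs edoxaban (indirect): RR {rr:.2f}, 95% CI [{lower:.2f}-{upper:.2f}]")
```

The resulting interval includes 1, in line with the statement that the risk of ACM did not differ statistically among the DOACs. The wider interval relative to either direct estimate shows why indirect comparisons carry more uncertainty than head-to-head data.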
Discussion
In this systematic review and network meta-analysis of 11 RCTs involving more than 76,000 AF patients, we evaluated the efficacy and safety of oral anticoagulants across various outcomes. The outcomes selected for our study, including both composite and component outcomes, are consistent with the European Medicines Agency recommendations for conducting clinical trials in AF.30 Although numerous real-world studies have compared DOACs, their inherent design limitations and mixed results suggest that these findings are better suited for confirming the effectiveness and safety of DOACs in clinical practice rather than serving as definitive evidence for comparative efficacy and safety.31,32 In this context, the present study, which uses data from well-designed RCTs, will provide valuable insights, particularly as head-to-head comparisons among DOACs are not yet available.
The findings of our study indicate that DOACs are superior to VKAs in preventing major ischaemic events and mortality, as evidenced by both composite and component outcomes. Rivaroxaban ranked highest in treatment efficacy, with no major differences among the other DOACs. Notably, dabigatran was associated with an increased risk of MI compared with VKAs, apixaban and rivaroxaban. This observation is consistent with previous research.33,34 In well-designed RCTs involving high-risk MI populations, dabigatran-based regimens were associated with a non-significant increase in MI events compared with VKA-based regimens, whereas no such increase was observed with apixaban- or rivaroxaban-based regimens.35–37 Several underlying mechanisms have been proposed to explain the unfavourable effect of dabigatran on MI outcomes. One frequently mentioned hypothesis is that dabigatran is less effective than warfarin in preventing MI events.20 Another proposed hypothesis regarding the mechanism of action of dabigatran compared with Factor Xa inhibitors is that targeting Factor Xa for upstream inhibition may offer more effective suppression of the thrombin burst than inhibiting thrombin downstream.38 Data from basic scientific studies indicate that Factor Xa is more thrombogenic than thrombin, with the activation of a single molecule of Factor Xa capable of generating up to 1,000 molecules of thrombin.38 Consequently, targeting Factor Xa may represent a more effective strategy than targeting thrombin directly.38,39 This hypothesis is supported by data showing that Factor Xa inhibitors reduced the risk of MI compared with placebo.40 The hypothesis of a direct or indirect effect of dabigatran on platelet activation has also been suggested but remains inconclusive.41 It should be noted that the observations regarding increased MI risk with dabigatran originated from studies not specifically designed to assess MI outcomes. However, these findings suggest that more careful consideration and prophylactic measures are warranted for patients at high risk of MI who are taking dabigatran.
Regarding safety outcomes, the risk of MB did not differ significantly between the oral anticoagulants. These results are consistent with findings from meta-analyses of pivotal RCTs.8 However, real-world data generally favour DOACs regarding MB events.42 This discrepancy may be attributed to the significant gap between the proportion of patients achieving INR within the therapeutic range in RCTs compared with routine clinical practice.43 Consequently, it is plausible that the superiority of DOACs over VKAs in terms of safety may be more pronounced in everyday clinical settings than indicated by our study. Furthermore, our findings underscore the importance of controlling risk factors to mitigate bleeding events, in addition to selecting anticoagulants with favourable safety profiles. Bleeding events typically occur in the presence of vascular damage, which is influenced by various risk factors and patient characteristics. Therefore, to effectively limit bleeding, a comprehensive strategy is required, encompassing the management of modifiable risk factors and ensuring treatment adherence, rather than solely relying on the clinical profiles of anticoagulants.
Based on the treatment ranking of NCB, rivaroxaban emerged as the DOAC with the most balanced benefit–risk profile for patients with AF. In contrast to the findings of the present study, several published real-world studies have shown that apixaban exhibits certain advantages over rivaroxaban, particularly regarding bleeding outcomes.31,44–47 The key difference between the present study and previously published studies is that our study focused exclusively on RCTs and did not include real-world studies. In addition, many of these real-world studies have relatively short follow-up periods, often less than 1 year, with numerous studies reporting follow-up times of only 3–6 months. Bleeding events in anticoagulated patients are typically ‘frontloaded’, occurring early and increasing rapidly, whereas thrombotic events and mortality rise steadily over time.48 Thus, short-duration observations may over-represent bleeding risk without capturing significant differences in thrombotic events and mortality, complicating the evaluation of the long-term NCB of anticoagulants.
In a real-world study with a 6-year follow-up, rivaroxaban demonstrated advantages over apixaban in terms of SSE, intracranial haemorrhage and mortality.32 Therefore, we believe that the present study, which focuses exclusively on RCTs with a follow-up duration exceeding 6 months, is an appropriate approach for assessing the NCB of DOACs. An important point to note is that our initial expectation was that the remaining three DOACs (i.e. apixaban, dabigatran and edoxaban) would not demonstrate superiority over VKAs in terms of NCB. The lack of clear superiority of DOACs over VKAs is primarily influenced by MB events. As discussed previously, this finding can be partly attributed to the effective INR control observed in the group of patients using a VKA in the RCTs. Moreover, the results from our study highlight the need to evaluate the comprehensive benefits of each specific anticoagulant when making treatment decisions for patients. Clinicians often prioritise bleeding over stroke events. However, cardiovascular events and worsening renal function are also prevalent and contribute significantly to mortality.43,49 Therefore, in clinical practice, it is essential to identify the clinical challenges that patients face, which can inform a strategy for selecting the most appropriate anticoagulant to maximise patient benefit.
Although our study aimed to provide insights based on robust data, it is not without limitations. First, the absence of direct head-to-head trials comparing DOACs restricts our ability to draw definitive conclusions regarding their relative safety and efficacy. This also limits the data available for assessing consistency. The application of strict inclusion criteria resulted in a relatively small number of studies included in the analysis, with only 11 studies selected from more than 23,000 records. However, we believe that focusing on high-quality data, as mentioned previously, is a reasonable approach. Treatment rankings based on SUCRA from network meta-analysis without direct comparisons should be interpreted with caution, and clinical decisions should always be combined with clinical assessments in each specific case. Given the challenges associated with conducting well-designed RCTs that are adequately powered to demonstrate superiority among specific DOACs, it is essential that stroke prevention with anticoagulation strictly adheres to current clinical guidelines. Anticoagulation remains a critical component of AF management, and physicians must select the most appropriate anticoagulant for each patient based on individual clinical circumstances.
Second, several studies included in our analysis were underpowered to adequately assess the outcomes of interest, which further complicates the ability to draw conclusive results. Third, two of the selected studies were classified as at high risk of bias according to our assessment. However, given the small weight of these two studies in the overall analysis, we believe that their effect on the overall results is minimal. Fourth, excluding studies published in languages other than English may omit relevant local studies, potentially impacting the generalisability of findings, particularly for underrepresented regions. However, this practice is common in systematic reviews, and we believe that including any local studies would not significantly alter the current results. In addition, the variability in data reporting across studies represents another limitation that should be noted.
Finally, variability in the populations and study designs of the included studies may have introduced heterogeneity in MB outcomes, which, in turn, contributed to the heterogeneity observed in the NCB. However, this diversity also enhances the external validity of our findings, allowing them to more accurately reflect the varied clinical characteristics of patients encountered in routine practice.
Conclusion
The results of our network meta-analysis of RCTs indicate that DOACs are superior to VKAs in preventing major ischaemic outcomes and mortality, without an associated increase in the risk of MB. Furthermore, treatment ranking revealed that rivaroxaban has the most balanced risk–benefit profile among the DOACs for patients with AF. However, to establish definitive conclusions regarding the comparative effectiveness of DOACs, future research should prioritise well-designed head-to-head trials or advanced observational methods to address this critical gap in the literature.