ST-elevation MI (STEMI) represents a significant portion of the global cardiovascular disease burden.1 A primary mechanism of myocardial injury in STEMI patients is ischaemia-reperfusion, typically caused by primary percutaneous coronary intervention (PCI), which leads to reperfusion of the affected vessels and subsequent oxidative stress.2 Numerous studies have indicated that the oxygen burst associated with ischaemia-reperfusion is a key driver of the short-term production of reactive oxygen species (ROS).3,4 These effects are diverse and often exacerbate tissue damage.
Oxidative stress is clearly a critical factor in the adverse outcomes of STEMI patients. Although its role in the initiation and progression of STEMI has been well documented, further exploration of other mechanisms contributing to oxidative stress is needed to develop innovative treatments and improve patient outcomes. Current literature on gene expression related to oxidative stress is sparse and often contradictory.5,6 This study aims to identify differentially expressed oxidative stress-related genes (DEOSRGs) in STEMI patients and examine the relationship between gene expression levels and clinical outcomes.
Materials and Methods
Selection of the Expression Profile Dataset
We assessed datasets against two inclusion criteria: datasets containing whole peripheral blood messenger RNA (mRNA) expression levels from STEMI patients; and discovery datasets featuring mRNA expression levels from STEMI patients who had undergone primary PCI, along with samples from healthy controls.
Datasets were excluded if RNA sequencing was used for detection instead of microarray; if they included patients who did not receive primary PCI treatment; or if they contained data on cell types such as circulating endothelial cells, platelets or nucleated cells that were not present in plasma.
From the 18 datasets initially considered, six were identified as using the most prevalent sample type for whole transcriptome analysis of plasma, namely GSE29111, GSE60993, GSE61144, GSE34198, GSE49925 and GSE34571. Datasets GSE29111, GSE34198 and GSE34571 were excluded because of factors such as the absence of a healthy control group, unclear timing of blood collection, or lack of patients receiving PCI. The remaining three datasets (GSE49925, GSE60993 and GSE61144) were selected for further analysis.
To identify DEOSRGs between healthy controls and STEMI patients, two smaller datasets [GSE60993: healthy controls (seven samples) versus STEMI (seven samples)] and [GSE61144: healthy controls (ten samples) versus STEMI (seven samples)] were chosen as training sets. The GSE49925 dataset served as a validation set for comparisons between STEMI patients and healthy controls. This study’s verification used 61 samples from STEMI patients and 93 samples from healthy controls.
Identification of DEOSRGs
The raw microarray data from the GSE61144 and GSE60993 datasets were processed using the online tool GEO2R, which is based on the R package ‘limma’, to identify serum differentially expressed genes (DEGs) between healthy controls and STEMI patients. DEGs were selected according to the criteria of |log2(Fold Change)| >1 and a false discovery rate (FDR) <0.05. The ‘GOBP_RESPONSE_TO_OXIDATIVE_STRESS’ gene set, which includes 436 oxidative stress-related genes, was sourced from MSigDB (https://www.gsea-msigdb.org/gsea/msigdb).
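The selection criteria above can be sketched as a simple filter followed by a set intersection. This is a minimal illustration, not the GEO2R workflow itself: the function name `select_degs`, the toy values and the four-gene stand-in for the MSigDB gene set are all hypothetical, and the sketch assumes that DEGs are intersected across the two discovery datasets before being matched against the oxidative stress gene set.

```python
def select_degs(rows, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Return gene symbols with |log2FC| > lfc_cutoff and FDR < fdr_cutoff.

    `rows` is an iterable of (gene, log2_fold_change, fdr) tuples,
    for example parsed from an exported differential expression table.
    """
    return {
        gene
        for gene, log2fc, fdr in rows
        if abs(log2fc) > lfc_cutoff and fdr < fdr_cutoff
    }


# Hypothetical toy results for two discovery datasets (values are made up).
dataset_a = [
    ("MMP9", 2.3, 0.001),
    ("ARG1", 1.6, 0.010),
    ("ACTB", 0.2, 0.900),
    ("UCP2", -1.2, 0.020),
]
dataset_b = [
    ("MMP9", 1.9, 0.004),
    ("ARG1", 1.1, 0.030),
    ("UCP2", -1.4, 0.010),
    ("GAPDH", 1.5, 0.200),
]

# Hypothetical stand-in for the 436-gene oxidative stress gene set.
oxidative_stress_genes = {"MMP9", "ARG1", "UCP2", "SOD1"}

# Genes that pass the thresholds in both datasets, then restricted to the gene set.
shared_degs = select_degs(dataset_a) & select_degs(dataset_b)
deosrgs = shared_degs & oxidative_stress_genes
print(sorted(deosrgs))
```

Using the absolute value of the log2 fold change keeps both upregulated and downregulated genes, which matches the two-sided criterion stated in the text.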
We used the R packages ‘ggplot2’ and ‘VennDiagram’ to create a Venn diagram to visualise the overlap of DEGs among the three discovery datasets. Furthermore, the expression levels of these DEOSRGs were depicted using volcano plots and difference ranking plots. The correlation of these DEOSRGs was evaluated using Spearman correlation and visualised with heatmaps in patients with varying prognoses through the R package ‘ggplot2’.
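For readers unfamiliar with the Spearman coefficient used for the correlation heatmaps, a minimal sketch follows. It computes Spearman's rho as the Pearson correlation of average ranks, which is the standard definition; the helper names are illustrative and this is not the code used in the study.

```python
def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for two genes across five samples.
gene_1 = [5.1, 6.3, 7.8, 8.0, 9.4]
gene_2 = [2.0, 2.9, 3.1, 4.5, 4.4]
print(round(spearman(gene_1, gene_2), 3))
```

Because the coefficient depends only on ranks, it is insensitive to monotonic transformations of the expression values, which makes it a reasonable choice for microarray intensities.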
Functional and Pathway Enrichment Analysis
Functional enrichment analysis was conducted to determine the biological functions of these DEOSRGs, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) using the R package ‘clusterProfiler’. The enrichment of GO terms and KEGG pathways was based on the criterion of FDR <0.05, followed by visualisation of the top three most significant GO terms and KEGG pathways using the R packages ‘ggplot2’, ‘igraph’ and ‘ggraph’.
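Over-representation analysis of this kind is commonly based on a hypergeometric test with Benjamini–Hochberg correction. The sketch below illustrates that calculation under this assumption; the function names, gene counts and background size are hypothetical and do not reproduce the study's results.

```python
from math import comb


def hypergeom_enrichment_p(overlap, gene_list_size, term_size, background_size):
    """P(X >= overlap) when drawing gene_list_size genes from the background,
    of which term_size belong to the annotation term."""
    total = comb(background_size, gene_list_size)
    upper = min(gene_list_size, term_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, gene_list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 3 of 10 query genes fall in a 50-gene term,
# with an assumed background of 20,000 annotated genes.
p_term = hypergeom_enrichment_p(3, 10, 50, 20000)
fdr = benjamini_hochberg([p_term, 0.20, 0.50])
print(p_term < 0.05, fdr[0] < 0.05)
```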
Construction of a Signature Integrating DEOSRGs
An optimal model integrating DEOSRGs and clinical indicators was developed in the GSE49925 dataset using the Least Absolute Shrinkage and Selection Operator (LASSO) penalised Cox proportional hazards regression via the R packages ‘survival’ and ‘glmnet’. With a cutoff value of p<0.1, the prognosis-related clinical parameters were identified.
Thus, the risk score for each STEMI patient was determined using the following formula:
Risk score = [Expression level of parameter 1 × coefficient] + [Expression level of parameter 2 × coefficient] +… + [Expression level of parameter n × coefficient].
Patients in the GSE49925 dataset were then classified into low- and high-risk groups based on the median risk score.
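The median-based grouping can be sketched as follows. The function name and patient identifiers are hypothetical, and the sketch assumes that a score exactly equal to the median is assigned to the low-risk group, a detail the text does not specify.

```python
from statistics import median


def split_by_median(risk_scores):
    """Classify patients as 'high' or 'low' risk relative to the median score.

    `risk_scores` maps a patient identifier to a numeric risk score.
    Scores equal to the median are placed in the low-risk group (assumption).
    """
    cutoff = median(risk_scores.values())
    groups = {
        patient: ("high" if score > cutoff else "low")
        for patient, score in risk_scores.items()
    }
    return cutoff, groups


# Hypothetical risk scores for four patients.
scores = {"p1": 0.1, "p2": 0.5, "p3": 0.9, "p4": 0.3}
cutoff, groups = split_by_median(scores)
print(cutoff, groups)
```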
Clinical Utility Evaluation of the Signature
To validate the risk stratification performance of the signature, a risk score distribution plot and a Kaplan–Meier survival curve were employed to compare survival between low- and high-risk groups using the R package ‘survival’. Additionally, a time-dependent receiver operating characteristic (ROC) curve analysis was performed, including 1- and 2-year survival, to demonstrate the sensitivity and specificity of the signature with the R package ‘survivalROC’. Decision curve analysis (DCA) was also conducted to evaluate the clinical utility (net benefit) of the index, considering both accuracy and the trade-off between false positives and false negatives. This part of the analysis and result visualisation was conducted using the R package ‘survival’ and the stdca.R script.
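As a reference for the Kaplan–Meier comparison, the product-limit estimator can be written in a few lines. This is a minimal sketch with hypothetical follow-up data; the study itself used the R package ‘survival’.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up times (days) and event indicators.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients contribute to the risk set up to their last follow-up time but do not produce a step in the curve.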
Construction of the Prognostic Nomogram
A nomogram was created based on the model to provide a quantitative analysis tool for predicting the survival risk of STEMI patients. Each clinical variable was assigned a score and the total score was calculated by summing the scores across all the variables. Calibration curves were drawn to compare predicted and actual survival and evaluate the predictive performance of the nomogram. The nomogram and calibration curves were plotted using the R package ‘rms’.
External Validation
A cohort of 92 STEMI patients admitted to our centres was used for external validation of the model. The inclusion criteria were as follows: age ≥18 years; STEMI diagnosis based on European Society of Cardiology guidelines; symptom onset within 12 hours before PCI; and informed consent obtained.7 Exclusion criteria included cardiogenic shock, history of MI or coronary artery bypass grafting, severe hepatic or renal dysfunction, cancer or other terminal illnesses and intolerance to antiplatelet or anticoagulant therapy.
Blood samples were collected from the antecubital vein before PCI and stored in ethylenediaminetetraacetic acid tubes at -80°C for analysis. Total RNA was extracted from plasma samples using the miRNeasy Mini Kit (Qiagen) following the manufacturer’s protocol. Reverse transcription was performed using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems) with random primers. Relative expression levels of target genes were calculated using the 2^-ΔΔCt method, where ΔCt = Ct(target gene) - Ct(GAPDH) and ΔΔCt = ΔCt(sample) - ΔCt(calibrator). The calibrator was a pooled plasma RNA sample from healthy controls. Patients were categorised into high-risk and low-risk groups based on the median risk score. Prospective follow-up was conducted to record major adverse cardiovascular events (MACE) over a median duration of 426 days, including cardiac death, MI recurrence, malignant arrhythmia, heart failure hospitalisation and repeat revascularisation. Kaplan–Meier survival curves and risk distribution plots were used to assess the prognostic value of the model. The study protocol adhered to the Declaration of Helsinki and received approval from the ethics committee of Tianjin Chest Hospital.
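The relative quantification step can be illustrated with a short calculation. The sketch below follows the comparative Ct method with GAPDH as the reference gene and a pooled calibrator, as described above; the function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    ct_target / ct_reference: Ct of the target gene and reference gene
    (GAPDH in the text) in the patient sample.
    ct_target_cal / ct_reference_cal: the same Ct values in the calibrator.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier
# (relative to GAPDH) in the sample than in the calibrator.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)  # 4.0, i.e. fourfold higher than the calibrator
```

A value of 1 means expression equal to the calibrator; each one-cycle decrease in ΔΔCt corresponds to a doubling of relative expression, assuming close to 100% amplification efficiency.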
Statistical Analysis
This study used SPSS version 23.0 (SPSS) and R version 4.2.1 (http://www.R-project.org) for data analysis. Normally distributed quantitative data were represented as mean ± SD and compared using t-tests or t’ tests for two-group comparisons to evaluate baseline parameters.
Data that did not follow a normal distribution were reported as medians and interquartile ranges, with Mann–Whitney U-tests used for comparisons. Qualitative data were expressed as frequency and composition; Fisher’s exact test was applied to analyse differences in constituent ratios between groups. To identify clinical variables potentially affecting survival independently, both univariate and multivariate Cox proportional hazards regression models were employed. All p-values were two-sided; p<0.05 was considered statistically significant.
Results
Identification of DEOSRGs
Analysing 14 STEMI samples and 17 normal samples pooled from GSE60993 and GSE61144 led to identifying 103 DEGs. From these DEGs, we extracted 10 DEOSRGs (Figure 1A), comprising nine upregulated genes and one downregulated gene (Supplementary Table 1). Figures 1B and 1C highlight MMP9 as the top upregulated DEOSRG in the early stages of STEMI.
The most enriched terms in the categories of biological process, cellular component and molecular function were ‘response to oxidative stress’, ‘mitochondrial outer membrane’ and ‘NAD(P)+ nucleosidase activity’, respectively (Supplementary Table 2). Functional enrichment analysis identified ‘lipid and atherosclerosis’ as the most relevant signalling pathway to the DEOSRGs (Supplementary Figure 1A). The correspondence between DEOSRG and the GO term was visualised via a network in Supplementary Figure 1B.

Construction of a DEOSRG Prognostic Signature
According to the metadata, participants were followed for an average of 2.4 years to monitor cardiovascular death. Among the 61 STEMI patients, 55 survived and there were six cardiac deaths during the follow-up. Supplementary Table 3 compares the clinical parameters in STEMI patients with different outcomes in GSE49925. A principal component analysis model was established for the DEOSRGs in the GSE49925 dataset, highlighting distinct separation between the groups (Figure 2A), with the first two principal components explaining 76.1% of the variance in the data. Additionally, Figure 2B presents a heatmap of expression correlation, revealing unique expression profiles of the DEOSRGs in groups with different prognoses. The expression correlation between the DEOSRGs in the death group was significantly weakened. A DEOSRG co-expression heatmap based on an individual level further confirmed significant expression correlation among DEOSRGs in the survival group (Figure 2C), which was significantly weakened in the death group (Figure 2D).

Prognosis-related parameters were identified to establish a predictive signature. Comparative analyses showed that MMP9, arginase 1 (ARG1) and interleukin 18 receptor accessory protein (IL18RAP) transcription levels were significantly higher in the death group compared with the survival group (Supplementary Figure 2). Subsequently, all available clinical indicators (e.g. BMI, blood pressure at admission, white blood cell count, Gensini score, lipid profiles, fasting blood glucose, serum creatinine) were introduced into a LASSO penalised Cox proportional hazards regression. Serum creatinine level at admission and Gensini score were identified as model candidates, with lambda.min of 0.00304 and 0.186, respectively. Multivariate Cox regression confirmed age as an additional factor included in the model, alongside the two clinical factors screened by LASSO regression (Supplementary Table 4).
The prerequisite for applying Cox regression is that each independent variable satisfies the proportional hazards assumption (p>0.05), indicating that its effect on the hazard does not change over time. All included variables and the overall model met this criterion (Supplementary Table 5). The oxidative stress-related gene-based prognostic index (OSRGPI) was established using expression data of key indicators multiplied by their Cox regression coefficients. The formula for calculating the risk score is:
Risk score = [Age × 0.31] + [Gensini score × 0.02] + [Serum creatinine × 1.1] + [Expression level of MMP9 × 2.52] + [Expression level of ARG1 × (-0.22)] + [Expression level of IL18RAP × (-0.51)] - 44.3.
Oxidative Stress-related Gene-based Prognostic Index Predicts Survival in STEMI Patients
To validate the index, the 61 STEMI patients from GSE49925 were divided into low- and high-risk groups based on the median risk score, with the model threshold set at 0.452. Figure 3A shows that the risk score of the death group was significantly higher than that of the survival group, indicating worse survival in high-risk patients compared with low-risk patients (Figure 3B). A Kaplan–Meier curve confirmed that patients in the low-risk group had significantly higher survival rates than those in the high-risk group during follow-up (Log-rank p=0.012) (Figure 3C). Time-dependent ROC curves were used to assess the model’s reliability, with an area under the curve (AUC) of 0.846 for 1-year survival and 0.816 for 2-year survival, indicating the model’s strong potential for short-term survival monitoring (Figure 3D).

DCA results also showed that the OSRGPI had greater net benefits than its components within the first 2 years post-STEMI onset, with the net benefit gradually diminishing thereafter (Figure 4A–C). Using the signature, we constructed a prognostic nomogram to provide a quantitative tool for predicting individual patients’ survival risk (Figure 4D). A prognosis calibration was performed to analyse the fit between the Cox regression model and the actual outcomes. Figure 4E shows good consistency between the predicted and actual 1- and 2-year survival in the STEMI cohort, although the predictive accuracy decreased significantly in the third year.
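The net benefit reported by decision curve analysis can be illustrated for the simpler case of a binary outcome at a fixed time point. This is a simplification: the study used a survival-based implementation (stdca.R), whereas the sketch below ignores censoring. Function names and data are hypothetical.

```python
def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    Uses the standard definition: TP/n - FP/n * threshold / (1 - threshold).
    `outcomes` holds 1 for an event and 0 for no event.
    """
    n = len(outcomes)
    pairs = list(zip(predicted_risks, outcomes))
    true_pos = sum(1 for risk, y in pairs if risk >= threshold and y == 1)
    false_pos = sum(1 for risk, y in pairs if risk >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def treat_all_net_benefit(outcomes, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks and observed outcomes for four patients.
risks = [0.9, 0.8, 0.2, 0.1]
observed = [1, 0, 0, 1]
print(net_benefit(risks, observed, 0.5), treat_all_net_benefit(observed, 0.5))
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.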

External Validation by Our Cohorts
We assessed the expression levels of three genes (MMP9, ARG1 and IL18RAP) in the patient cohort’s blood samples and calculated a risk score for each individual using our proprietary signature. Patients were stratified into low-risk and high-risk groups based on their median risk score. Both groups were comparable in terms of age, sex, smoking status, comorbidities, heart rate and blood pressure. However, significant differences were noted in DEOSRG gene expression levels and three clinical variables between the groups (Supplementary Table 6).
During the median follow-up period of 426 days, the low-risk group experienced only two adverse events: one case of recurrent MI and one hospitalisation for heart failure. Conversely, the high-risk group encountered eight adverse events, including one death, one recurrent MI, five hospitalisations for heart failure and malignant arrhythmia, and one additional revascularisation. The event rate was significantly higher in the high-risk group compared with the low-risk group (p=0.044).
As depicted in Figure 5A, the high-risk group exhibited a markedly higher incidence of MACE compared with the low-risk group. Figure 5B presents the Kaplan–Meier survival curves, indicating poorer survival outcomes for the high-risk group (HR 4.26), with a statistically significant difference between the groups (Log-rank p=0.046).
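For reference, the two-group log-rank test behind the reported p-value can be sketched as follows. The data are hypothetical and do not reproduce the cohort; the p-value uses the chi-square distribution with one degree of freedom, computed via the complementary error function.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value).

    Each events list holds 1 for an observed event and 0 for censoring.
    Assumes at least one event time with both groups still at risk.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for x in times_a if x >= t)
        at_risk_b = sum(1 for x in times_b if x >= t)
        deaths_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        deaths_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        # Observed minus expected events in group A at this time.
        observed_minus_expected += deaths_a - d * at_risk_a / n
        if n > 1:
            # Hypergeometric variance of the number of events in group A.
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)

    chi_square = observed_minus_expected ** 2 / variance
    p_value = erfc(sqrt(chi_square / 2))  # upper tail of chi-square, 1 df
    return chi_square, p_value


# Hypothetical data: group A has earlier events than group B.
chi2, p = logrank_test([1, 2], [1, 1], [3, 4], [1, 1])
print(round(chi2, 3), round(p, 3))
```

With very few events, as in small validation cohorts, a single additional event can shift the statistic noticeably, which is consistent with the caution expressed about borderline p-values.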
Discussion
Several studies examine the association between STEMI and oxidative stress, shedding light on the underlying mechanisms.8–10 However, contemporary literature on DEOSRGs in STEMI is scarce. Therefore, our study aimed to identify such DEOSRGs and provide insights into the mechanisms of oxidative stress in STEMI, offering potential therapeutic targets.
Oxidative stress plays a crucial role in the pathogenesis of STEMI, contributing to myocardial ischaemia and cardiomyocyte death.4 In our study, we identified several DEOSRGs associated with oxidative stress in STEMI. Several DEOSRGs were significantly altered in STEMI patients compared with those of the healthy control group. Interestingly, MMP9 and UCP2 were among the top upregulated and downregulated DEOSRGs, respectively, in the early stages of STEMI.
Functional enrichment analysis revealed that the DEOSRGs were significantly enriched in pathways related to lipid metabolism and atherosclerosis, as well as mitochondrial function. This suggests that oxidative stress may contribute to the development and progression of atherosclerotic plaques and other processes in the pathophysiology of STEMI, particularly in patients who have experienced ischaemia-reperfusion. Further exploration is needed to investigate the potential links between the over/under-expression of the identified DEOSRGs and their correlation with outcomes for STEMI patients.
Among the upregulated DEOSRGs, MMP9 was the most prominently expressed. MMP9 plays a vital role in the degradation of the extracellular matrix (ECM) and is crucial for tissue remodelling.11 In the context of STEMI, MMP9 upregulation signifies enhanced tissue damage through the breakdown of ECM components, such as collagen.12,13 Studies have demonstrated that increased MMP9 expression correlates with the infiltration of neutrophils and macrophages into infarcted myocardial tissue, indicating its involvement in the inflammatory response following STEMI.14,15 The MMP family is closely related to the severity and prognosis of acute MI, with predictive value for MACE.13 Furthermore, MMP9 expression is associated with plaque instability and rupture, common precipitating factors in MI.16
Another key upregulated DEOSRG is ARG1, which participates in the urea cycle and is significantly elevated during the early stages of MI due to the degranulation of neutrophils and macrophages.15 The role of ARG1 in the inflammatory response and modulation of endothelial function in STEMI is critical. By competing with nitric oxide synthase for arginine, ARG1 reduces the availability of arginine for nitric oxide synthesis, leading to disrupted endothelial stability, increased oxidative stress, impaired vasodilation and elevated vascular tone.17–19 Research has indicated that increased arginase expression is linked to the development of atherosclerosis, and studies have shown that ARG1 upregulation reduces macrophage infiltration and inflammation in atherosclerotic plaques.20,21
IL18RAP, another upregulated DEOSRG, functions as an accessory protein for the interleukin (IL)-18 signalling pathway within the nuclear factor-κB pathway, which is associated with proinflammatory processes.22 The upregulation of IL18RAP indicates increased IL-18 activity, leading to heightened inflammation and oxidative stress. IL-18 is involved in various inflammatory mechanisms contributing to atherosclerosis development and plaque instability, further impacting myocardial health. A recent study showed that a single nucleotide polymorphism (SNP) in IL18RAP (rs917997) is associated with the risk of MI and increased serum IL-18 levels.23 This SNP’s minor allele is linked to multifocal atherosclerosis and arterial hypertension following MI, suggesting a close association with poor prognosis after MI.
In this study, we observed a significant downregulation of the UCP2 gene, which regulates mitochondrial membrane potential and reduces oxidative stress.24 The downregulation of UCP2 results in an accumulation of ROS by disrupting the proton gradient across the inner mitochondrial membrane, subsequently decreasing antioxidant defence and increasing oxidative stress, which negatively impacts the myocardium.25,26 Interestingly, increased UCP2 expression in patients with comorbidities such as diabetes and hypertension has been associated with preventing the progression of atherosclerosis.27 Despite its critical role in oxidative stress regulation, UCP2 did not exhibit significant prognostic value for patient outcomes and was not included in the final predictive model. This exclusion might be because of the complex interplay between UCP2 expression and various clinical factors, suggesting that, while UCP2 is involved in the early stages of STEMI, it may not be a reliable standalone predictor for long-term prognosis.
A vital element of this study is the practical application of our predictive signature. By combining DEOSRGs with readily accessible clinical factors, we have developed a comprehensive model for stratifying STEMI patients and predicting their survival risk. This hybrid approach provides several benefits over conventional prognostic tools. Traditional risk stratification methods, such as the Thrombolysis in Myocardial Infarction risk score and the Global Registry of Acute Coronary Events risk model, largely depend on clinical predictors and extensive patient outcome databases.28,29 While these tools are beneficial, they do not include gene expression data, which could provide deeper insights into the molecular mechanisms underlying STEMI. Our model bridges this gap by integrating genetic and clinical information, potentially improving the precision and specificity of risk prediction.
We demonstrated the robustness of our predictive signature through multifaceted validation in both internal and external cohorts. Throughout the development phase, various analyses, such as risk factor distribution plots, Kaplan–Meier survival analysis, time-dependent ROC curves, calibration curves and decision curve analysis, showed that our model effectively categorises patients into low- and high-risk groups within the first 2 years of follow-up, with significant differences in survival outcomes. This suggests that our model could be a valuable supplement to the existing prognostic tools used in clinical settings.
In the external validation cohort, the relative expression levels of the three key genes remained significantly distinct between the low-risk and high-risk groups, aligning with our previous bioinformatics analysis. However, it is essential to recognise that gene transcription levels measured by microarray and reverse transcription-polymerase chain reaction (RT-PCR) are inherently different types of data, each with its own range and units. To account for the inherent differences between microarray and RT-PCR measurements, we recalculated individual risk scores using RT-PCR data from our external validation cohort and defined a new risk stratification threshold based on the cohort’s median score. Our findings confirmed that the model could effectively differentiate patients with varying prognostic risk levels.
Although the Kaplan–Meier survival curve showed a statistically significant difference in the incidence of MACE between the two groups (p=0.046), the p-value was only marginally below the significance threshold, which might be attributed to the limited sample size of the validation cohort. The event times for individual patients can significantly impact the overall survival curve, particularly with a small sample size. Therefore, further refinement of the threshold selection for grouping, together with larger sample sizes and extended follow-up periods, is necessary for future validation. Moreover, incorporating DEOSRGs into the prognostic model provides an innovative approach to understanding the pathophysiology of STEMI. By identifying genes associated with ferroptosis, a type of regulated cell death, our model offers insights into potential therapeutic targets and personalised treatment strategies.
Despite the promising predictive performance and clinical utility of the signature for patient stratification and survival risk prediction, several limitations should be acknowledged. The small sample size and potential selection bias may limit the generalisability of our findings. Additionally, the use of healthy individuals as a control group restricts the specificity of the results, as comparisons with patients having other coronary syndromes were not feasible due to dataset limitations. Moreover, while the identified DEOSRGs may reflect ongoing ischaemia/reperfusion injury in STEMI patients, our study did not assess their relationship with microvascular obstruction following primary PCI, thus not providing a comprehensive understanding of their pathophysiological implications. Future research should aim to address these limitations by incorporating larger, more diverse patient cohorts and exploring the mechanistic roles of DEOSRGs in different coronary syndromes.
Conclusion
Our analysis revealed notable changes in the expression levels of several DEOSRGs in STEMI patients compared with healthy controls. Functional enrichment analysis highlighted the involvement of these genes in pathways related to lipid metabolism, atherosclerosis, mitochondrial function and the oxidative stress response. We developed a prognostic signature that efficiently stratifies STEMI patients based on their projected survival outcomes. These findings provide valuable insights into the role of oxidative stress in STEMI and suggest potential therapeutic targets.
However, the limitations of the study must be acknowledged. The small sample size and potential selection bias may impact the generalisability of our results. Additionally, using healthy individuals as the control group may limit the specificity of our findings. Future research should include larger, more diverse cohorts and investigate the relationship between DEOSRGs and microvascular obstruction following PCI. This continued research is essential to confirm our findings and further elucidate the functional importance of the identified DEOSRGs in STEMI.
Clinical Perspective
- This study aimed to identify differentially expressed oxidative stress-related genes (DEOSRGs) in ST-elevation MI patients.
- The study systematically reviewed Gene Expression Omnibus datasets (GSE49925, GSE60993, GSE61144) and identified overlapping DEOSRGs using GEO2R.
- DEOSRGs were analysed to understand their biological roles through functional enrichment.
- For model construction and validation, an optimal prognostic model was developed using Least Absolute Shrinkage and Selection Operator penalised Cox regression and validated through survival, receiver operating characteristic curve and decision curve analyses.
- A prognostic signature including three upregulated DEOSRGs (matrix metallopeptidase-9, arginase 1, interleukin-18 receptor accessory protein) and clinical variables (age, serum creatinine level, Gensini score) was formulated.
- The signature demonstrated robust predictive performance and clinical utility for stratifying patients into risk groups, validated with external plasma samples.