PRERISK
(2024)
Objective
To develop and validate statistical and machine-learning models (PRERISK) to predict individual risk of stroke recurrence among stroke survivors using population-based healthcare data.
Study Summary
• Key predictors included time since prior stroke, Barthel Index, atrial fibrillation (AF), dyslipidemia, age, diabetes, and sex
Intervention
Population-based cohort analysis using supervised machine-learning (Random Forest, AdaBoost, XGBoost) and Cox regression to predict early (≤90 d), late (91–365 d), and long-term (>365 d) stroke recurrence from routinely collected clinical, laboratory, pharmacy, and socioeconomic data.
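The three prediction windows can be illustrated with a minimal Python sketch. The function name `recurrence_window` and its argument are hypothetical, chosen for illustration only; the study's own preprocessing code is not described here, and the sketch assumes time is counted in whole days from the index stroke.

```python
def recurrence_window(days_since_index: int) -> str:
    """Map days from index stroke to recurrence onto the three windows.

    Hypothetical helper: the cut-offs (90 and 365 days) come from the
    summary above; everything else is an illustrative assumption.
    """
    if days_since_index < 0:
        raise ValueError("days_since_index must be non-negative")
    if days_since_index <= 90:
        return "early"        # <=90 days
    if days_since_index <= 365:
        return "late"         # 91-365 days
    return "long-term"        # >365 days


if __name__ == "__main__":
    for d in (30, 91, 400):
        print(d, recurrence_window(d))
```

A separate model would then be fitted for each window, with the window label defining which recurrences count as events for that model.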
Inclusion Criteria
• First-ever ischemic stroke (IS) or intracerebral hemorrhage (ICH) identified from ICD-9/10 codes (2014–2020)
• Survived ≥7 days after index stroke
• Data available in Catalonia public healthcare databases
Outcome
• AUROC (Cox): 0.73 (0.72–0.75) early; 0.59 (0.57–0.61) late; 0.67 (0.66–0.70) long-term
• Recurrence proportion: 16.21% (5,932/36,114); median follow-up 2.69 years
Bottom Line
In a large, population-based cohort, PRERISK machine-learning models achieved AUROCs of 0.76 (early), 0.60 (late), and 0.71 (long-term) and modestly outperformed Cox regression (0.73, 0.59, and 0.67, respectively); a simplified model using key predictors had similar performance.
Major Points
- Population-based dataset: 41,975 stroke admissions from 88 public health centers (Catalonia, 2014–2020); analysis cohort 36,114 first-ever IS/ICH cases; 16.21% (5,932/36,114) had recurrence
- Outcomes predicted at three windows: early (≤90 days), late (91–365 days), long-term (>365 days)
- Model performance (ML AUROC): 0.76 (95% CI 0.74–0.77) early; 0.60 (0.58–0.61) late; 0.71 (0.69–0.72) long-term
- Comparator performance (Cox AUROC): 0.73 (0.72–0.75) early; 0.59 (0.57–0.61) late; 0.67 (0.66–0.70) long-term
- Key predictors: time since previous stroke, Barthel Index, atrial fibrillation, dyslipidemia, age, diabetes, sex; simplified model with modifiable risk factors showed similar accuracy
- Median follow-up was 2.69 years
Study Design
- Study Type
- Population-based cohort analysis with statistical (Cox) and supervised machine-learning models
- Randomization
- No
- Sample Size
- 36,114
- Follow-up
- Median 2.69 years
- Centers
- 88
- Countries
- Spain
Primary Outcome
Definition: Discrimination (AUROC) for prediction of stroke recurrence at early (≤90 d), late (91–365 d), and long-term (>365 d) windows
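| Window | Control (Cox) AUROC (95% CI) | Intervention (ML) AUROC (95% CI) | HR/OR | P-value |
|---|---|---|---|---|
| Early (≤90 d) | 0.73 (0.72–0.75) | 0.76 (0.74–0.77) | - | - |
| Late (91–365 d) | 0.59 (0.57–0.61) | 0.60 (0.58–0.61) | - | - |
| Long-term (>365 d) | 0.67 (0.66–0.70) | 0.71 (0.69–0.72) | - | - |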
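For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen patient with recurrence receives a higher predicted risk than a randomly chosen patient without recurrence, with ties counting as one half. The following is a minimal sketch of that pairwise definition in plain Python; the function name `auroc` and the toy data are illustrative assumptions, not the study's evaluation code or data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: P(score of a positive > score of a negative).

    labels: 1 = recurrence, 0 = no recurrence (toy encoding, assumed).
    scores: predicted risk from any model; higher means higher risk.
    Ties between a positive and a negative count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example only; not study data.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for a cohort of tens of thousands of patients.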
Limitations & Criticisms
- Observational design using administrative/registry data susceptible to coding and selection bias
- Late-window performance (AUROC 0.60) indicates modest discriminative ability
- Generalizability may be limited to similar healthcare systems and data availability
Citation
Stroke. 2024;55:1200–1209. DOI: 10.1161/STROKEAHA.123.043691