Predicting atrial fibrillation in primary care using machine learning

Authors: Nathan R. Hill aff001;  Daniel Ayoubkhani aff002;  Phil McEwan aff002;  Daniel M. Sugrue aff002;  Usman Farooqui aff001;  Steven Lister aff001;  Matthew Lumley aff003;  Ameet Bakhai aff004;  Alexander T. Cohen aff005;  Mark O’Neill aff006;  David Clifton aff007;  Jason Gordon aff002
Author affiliations: Bristol-Myers Squibb Pharmaceutical Ltd, Uxbridge, United Kingdom aff001;  Health Economics and Outcomes Research Ltd, Cardiff, United Kingdom aff002;  Pfizer Ltd, Surrey, United Kingdom aff003;  Department of Cardiology, Royal Free Hospital, London, United Kingdom aff004;  Department of Haematological Medicine, Guys and St Thomas' NHS Foundation Trust, King's College London, London, United Kingdom aff005;  Division of Cardiovascular Medicine, Guys and St Thomas' NHS Foundation Trust, King's College London, London, United Kingdom aff006;  Department of Engineering Science, University of Oxford, Oxford, United Kingdom aff007
Published in: PLoS ONE 14(11)
Category: Research Article
doi: 10.1371/journal.pone.0224582



Atrial fibrillation (AF) is the most common sustained heart arrhythmia. Because many cases are asymptomatic, a large proportion of patients remain undiagnosed until serious complications arise. Risk-prediction models that relate patient factors to AF risk may support efficient, cost-effective detection of undiagnosed cases. However, there remains a need for an implementable risk model that is contemporary and informed by routinely collected patient data, reflecting the real-world pathology of AF.


This study sought to develop and evaluate novel and conventional statistical and machine learning models for risk prediction of AF. This was a retrospective cohort study of adults (aged ≥30 years) without a history of AF, listed on the Clinical Practice Research Datalink from January 2006 to December 2016. The models evaluated were published risk models (Framingham, ARIC, CHARGE-AF), Cox regression, and machine learning models that used baseline and time-updated information (neural network, LASSO, random forests, support vector machines).


Analysis of 2,994,837 individuals (3.2% AF) identified the time-varying neural network as the optimal model. Compared with the best existing model, CHARGE-AF, it achieved an AUROC of 0.827 vs. 0.725 and a number needed to screen of 9 vs. 13 patients at 75% sensitivity. The optimal model confirmed known baseline risk factors (age, previous cardiovascular disease, antihypertensive medication usage) and identified additional time-varying predictors (proximity of cardiovascular events, body mass index (both levels and changes), pulse pressure, and the frequency of blood pressure measurements).
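To make the two reported metrics concrete, the following is a minimal sketch of how an AUROC and a number needed to screen (NNS) at a target sensitivity might be computed from predicted risk scores. It assumes that NNS is taken as the number of patients flagged per true AF case found (the reciprocal of the positive predictive value) at the lowest-yield threshold that still reaches the target sensitivity; the abstract does not spell out this definition. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUROC: the probability that a randomly chosen case
    scores higher than a randomly chosen non-case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def nns_at_sensitivity(
    labels: Sequence[int], scores: Sequence[float], target: float = 0.75
) -> Tuple[float, float]:
    """Lower the risk threshold until at least `target` of all cases are
    flagged, then return (threshold, flagged patients per true case).
    Assumption: NNS = 1 / PPV at that threshold."""
    total_cases = sum(labels)
    if total_cases == 0:
        raise ValueError("no cases in labels")
    for threshold in sorted(set(scores), reverse=True):
        flagged = [y for y, s in zip(labels, scores) if s >= threshold]
        true_pos = sum(flagged)
        if true_pos / total_cases >= target:
            return threshold, len(flagged) / true_pos
    raise ValueError("target sensitivity not reachable")


# Hypothetical toy cohort: 4 AF cases (label 1) and 8 non-cases (label 0).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.5, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]

toy_auroc = auroc(toy_labels, toy_scores)
toy_threshold, toy_nns = nns_at_sensitivity(toy_labels, toy_scores, target=0.75)
print(f"AUROC = {toy_auroc:.3f}")
print(f"threshold = {toy_threshold}, NNS = {toy_nns:.2f}")
```

Under this reading, a lower NNS at the same sensitivity means fewer patients must be screened to find each AF case, which is the sense in which 9 vs. 13 favours the neural network over CHARGE-AF.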


The optimal time-varying machine learning model exhibited greater predictive performance than existing AF risk models and reflected known and new patient risk factors for AF.

Keywords:

Atrial fibrillation – Blood pressure – Heart failure – Hypertension – Machine learning – Medical risk factors – Neural networks – Primary care


Published in journal issue: 2019, No. 11