Inferring disease severity in rheumatoid arthritis using predictive modeling in administrative claims databases

Authors: Urmila Chandran, Jenna Reps, Paul E. Stang, Patrick B. Ryan
Author affiliations: Janssen Research and Development, Titusville, New Jersey, United States of America
Published in: PLoS ONE 14(12)
Category: Research Article



Confounding by disease severity is a concern in pharmacoepidemiology studies of rheumatoid arthritis (RA) because sicker patients are channeled to certain therapies. To address the limited clinical data available for confounder adjustment, a patient-level prediction model that differentiates between patients prescribed and not prescribed advanced therapies was developed as a surrogate for disease severity, using all available data from a US claims database.


Data from adult RA patients were used to build regularized logistic regression models to predict current and future disease severity, using a biologic or tofacitinib prescription claim as a surrogate for moderate-to-severe disease. Model discrimination was assessed using the area under the receiver operating characteristic curve (AUC). The model was trained and tested in Optum Clinformatics® Extended DataMart (Optum) and additionally validated in three external IBM MarketScan® databases. It was further validated in the Optum database across a range of patient cohorts.
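As a rough illustration of the discrimination metric used here, the AUC can be read as the probability that a randomly chosen patient with the outcome (in this study, a biologic or tofacitinib prescription claim) receives a higher predicted score than a randomly chosen patient without it. The Python sketch below computes it by direct pairwise comparison. The function name `auc` and the toy scores and labels are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted scores and observed outcomes (1 = claim observed).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]

print(round(auc(toy_scores, toy_labels), 3))  # 0.889
```

An AUC of 0.5 corresponds to scores that carry no information about the outcome, and 1.0 to perfect separation. The pairwise loop is quadratic in cohort size, so a rank-based formulation would be preferable for cohorts of the size reported in this study.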


In the Optum database (n = 68,608), the AUC for discriminating RA patients with a prescription claim for a biologic or tofacitinib from those without such a claim in the 90 days following index diagnosis was 0.80. Model AUCs were 0.77 in IBM CCAE (n = 75,579) and IBM MDCD (n = 7,537) and 0.75 in IBM MDCR (n = 36,090). Discrimination changed little when the prediction model was assessed over the 730 days following index diagnosis (AUC in Optum: 0.79).


A prediction model demonstrated good discrimination across multiple claims databases in identifying RA patients with a prescription claim for advanced therapies during different time-at-risk periods, as a proxy for current and future moderate-to-severe disease. This work provides a robust model-derived risk score that can be used as a potential covariate and proxy measure to adjust for confounding by severity in multivariable models in the RA population. An R package to develop the prediction model and risk score is available on an open-source platform for researchers.
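To make the idea of a model-derived risk score concrete, the following minimal Python sketch fits a penalized logistic regression by gradient descent and returns a predicted probability for each patient. It is not the authors' implementation. The abstract does not state which penalty was used, so an L2 (ridge) penalty is assumed here purely for simplicity. The two binary covariates, the toy data, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_l2(X, y, lam=0.1, lr=0.5, epochs=2000):
    """Batch gradient descent on mean log-loss plus (lam / 2) * ||w||^2.

    The intercept b is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * (gw / n + lam * wj) for wj, gw in zip(w, grad_w)]
    return w, b


def risk_score(w, b, x):
    # Predicted probability of the surrogate outcome for one patient.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical toy cohort: two made-up binary covariates per patient,
# outcome 1 = advanced-therapy claim observed during time-at-risk.
X = [[1, 1], [1, 0], [1, 1], [0, 1], [0, 0], [0, 0], [1, 0], [0, 1]]
y = [1, 1, 1, 0, 0, 0, 0, 1]

w, b = fit_logistic_l2(X, y)
high = risk_score(w, b, [1, 1])
low = risk_score(w, b, [0, 0])
print(high > low)
```

The score computed for each patient could then be entered as a single covariate in a separate outcome model, which is the use proposed above.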

Keywords:

Database and informatics methods – Drug administration – Exercise therapy – Health insurance – Immunology – Medicare – Rheumatoid arthritis


Published in: 2019, Issue 12