Characterizing the clinical and genetic spectrum of polycystic ovary syndrome in electronic health records.
Polycystic ovary syndrome (PCOS) is one of the leading causes of infertility, yet current diagnostic criteria fail to identify patients whose symptoms fall outside their strict definitions. As a result, PCOS is underdiagnosed and its etiology is poorly understood. We aim to characterize the phenotypic spectrum of PCOS clinical features within and across racial and ethnic groups. We developed a strictly defined PCOS algorithm (PCOSkeyword-strict) using International Classification of Diseases, 9th and 10th edition (ICD9/10) codes and keywords mined from clinical notes in electronic health record (EHR) data. We then systematically relaxed the inclusion criteria to evaluate the change in epidemiological and genetic associations, resulting in three subsequent algorithms (PCOScoded-broad, PCOScoded-strict, PCOSkeyword-broad). We evaluated the performance of each phenotyping approach and characterized prominent clinical features observed in racially and ethnically diverse PCOS patients. The best performance came from the PCOScoded-strict algorithm, with a positive predictive value (PPV) of 98%. Individuals classified as cases by this algorithm had significantly higher body mass index (BMI), insulin levels, free testosterone values, and genetic risk scores for PCOS compared to controls. Median BMI was higher in African American females with PCOS than in White and Hispanic females with PCOS. PCOS symptoms are observed across a severity spectrum that parallels the continuous genetic liability to PCOS in the general population. Racial and ethnic group differences exist in PCOS symptomatology and metabolic health across different phenotyping strategies.
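
For readers unfamiliar with the performance metric quoted above, PPV is the fraction of algorithm-flagged cases that are confirmed as true cases on review: PPV = TP / (TP + FP). The following is a minimal sketch of that arithmetic only. The function name and the counts are hypothetical and are not taken from the study, which reports only the resulting percentage.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return PPV = TP / (TP + FP) for a set of algorithm-flagged cases."""
    if true_positives < 0 or false_positives < 0:
        raise ValueError("counts must be non-negative")
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no algorithm-flagged cases were reviewed")
    return true_positives / flagged


# Hypothetical review counts, chosen only to illustrate the calculation
# (not data from the study).
confirmed_cases = 49      # flagged by the algorithm and confirmed on review
unconfirmed_cases = 1     # flagged by the algorithm but not confirmed

ppv = positive_predictive_value(confirmed_cases, unconfirmed_cases)
print(f"PPV = {ppv:.0%}")
```

Because PPV conditions on being flagged, it says nothing about cases the algorithm misses; a strict algorithm can reach a high PPV while capturing fewer true cases than a broader one.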