Identifying early pulmonary arterial hypertension biomarkers in systemic sclerosis: Machine learning on proteomics from the DETECT cohort.
Pulmonary arterial hypertension (PAH) is a devastating complication of systemic sclerosis (SSc). Screening for PAH in SSc has increased detection, allowed earlier treatment, and improved patient outcomes. Blood-based biomarkers that reliably identify SSc patients at risk of PAH, or with early disease, would significantly improve screening, potentially leading to improved survival, and provide novel mechanistic insights into early disease. The main objective of this study was to identify a proteomic biomarker signature that could discriminate SSc patients with and without PAH using a machine learning approach, and to validate the findings in an external cohort. Serum samples from patients with SSc and PAH (n=77) and SSc without pulmonary hypertension (non-PH, n=80) were randomly selected from the clinical DETECT study and underwent proteomic screening using the MYRIAD RBM discovery platform consisting of 313 proteins. Samples from an independent validation cohort (SSc-PAH, n=22 and non-PH, n=22) were obtained from the University of Sheffield, UK. Random Forest (RF) analysis identified a novel panel of eight proteins, comprising Collagen IV, Endostatin, IGFBP-2, IGFBP-7, MMP-2, Neuropilin-1, NT-proBNP and RAGE, that discriminated PAH from non-PH in SSc patients in the DETECT discovery cohort (average area under the ROC curve (ROC-AUC) of 0.741, 65.1% sensitivity / 69.0% specificity). This discrimination was reproduced in the Sheffield cohort (81.1% accuracy, 77.3% sensitivity / 86.5% specificity). This novel 8-protein biomarker panel has the potential to improve early detection of PAH in SSc patients and may provide novel insights into the pathogenesis of PAH in the context of SSc.
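For readers less familiar with the performance measures quoted above, the following is a minimal sketch of how sensitivity, specificity, accuracy and ROC-AUC are conventionally defined. It is not the study's analysis code: the function names `classification_metrics` and `roc_auc` are hypothetical, the counts and scores are invented for illustration only, and the ROC-AUC is computed with the standard pairwise-ranking definition rather than any particular library.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: PAH patients classified correctly / incorrectly.
    tn/fp: non-PH patients classified correctly / incorrectly.
    """
    sensitivity = tp / (tp + fn)            # fraction of PAH cases detected
    specificity = tn / (tn + fp)            # fraction of non-PH cases ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def roc_auc(scores_pos, scores_neg):
    """ROC-AUC as the probability that a randomly chosen positive case
    receives a higher classifier score than a randomly chosen negative case
    (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts and scores for illustration only; NOT data from the study.
sens, spec, acc = classification_metrics(tp=8, fn=2, tn=7, fp=3)
auc = roc_auc(scores_pos=[0.9, 0.8], scores_neg=[0.1, 0.85])
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%} auc={auc:.3f}")
```

Sensitivity and specificity depend on the chosen classification threshold, whereas ROC-AUC summarizes discrimination across all thresholds, which is why both kinds of measure are reported together.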