Characterization of Parkinson’s disease using blood-based biomarkers: A multicohort proteomic analysis

Authors: Marijan Posavi ^aff001; Maria Diaz-Ortiz ^aff001; Benjamine Liu ^aff001; Christine R. Swanson ^aff001; R. Tyler Skrinak ^aff001; Pilar Hernandez-Con ^aff001; Defne A. Amado ^aff001; Michelle Fullard ^aff001; Jacqueline Rick ^aff001; Andrew Siderowf ^aff001; Daniel Weintraub ^aff003; Leo McCluskey ^aff001; John Q. Trojanowski ^aff004; Richard B. Dewey, Jr ^aff005; Xuemei Huang ^aff006; Alice S. Chen-Plotkin ^aff001
Authors place of work: Department of Neurology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America ^aff001; National Institute of Neurological Disease and Stroke, National Institutes of Health, Bethesda, Maryland, United States of America ^aff002; Department of Psychiatry, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America ^aff003; Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America ^aff004; Department of Neurology and Neurotherapeutics, Clinical Center for Movement Disorders at the University of Texas Southwestern Medical Center, Dallas, Texas, United States of America ^aff005; Department of Neurology, Penn State College of Medicine, Hershey, Pennsylvania, United States of America ^aff006
Published in the journal: Characterization of Parkinson’s disease using blood-based biomarkers: A multicohort proteomic analysis. PLoS Med 16(10): e1002931. doi:10.1371/journal.pmed.1002931
Category: Research Article
doi: https://doi.org/10.1371/journal.pmed.1002931

Summary

Background

Parkinson’s disease (PD) is a progressive neurodegenerative disease affecting about 5 million people worldwide with no disease-modifying therapies. We sought blood-based biomarkers in order to provide molecular characterization of individuals with PD for diagnostic confirmation and prediction of progression.

Methods and findings

In 141 plasma samples (96 PD, 45 neurologically normal control [NC] individuals; 45.4% female, mean age 70.0 years) from a longitudinally followed Discovery Cohort based at the University of Pennsylvania (UPenn), we measured levels of 1,129 proteins using an aptamer-based platform. We modeled protein plasma concentration (log₁₀ of relative fluorescence units [RFUs]) as the effect of disease group (PD versus NC), age at plasma collection, sex, and the levodopa equivalent daily dose (LEDD), deriving first-pass candidate protein biomarkers based on p-value for PD versus NC. These candidate proteins were then ranked by Stability Selection. We confirmed findings from our Discovery Cohort in a Replication Cohort of 317 individuals (215 PD, 102 NC; 47.9% female, mean age 66.7 years) from the multisite, longitudinally followed National Institute of Neurological Disorders and Stroke Parkinson’s Disease Biomarker Program (PDBP) Cohort. The analytical approach in the Replication Cohort mirrored the approach in the Discovery Cohort: each protein plasma concentration (log₁₀ of RFU) was modeled as the effect of group (PD versus NC), age at plasma collection, sex, clinical site, and batch. Of the top 10 proteins from the Discovery Cohort ranked by Stability Selection, four associations were replicated in the Replication Cohort. These blood-based biomarkers were bone sialoprotein (BSP, Discovery false discovery rate [FDR]-corrected p = 2.82 × 10⁻², Replication FDR-corrected p = 1.03 × 10⁻⁴), osteomodulin (OMD, Discovery FDR-corrected p = 2.14 × 10⁻², Replication FDR-corrected p = 9.14 × 10⁻⁵), aminoacylase-1 (ACY1, Discovery FDR-corrected p = 1.86 × 10⁻³, Replication FDR-corrected p = 2.18 × 10⁻²), and growth hormone receptor (GHR, Discovery FDR-corrected p = 3.49 × 10⁻⁴, Replication FDR-corrected p = 2.97 × 10⁻³). 
Measures of these proteins were not significantly affected by differences in sample handling, and they did not change comparing plasma samples from 10 PD participants sampled both on versus off dopaminergic medication. Plasma measures of OMD, ACY1, and GHR differed in PD versus NC but did not differ between individuals with amyotrophic lateral sclerosis (ALS, n = 59) versus NC. In the Discovery Cohort, individuals with baseline levels of GHR and ACY1 in the lowest tertile were more likely to progress to mild cognitive impairment (MCI) or dementia in Cox proportional hazards analyses adjusting for age, sex, and disease duration (hazard ratio [HR] 2.27 [95% CI 1.04–5.0, p = 0.04] for GHR, and HR 3.0 [95% CI 1.24–7.0, p = 0.014] for ACY1). GHR’s association with cognitive decline was confirmed in the Replication Cohort (HR 3.6 [95% CI 1.20–11.1, p = 0.02]). The main limitations of this study were its reliance on the aptamer-based platform for protein measurement and limited follow-up time available for some cohorts.

Conclusions

In this study, we found that the blood-based biomarkers BSP, OMD, ACY1, and GHR robustly associated with PD across multiple clinical sites. Our findings suggest that biomarkers based on a peripheral blood sample may be developed for both disease characterization and prediction of future disease progression in PD.

Keywords:

Blood plasma – Cognitive impairment – Dementia – Biomarkers – Plasma proteins – Dopaminergics – Parkinson disease – Amyotrophic lateral sclerosis

Introduction

Parkinson’s disease (PD) is characterized by progressive loss of dopaminergic neurons in the substantia nigra, resulting in a clinical syndrome defined by bradykinesia, rigidity, tremor, and postural instability [1]. By the time a clinical diagnosis is made, 50% of nigral dopaminergic neurons may already be lost [2], suggesting a long prodromal phase during which intervention may be possible. Current medical practice for the diagnosis of PD relies almost entirely on clinical examination, with no laboratory-based testing available. Although a United States Food and Drug Administration (FDA)-approved, radioligand-based dopamine transporter imaging test (DaTSCAN) can confirm degeneration of dopaminergic neurons [3], time and expense have prevented its widespread adoption in clinical settings, and a positive result is not diagnostic for PD, because other degenerative neurologic diseases such as multiple system atrophy exhibit similar findings. Moreover, even within PD, considerable heterogeneity in clinical presentation exists, with highly variable rates of both cognitive and motor progression over time [4]. At present, no clinical or research-based tests exist to predict PD disease progression, despite widespread recognition that such predictive tools are vital to the field [5]. Thus, the advent of blood-based markers to molecularly define PD individuals and to predict longitudinal progression in PD could transform clinical practice and the development of disease-modifying therapies.

To date, biomarker studies in PD have largely focused on candidate approaches, with an emphasis on protein measures obtained in the cerebrospinal fluid (CSF) [6,7], which is considerably more difficult to obtain than blood in a busy clinical setting. Although a handful of biomarkers nominated using these candidate approaches consistently differ comparing PD and control individuals (e.g., CSF measures of total alpha-synuclein [8]), individual marker effect sizes are small, and the scarcity of robust biomarkers limits the ability to develop multimarker panels for better discriminatory power. Moreover, the field largely lacks protein biomarkers that predict cognitive or motor progression across multiple cohorts [7]. Thus, we aimed to discover novel blood-based biomarkers for differentiation of individuals with PD from control individuals, as well as prediction of rate of PD progression. We approached this problem by screening >1,000 plasma proteins using an aptamer-based platform [9] in a discovery–replication design.

Methods

To identify biomarkers that might characterize individuals with PD, 1,129 (Discovery Cohort) and 1,305 (Parkinson’s Disease Biomarker Program [PDBP] Replication Cohort) plasma proteins were screened from 527 individuals, and their clinical data were analyzed (Fig 1). The clinical data and plasma samples were acquired from the University of Pennsylvania (UPenn) Udall Cohort (Discovery Cohort) and from a multisite PDBP Replication Cohort (Table 1). The study consisted of three major steps: (1) differentiation of PD from neurologically normal control (NC) participants (96 PD, 45 NC) in a single-site Discovery Cohort, using levels of 968 proteins that passed quality control (QC) metrics, obtained by an aptamer-based platform assay; (2) independent replication of the top biomarker candidates from step 1 in a multisite Replication Cohort (215 PD, 102 NC participants); and (3) prediction of PD progression using the biomarker candidates that replicated across the single-site Discovery Cohort and multisite Replication Cohort. These steps are summarized in Fig 1, parts of which were created with Biorender.com. The data were normalized, processed, and analyzed using the statistical software R [10]. This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines (S1 STROBE Checklist). At study outset, the analysis plan (S1 Analysis Plan) was to flexibly investigate the Discovery Cohort and then perform an analysis mirroring the Discovery Cohort analysis in the Replication Cohort, with the two additional covariates of clinical site and batch, if site or batch effects were observed in the transition from a single-site/single-batch discovery phase to a multisite/multibatch replication phase. 
The ultimate analysis plan only differed from the predetermined plan in that the levodopa equivalent daily dose (LEDD) was not included as a covariate in the Replication Cohort when we found that not all PDBP PD participants had available LEDD data. Detailed methods are described below.

**Tab. 1. Demographic characteristics of study participants.**

Cohorts and sample collection

UPenn Udall Discovery Cohort

During the period between 2013 and 2015, blood plasma samples and clinical data were collected from 97 PD individuals and 45 NCs enrolled to participate in research approved by the UPenn Institutional Review Board (IRB). All PD individuals met the diagnostic criteria of the United Kingdom Parkinson’s Disease Brain Bank [9] and were part of a longitudinal, extensively characterized cohort at UPenn [11]. In order to control for environmental biases, we ensured that PD and control groups did not differ by age or sex, and NCs were recruited primarily from the unaffected spouses of PD individuals from the same clinic. Samples were acquired according to IRB-approved protocols as previously described by Chen-Plotkin and colleagues [12]. Written informed consent was obtained at study enrollment. One PD sample had outlier values in preprocessing and normalization steps on the aptamer-based assay and was excluded from further analyses. A total of 96/97 PD and all NC samples passed our QC criteria (see Preprocessing and QC of SOMAScan protein data for description of preprocessing and normalization) and were included in subsequent analyses.

Multisite PDBP Replication Cohort

Results were tested in a Replication Cohort in order to account for possible environmental and technical biases in our analysis. The Replication Cohort blood samples (collected in the period between 2013 and 2015) and clinical data were obtained from the PDBP Cohort [13], originating from research participants seen at two PDBP sites: Penn State University (Penn State, 100 PD and 78 NC) and the University of Texas Southwestern Medical Center (UTSW, 115 PD and 24 NC). One PD participant from UTSW and two NC participants from Penn State were excluded from analyses because of outlier measurements for a high proportion of SOMAScan proteins. Longitudinal follow-up with cognitive testing by the Montreal Cognitive Assessment (MoCA) was additionally obtained, and associations with biomarker levels analyzed, for UTSW PD participants. Each PDBP Center’s local IRB approved study protocols, and all participants provided informed consent for the study.

Ethics statement for human participant research

For the Discovery Cohort, the IRB of UPenn approved the human participant research in this study. Written informed consent was obtained from Discovery Cohort participants. For the Replication Cohort, each PDBP Center’s local IRB approved study protocols, and all participants provided written informed consent for participation in PDBP. As one goal of the PDBP is to provide a biorepository of samples from a well-characterized set of individuals, participants consented to sharing of samples and deidentified data with investigators approved by the Biospecimen Review Access Committee at the time of enrollment.

Pooled reference samples

To investigate the effect of plasma handling on detected biomarker level, multiple identical aliquots of pooled plasma samples from the Discovery (UPenn reference pool) and Replication Cohorts (PDBP reference pool) were used. The UPenn reference pool was prepared by mixing samples from 450 PD and NC individuals. The PDBP reference pool included plasma samples from 13 PD and NC individuals. For these studies, one aliquot was assayed by SOMAScan directly, whereas an identical aliquot was subjected to incubation at room temperature for 30 minutes, followed by an extra freeze–thaw cycle, before assaying by SOMAScan.

BioFIND Cohort samples

The effect of levodopa therapy on plasma protein concentration was tested on 10 randomly selected PD individuals (5 females and 5 males above 50 years of age) from the BioFIND Cohort (Table 1) [14]. Plasma samples from the BioFIND Cohort were collected at two different visits: at baseline (when PD individuals had samples collected while taking their usual dopaminergic medication—i.e., ON medication) and 2 weeks after the baseline visit (when PD individuals had samples collected after an overnight washout of dopaminergic medication—i.e., OFF medication), as described by Kang and colleagues [14]. All study protocols and recruitment strategies for BioFIND were approved by the IRBs for the University of Rochester Clinical Trials Coordination Center (CTCC) and individual clinical sites.

Protein quantification

Samples from the Discovery and Replication Cohorts were assayed using the 1.1k and 1.3k Assay versions of the SOMAScan platform (Somalogic, Boulder, CO, USA [9]) in two separate runs, with operators blinded to disease status. This platform is based on protein-capture slow off-rate modified aptamers (SOMAmers), which are chemically modified oligonucleotides with specific affinity to recombinant protein targets, developed by in vitro selection (SELEX) as previously described [15,16].

The specific steps of the SOMAScan assay have been described in detail in prior publications [9,17,18], as well as technical white papers at www.somalogic.com. In brief, plasma samples were incubated with reagent mixes containing SOMAmers to allow for equilibrium binding of fluorophore-tagged aptamers to their protein targets. Next, a series of partitioning and washing steps were used to capture only SOMAmers that were bound to their cognate proteins. Finally, the protein-bound oligonucleotides were released from the protein complex, captured by complementarity, and quantified using DNA hybridization arrays.

To adjust for technical biases, the hybridization arrays were normalized and calibrated using data from a reference set of pooled plasma samples that was run with each batch. Raw Somalogic data in relative fluorescence units (RFUs) were log₁₀-transformed prior to analysis. A total of 142 plasma samples (97 PD and 45 NC) from the Discovery Cohort were assayed for 1,129 proteins (1.1k Assay), and 320 plasma samples (216 PD and 104 NC) from the Replication Cohort were assayed for 1,305 proteins (1.3k Assay). The Replication Cohort samples were assayed in batches of 85, distributed in five different plates, along with plasma reference pool samples, 59 amyotrophic lateral sclerosis (ALS) samples from the UPenn biorepository, and 20 samples from the BioFIND biorepository.

Preprocessing and QC of SOMAScan protein data

Plasma samples from the Discovery and Replication Cohorts were assayed in separate Somalogic runs. Discovery Cohort plasma samples were analyzed, along with 13 plasma calibrator samples, hybridization controls, and two buffer control samples. The reference pool samples (n = 8), Penn Udall ALS samples (n = 59), and BioFIND samples (n = 20 from 10 individuals with PD) were assayed along with plasma samples from the Replication Cohort.

Run QC standards were derived from metrics obtained during assay development, and preprocessing and normalization methods are described in detail in a technical white paper [19]. In brief, sample data were first normalized to eliminate hybridization artifacts, using “spiked in” hybridization controls. Median normalization was subsequently applied for each sample to remove other intraplate biases. For the SOMAScan assay, the hybridization control and median scale factors are expected to be in the range of 0.4–2.5 (±1.32 on log₂ scale). All samples had hybridization scale factors in the acceptable range, except one PD sample from the Discovery Cohort, which was excluded from downstream analyses.
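The acceptance window for scale factors can be expressed as a simple check. The sketch below is illustrative Python (the study itself used R and the vendor's pipeline); the function names and the toy scale factors are hypothetical.

```python
import math

# Acceptance window for hybridization-control and median scale factors,
# as stated in the text (0.4 to 2.5).
SCALE_MIN, SCALE_MAX = 0.4, 2.5

def scale_factor_ok(scale_factor):
    """Return True if a scale factor lies inside the acceptance window."""
    return SCALE_MIN <= scale_factor <= SCALE_MAX

def flag_samples(scale_factors):
    """Return the IDs of samples whose scale factor is out of range."""
    return [sid for sid, sf in scale_factors.items() if not scale_factor_ok(sf)]

# The window is symmetric on the log2 scale: log2(2.5) = -log2(0.4), about 1.32.
log2_bounds = (math.log2(SCALE_MIN), math.log2(SCALE_MAX))

# Hypothetical scale factors for three samples; only S3 falls outside the window.
toy = {"S1": 0.95, "S2": 1.40, "S3": 2.90}
flagged = flag_samples(toy)
```

A sample flagged this way would be dropped before any protein-level filtering, as was done for the one excluded Discovery Cohort PD sample.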

In the next step, two QC criteria were implemented to filter SOMAScan protein data. The overall intraplate technical variability of SOMAScan assay was assessed by using three QC samples (identical aliquots from three different reference sample pools) run in triplicate (for a total of nine samples). These three sets of triplicates were placed randomly within the batches of biological samples in order to capture intraplate variability. The coefficients of variation (CVs) from these three sets of triplicate QC samples were calculated (i.e., for each protein, three CVs were calculated) using the raw Somalogic data (in RFUs). Proteins showing CVs greater than 0.2 from any one of the triplicates were excluded from downstream analyses. There were 36 and 33 proteins in the Discovery and Replication Cohorts, respectively, with CV > 0.2 in at least one of three runs (S1 Fig). The second filter, which involved removing proteins with >25% measurements outside of the lower limit of detection (LLOD) or upper limit of detection (ULOD), was applied to the Discovery Cohort only, as limits of detection were not provided for subsequent versions of the SOMAScan. This resulted in elimination of an additional 125 proteins (S1A Fig), leaving a total of 968 Discovery Cohort proteins for downstream analyses.
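The two protein-level filters can be sketched as follows. This is an illustrative Python version, not the authors' R code; the helper names and toy RFU values are hypothetical, and the use of the sample standard deviation in the CV is an assumption, since the text does not specify the estimator.

```python
import statistics

CV_MAX = 0.2             # maximum CV allowed in any triplicate set
LOD_FRACTION_MAX = 0.25  # maximum fraction of measurements outside detection limits

def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def passes_cv_filter(triplicate_sets):
    """Keep a protein only if every triplicate set has CV <= CV_MAX."""
    return all(cv(rep) <= CV_MAX for rep in triplicate_sets)

def passes_lod_filter(values, llod, ulod):
    """Keep a protein only if at most 25% of measurements fall outside [llod, ulod]."""
    outside = sum(1 for v in values if v < llod or v > ulod)
    return outside / len(values) <= LOD_FRACTION_MAX

# Hypothetical raw RFU values for two proteins: three QC pools, each in triplicate.
stable = [[1000, 1050, 980], [2000, 2100, 1950], [500, 520, 510]]
noisy = [[1000, 1050, 980], [2000, 3200, 1100], [500, 520, 510]]

# Protein counts reported in the text for the Discovery Cohort.
remaining = 1129 - 36 - 125
```

The final line reproduces the bookkeeping in the text: 1,129 assayed proteins minus 36 (CV filter) minus 125 (detection-limit filter) leaves 968.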

Data processing and cross-sectional statistical analyses

Nomination of proteins that differed in PD versus NC group

To detect proteins whose plasma concentration associated significantly with disease category (PD versus NC), multiple linear regression models were employed. In the Discovery Cohort, each protein plasma concentration (log₁₀ of RFU) was modeled as the effect of disease group (PD versus NC), age at plasma collection, sex, and the LEDD. A total of 140 candidate biomarkers with p-value of group effect < 0.005 (PD versus NC) were nominated (S1 Table) for downstream analyses (hierarchical clustering and Stability Selection) from the Discovery Cohort. False discovery rate (FDR)-corrected p-values were also derived for these biomarkers using the Benjamini-Hochberg method [20].
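The per-protein model can be illustrated with an ordinary least-squares fit. The sketch below is plain Python for illustration only (the analysis was done in R); the data are invented and noise-free, LEDD is expressed in hypothetical units of 100 mg/day, and only point estimates are computed. Obtaining the group-effect p-value would additionally require the t distribution, which R's regression routines provide.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients for y ~ intercept + covariates (normal equations)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical covariates: (group: 1 = PD, 0 = NC; age; sex; LEDD in 100 mg/day).
rows = [
    (1, 70, 0, 6.0), (1, 65, 1, 3.0), (0, 72, 1, 0.0), (0, 60, 0, 0.0),
    (1, 58, 1, 4.5), (0, 68, 0, 0.0), (1, 75, 0, 8.0), (0, 63, 1, 0.0),
]
# Noise-free log10(RFU) values generated from known coefficients, so that the
# fit should recover them: intercept 3.0, group 0.10, age -0.002, sex 0.05, LEDD 0.01.
y = [3.0 + 0.10 * g - 0.002 * a + 0.05 * s + 0.01 * l for g, a, s, l in rows]

beta = ols(rows, y)
group_effect = beta[1]  # coefficient for PD versus NC
```

In the study this fit is repeated once per protein, and the p-value of the group coefficient is what nominates a candidate.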

Hierarchical clustering and heatmap generation

A heatmap was generated using the function heatmap.2 from the R package gplots [21]. Raw Somalogic data (RFUs) were log-transformed and then centered and scaled. Both participants and proteins were hierarchically clustered by Euclidean distance and average linkage using the hclust function [10].
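For readers unfamiliar with the clustering settings, the following is a minimal Python sketch of centering and scaling followed by average-linkage clustering on Euclidean distances. It is a naive illustration, not the hclust implementation; the function names and the toy profiles are hypothetical.

```python
import math
import statistics

def zscore(values):
    """Center a vector to mean 0 and scale it to unit sample standard deviation."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(points):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean pairwise distance between the two clusters.
                pair_dists = [euclidean(points[a], points[b])
                              for a in clusters[i] for b in clusters[j]]
                d = statistics.mean(pair_dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical log-transformed profiles of four proteins across five samples.
profiles = [
    [2.1, 2.3, 2.2, 2.9, 3.0],
    [3.1, 3.3, 3.2, 3.9, 4.1],
    [4.0, 3.1, 3.9, 3.0, 3.2],
    [1.9, 1.0, 2.0, 1.1, 1.2],
]
scaled = [zscore(p) for p in profiles]
protein_merges = average_linkage(scaled)
```

Because each profile is centered and scaled first, proteins cluster by the shape of their profile across samples rather than by absolute signal level.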

Stability selection ranking

We performed Stability Selection [22] (variable selection based on subsampling in combination with least absolute shrinkage and selection operator [LASSO] [23]) on Discovery Cohort data (96 PD, 45 NC). To rank candidate biomarkers, we used the R package BioMark across 100,000 jackknifed iterations [24]. At each iteration, 30% of the proteins and 10% of the samples were left out of the bag, and LASSO was used to select features from the remaining data. The proportion of iterations in which LASSO reported a nonzero coefficient was used to rank the proteins, generating a list of the top 10 proteins for evaluation in the Replication Cohort.
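The subsample-then-LASSO loop can be sketched in a few lines. This is a simplified Python illustration and not the BioMark implementation: it uses a linear-response LASSO on a 0/1 group label fitted by coordinate descent, a fixed penalty `lam`, far fewer iterations than the study, and it counts a left-out feature as "not selected" in that iteration. All of these choices, the function names, and the toy data are assumptions.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso(cols, y, lam, sweeps=50):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1 on centered data."""
    n = len(y)
    norms = [sum(c * c for c in col) / n for col in cols]
    beta = [0.0] * len(cols)
    resid = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            if norms[j] == 0.0:
                continue
            rho = sum(c * r for c, r in zip(col, resid)) / n + norms[j] * beta[j]
            new = soft_threshold(rho, lam) / norms[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta

def stability_selection(X, y, lam, n_iter, feat_drop=0.3, samp_drop=0.1, seed=0):
    """Fraction of iterations in which each feature receives a nonzero coefficient."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    hits = [0] * p
    for _ in range(n_iter):
        # Leave out 10% of samples and 30% of features, as described in the text.
        rows = rng.sample(range(n), round(n * (1 - samp_drop)))
        feats = rng.sample(range(p), round(p * (1 - feat_drop)))
        cols = [center([X[i][j] for i in rows]) for j in feats]
        beta = lasso(cols, center([y[i] for i in rows]), lam)
        for j, b in zip(feats, beta):
            hits[j] += b != 0.0
    return [h / n_iter for h in hits]

# Toy data (hypothetical): 40 samples, 10 features; only feature 0 tracks the label.
rng = random.Random(1)
y = [0.0] * 20 + [1.0] * 20
X = [[2 * yi + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(9)] for yi in y]
freq = stability_selection(X, y, lam=0.25, n_iter=100)
ranking = sorted(range(len(freq)), key=lambda j: -freq[j])
```

Sorting features by selection frequency gives the ranking; in the study, the ten highest-ranked proteins were carried forward to the Replication Cohort.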

Replication Cohort analyses

In the Replication Cohort, each protein plasma concentration (log₁₀ of RFU) was modeled as the effect of group (PD versus NC), age at plasma collection, sex, clinical site (UTSW versus Penn State), and batch (five plates). LEDD was not included as a covariate in the Replication Cohort when we found that not all PDBP PD participants had available LEDD data in the PDBP Data Management Resource. Analyses focused on the top 10 proteins from the Discovery Cohort, as ranked by Stability Selection; p-values were corrected for multiple hypothesis testing using the Benjamini-Hochberg method [20]. For the four validated proteins, a similar analysis was repeated including 59 ALS participants. Protein plasma concentration (log₁₀ of RFU) was modeled as the effect of disease group (NC, PD, or ALS), age at plasma collection, sex, and clinical site. Disease group coefficients were extracted, and p-values were adjusted using the Benjamini-Hochberg method.
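The Benjamini-Hochberg adjustment applied to the ten replication tests can be written compactly. The sketch below is a plain-Python version intended to match the standard step-up procedure (the study used R); the ten p-values are invented for illustration and are not the study's results.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical nominal p-values for ten candidate proteins.
nominal = [0.00001, 0.00002, 0.0009, 0.008, 0.06, 0.11, 0.24, 0.37, 0.52, 0.81]
fdr = benjamini_hochberg(nominal)
replicated = [i for i, q in enumerate(fdr) if q < 0.05]
```

With these invented inputs, the four smallest p-values remain below 0.05 after adjustment, which mirrors the shape (though not the values) of the replication result.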

Testing the effects of levodopa therapy

To test the effect of levodopa therapy on plasma protein levels (log₁₀ of RFU), a paired t test was applied to each of the proteins assayed. Nominal (unadjusted) paired t test p-values are presented in S5 Table. In addition, analyses were repeated, and p-values were obtained by paired permutation testing, which avoids assumptions of normality. Results were unchanged (S2 Fig).
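A paired permutation test can be carried out exactly for ten participants by enumerating all sign flips of the within-person differences. The Python sketch below assumes a sign-flip scheme with the absolute sum of differences as the test statistic; the text does not state which permutation scheme was used, and the ON/OFF values are invented.

```python
from itertools import product

def paired_permutation_pvalue(on, off):
    """Exact two-sided sign-flip permutation p-value for paired measurements."""
    diffs = [a - b for a, b in zip(on, off)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    # Under the null, each within-person difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:
            count += 1
    return count / total

# Hypothetical log10(RFU) values for one protein in 10 participants, ON and OFF medication.
on_med = [3.10, 3.25, 2.98, 3.40, 3.05, 3.18, 3.30, 2.95, 3.22, 3.12]
off_med = [3.12, 3.20, 3.01, 3.38, 3.09, 3.15, 3.33, 2.96, 3.19, 3.14]
p_value = paired_permutation_pvalue(on_med, off_med)
```

With 10 pairs there are 2^10 = 1,024 sign patterns, so the smallest attainable two-sided p-value is 2/1,024, which is worth keeping in mind when interpreting results from a cohort of this size.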

Associations with cognition

Spearman’s rank-order correlation was calculated for baseline Mattis Dementia Rating Scale-2 (DRS) and biomarker (log₁₀ of RFU) levels using the R function cor.test [10]. p-Values were adjusted for FDR by the Benjamini-Hochberg method using the R function p.adjust [10].
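Spearman's correlation is the Pearson correlation of ranks, which the short Python sketch below makes explicit. It is an illustration rather than a reimplementation of cor.test (no p-value is computed), and the toy DRS and biomarker values are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical baseline DRS scores and log10(RFU) biomarker levels.
drs = [140, 138, 142, 135, 139, 141]
biomarker = [3.2, 3.0, 3.1, 3.4, 2.9, 3.3]
rho = spearman(drs, biomarker)
```

Working on ranks makes the statistic insensitive to the skew typical of cognitive scores that sit near the ceiling of the scale.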

PD progression analysis

Linear mixed-effects model analysis

Linear mixed-effects models were fitted to determine the effect of biomarker level on the rate of cognitive decline using the R package nlme [25]. In the Discovery Cohort, only participants with DRS scores measured within 6 months of the blood draw as well as at least one subsequent DRS score were included (n = 91), for an average follow-up period of 3.5 years. Our model incorporated DRS as the response variable, with age, sex, disease duration, baseline DRS, and the time-by-protein interaction as fixed effects and participant as a random effect. The same analysis was repeated adjusting for years of education as an additional fixed effect. The time-by-protein interaction coefficients were extracted, and the p-values for the interaction term were adjusted for FDR by the Benjamini-Hochberg method using the R function p.adjust [10].
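One way to write the model described above is shown below. This is a sketch under stated assumptions: the text lists the covariates and the time-by-protein interaction but not the full formula, so the main effects of time and protein and the random-intercept-only structure are assumptions, and the symbols are introduced here for illustration. Index i denotes the participant and j the visit; P_i is the baseline log₁₀ RFU level of the protein.

```latex
\mathrm{DRS}_{ij} = \beta_0
  + \beta_1\,\mathrm{age}_i
  + \beta_2\,\mathrm{sex}_i
  + \beta_3\,\mathrm{duration}_i
  + \beta_4\,\mathrm{DRS}_{i0}
  + \beta_5\,t_{ij}
  + \beta_6\,P_i
  + \beta_7\,(t_{ij} \times P_i)
  + b_i + \varepsilon_{ij},
\qquad
b_i \sim \mathcal{N}(0, \sigma_b^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under this reading, the interaction coefficient β₇ is the quantity of interest: it describes how the rate of change in DRS over time differs per unit of baseline protein level.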

Survival analysis

In both the Discovery and Replication Cohorts, individuals were divided into low-, medium-, or high-biomarker groups based on (log₁₀ of RFU) biomarker tertiles. For the Discovery Cohort, we extracted cognitive diagnoses based on clinical consensus diagnosis as previously described [11], and PD individuals with either a cognitive diagnosis of dementia at baseline or a diagnosis coded as normal following a baseline diagnosis of mild cognitive impairment (MCI) were excluded from the analysis, leaving 86 individuals for survival analysis. Events were defined as conversion from normal to MCI, normal to dementia, or MCI to dementia, for a total of 38 events. For the Replication Cohort, cognitive categorization was based on published MoCA norms (MoCA 26–30 = normal, MoCA 21–25 = MCI, and MoCA 20 or less = dementia) [26], and events were defined and participants with PD filtered in the same way as for the Discovery Cohort, for a total of 26 events observed in 74 individuals (followed for an average of 2.8 years). Survival analysis was carried out in R using the survival [27] package. Cox proportional hazards analyses were performed using the function coxph [27] to test whether biomarker tertile groups have an effect on the likelihood of an event, adjusting for age, sex, and disease duration. Analyses were repeated including years of education as an additional covariate. Results were visualized as Cox regression–adjusted curves or forest plots using the ggadjustedcurves and ggforest functions from the R survminer package [28]. Models with and without education were compared using the anova.coxph function from the R survival package [27].
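The bookkeeping that precedes the Cox model (tertile assignment, MoCA-based categories, exclusions, and event definition) can be sketched as follows. This is illustrative Python with hypothetical function names and toy inputs; the rank-based tertile split is an assumption, since the text does not state how ties or uneven group sizes were handled, and the Cox fit itself is not shown.

```python
SEVERITY = {"normal": 0, "MCI": 1, "dementia": 2}

def moca_category(score):
    """Map a MoCA score to a cognitive category using the cutoffs quoted in the text."""
    if score >= 26:
        return "normal"
    if score >= 21:
        return "MCI"
    return "dementia"

def tertile_groups(values):
    """Label each value low, medium, or high by rank-based tertiles."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        labels[i] = ("low", "medium", "high")[min(2, rank * 3 // n)]
    return labels

def eligible(categories):
    """Exclude dementia at baseline, or MCI at baseline that is later coded normal."""
    baseline = categories[0]
    if baseline == "dementia":
        return False
    if baseline == "MCI" and "normal" in categories[1:]:
        return False
    return True

def first_event(categories):
    """Index of the first visit that is worse than baseline, or None if censored."""
    base = SEVERITY[categories[0]]
    for visit, cat in enumerate(categories[1:], start=1):
        if SEVERITY[cat] > base:
            return visit
    return None

# Hypothetical participant: MoCA scores at successive visits.
scores = [27, 26, 24, 23]
trajectory = [moca_category(s) for s in scores]
event_visit = first_event(trajectory) if eligible(trajectory) else None

# Hypothetical baseline biomarker levels (log10 RFU) for six participants.
groups = tertile_groups([3.1, 2.0, 2.5, 3.4, 2.2, 2.9])
```

The tertile label and the time to the first event (or to the last visit, if censored) are the inputs a Cox proportional hazards routine would then take alongside age, sex, and disease duration.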

Results

Discovery screen for plasma proteins differentiating PD from NC

From the original set of 1,129 proteins assayed in the single-site Discovery Cohort, 968 (85.7%) met QC standards (S1 Fig and S1 Table). These 968 proteins were retained for downstream analyses.

Proteins differentiating PD from NC samples in the Discovery Cohort were nominated using a linear model associating concentration of each of these 968 proteins with disease state, adjusting for LEDD [29], age at plasma draw, and sex, generating an initial candidate list of 140 biomarkers associated at a nominal p-value < 0.005 with PD; correction for multiple hypothesis testing using the Benjamini-Hochberg method [20] demonstrated that all 140 proteins met the additional criterion of associating with disease state with an FDR-corrected p-value < 0.05 (S2 Table).
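That all 140 nominated proteins also pass the FDR threshold follows from arithmetic, assuming the Benjamini-Hochberg correction was applied across all m = 968 tested proteins and that the 140 candidates are exactly the proteins with nominal p < 0.005 (so they occupy ranks 1 to 140). Writing p₍ᵢ₎ for the i-th smallest nominal p-value and p̃₍ᵢ₎ for its adjusted value:

```latex
\tilde{p}_{(i)} = \min_{j \ge i} \frac{m\,p_{(j)}}{j},
\qquad
\tilde{p}_{(i)} \;\le\; \tilde{p}_{(140)} \;\le\; \frac{968 \times p_{(140)}}{140}
\;<\; \frac{968 \times 0.005}{140} \;\approx\; 0.0346 \;<\; 0.05
\quad \text{for all } i \le 140 .
```

The first inequality holds because adjusted p-values are nondecreasing in rank, so the bound for the 140th protein covers every protein ranked above it.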

We next performed hierarchical clustering on these candidate markers to evaluate the correlation structure between groups of proteins and disease state. Unsupervised clustering revealed colinearity among subsets of these proteins, suggesting redundancies and possible shared relationships among many candidate biomarkers (Fig 2A). We thus employed Stability Selection [22], a meta-statistical tool that identifies consistently important features by repeated subsampling of the data, in order to identify the most robust, stable, and sparse set of discriminatory proteins; we ranked candidate biomarkers using the LASSO method across 100,000 jackknifed iterations [23,24]. The top 10 proteins from the Discovery Cohort ranked by Stability Selection, shown in Fig 2B and 2C, were advanced for replication.

**Fig. 2. Identification of proteins differentiating PD and NC samples in the Discovery Cohort.**

Replication of biomarker associations with PD in the PDBP Cohort

We next tested our top 10 stability-ranked markers for robustness in a separate Replication Cohort of 215 PD and 102 NC participants drawn from the multicenter PDBP [13] cohort (Table 1 and S3 Table). Analytical methods were identical to those used in the Discovery Cohort except that the Replication Cohort analysis additionally included clinical site and batch as covariates and did not include LEDD as a covariate, since LEDDs were not universally available for PDBP participants.

Despite the inevitable introduction of variability from a multisite, multibatch Replication Cohort, with slight differences in clinical data availability, four of the top 10 proteins that differed in PD versus NC samples in the Discovery Cohort also differed between PD and NC, with the same direction of effect, in the PDBP Replication Cohort (Fig 3). These protein biomarkers were bone sialoprotein (BSP, Discovery FDR-corrected p = 2.82 × 10⁻², Replication FDR-corrected p = 1.03 × 10⁻⁴), osteomodulin (OMD, Discovery FDR-corrected p = 2.14 × 10⁻², Replication FDR-corrected p = 9.14 × 10⁻⁵), aminoacylase-1 (ACY1, Discovery FDR-corrected p = 1.86 × 10⁻³, Replication FDR-corrected p = 2.18 × 10⁻²), and growth hormone receptor (GHR, Discovery FDR-corrected p = 3.49 × 10⁻⁴, Replication FDR-corrected p = 2.97 × 10⁻³, S4 Table).

**Fig. 3. Blood-based biomarkers found in both Discovery and Replication Cohorts.**

Biomarker measures in ALS, a neurodegenerative disease with motor and cognitive features

To determine whether each of these plasma proteins specifically characterize PD or whether they are seen across many neurodegenerative disease states, we additionally measured these proteins in 59 individuals with ALS (Table 1), a neurodegenerative disease that, like PD, has both motor and cognitive features. As shown in Fig 3 and corroborated by multiple linear regression adjusting for age, sex, and clinical site, with the exception of BSP, protein changes were seen in PD but not in ALS.

Preanalytical variability and biomarker measures

Most individuals with PD are treated with dopaminergic medication, raising the concern that medication-based effects on the plasma proteome may be driving our biomarker signals. We addressed this concern in two ways. First, in our PDBP Replication Cohort, a subset of individuals with PD (n = 18) had never been treated with dopaminergic medications. We compared values of ACY1, OMD, GHR, and BSP in people with PD treated with dopaminergic medication versus those never treated with dopaminergic medications, and we found no significant differences (Wilcoxon test nominal p-value > 0.05 for all four proteins comparing PD treated versus not treated with dopaminergic medication, Fig 3). Second, we investigated samples from an additional multisite cohort—the BioFIND Study [14]—in which PD participants had blood drawn in two settings: (1) while on their customary dopaminergic medications and (2) after overnight washout of medication (S5 Table). Although some plasma proteins may be affected by medication state, none of our four biomarker proteins changed substantially when comparing ON medication and OFF medication states in the same individual (Fig 4A, S2 Fig).

**Fig. 4. Top biomarkers are robust and predict cognitive trajectory.**

To understand whether candidate protein biomarkers are robust to common sources of preanalytical variability, we investigated identical aliquots of pooled plasma samples from the Discovery Cohort (UPenn reference samples pool) and the PDBP Cohort (PDBP reference samples pool) that were subjected to differences in sample handling (with versus without 30 minutes at room temperature followed by an additional freeze–thaw of the sample). Whereas some proteins changed their levels by >30% based on differences in sample handling, none of our top proteins changed substantially in either pool (Fig 4B and 4C, S5 Table).

GHR, ACY1, and OMD as predictors of cognitive decline

We next asked whether baseline levels of our candidate biomarkers predicted disease progression. Because cognitive symptoms are less affected by dopaminergic medication than motor symptoms, and because decline in cognition is variable but clinically important in PD [30], we investigated whether baseline levels of BSP, OMD, ACY1, or GHR predicted subsequent rates of cognitive decline.

In our extensively characterized Discovery Cohort participants, who have been followed for an average of 3.5 years after plasma sampling, cross-sectional analyses revealed minimal association between plasma biomarker levels and baseline cognitive scores on the multidomain DRS, which has been used extensively for cognitive assessment in PD [31] (Fig 4D). However, plasma levels of GHR, ACY1, and OMD predicted subsequent rates of cognitive change on the DRS in mixed-effects linear models adjusting for age, sex, disease duration, and baseline DRS score, with time-by-protein interaction coefficients of 0.0905 (GHR, FDR-corrected p = 8.72 × 10⁻⁶), 0.0478 (ACY1, FDR-corrected p = 2.574 × 10⁻²), and −0.0457 (OMD, FDR-corrected p = 2.574 × 10⁻²), respectively. Moreover, individuals with baseline levels of GHR and ACY1 in the lowest tertile were significantly more likely to clinically progress to MCI or dementia in Cox proportional hazards analyses adjusting for age, sex, and disease duration (hazard ratio [HR] 2.27 [95% CI 1.04–5.0, p = 0.04] for GHR [Fig 4E and 4F], and HR 3.0 [95% CI 1.24–7.0, p = 0.014] for ACY1 [S3 Fig]). Finally, correcting for education in our models did not affect our results (S6 Table, S3 Fig).
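As a consistency check on the reported hazard ratios, the standard error and p-value can be approximately recovered from each confidence interval. The Python sketch below assumes the intervals are Wald-type and symmetric on the log scale, which the text does not state explicitly; the function name is hypothetical.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Back out the log-HR standard error, Wald z, and two-sided p from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Hazard ratios and 95% CIs as reported in the text.
ghr_discovery = wald_from_ci(2.27, 1.04, 5.0)
acy1_discovery = wald_from_ci(3.0, 1.24, 7.0)
```

Under these assumptions the recovered p-values are roughly 0.04 for GHR and 0.013 for ACY1, close to the reported 0.04 and 0.014; small gaps are expected because the published HRs and interval bounds are rounded.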

The PDBP Replication Cohort is less mature in follow-up than our Discovery Cohort. In addition, cognitive testing data are more limited, with variability among PDBP sites with respect to their collection of cognitive data and stage of PD. These limitations notwithstanding, scores on the MoCA were obtained at 6-month intervals for an average 2.8-year follow-up period for 74 PD individuals from the PDBP Replication Cohort, followed at UTSW. In these participants, we classified each individual as having normal cognition, MCI, or dementia for each time point according to published norms for the MoCA [26]. Using the same Cox proportional hazards models (i.e., adjusted for age, sex, and disease duration) as in the Discovery Cohort, we found that individuals with baseline levels of GHR in the lowest tertile were more likely to progress to MoCA scores in the MCI or dementia range (HR 3.6 [95% CI 1.20–11.1, p = 0.02]) in the Replication Cohort as well (Fig 4G and 4H). Moreover, just as in the Discovery Cohort, additional correction for education in our model did not affect results (S4 Fig).
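
The lowest-tertile grouping used in the Cox analyses can be sketched as follows. This is a minimal illustration using the Python standard library: the function name, the participant identifiers, the values, and the choice of quantile method are all assumptions, and the study's exact tertile definition may differ.

```python
import statistics


def lowest_tertile_flags(levels: dict) -> dict:
    """Map each participant ID to True if the baseline level is in the lowest tertile."""
    # With n=3, statistics.quantiles returns the two cut points between tertiles.
    lower_cut, _upper_cut = statistics.quantiles(list(levels.values()), n=3)
    return {pid: value <= lower_cut for pid, value in levels.items()}


# Hypothetical baseline protein levels for nine participants.
baseline = {f"p{k}": float(k) for k in range(1, 10)}
flags = lowest_tertile_flags(baseline)
```

The resulting boolean flag could then serve as the grouping variable in a survival model, with age, sex, and disease duration entered as additional covariates.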

Discussion

In this study, we investigated multiple cohorts in a discovery–replication design to develop novel PD biomarkers, starting from an unbiased screen of approximately 1,000 plasma proteins. We found four top biomarker candidates—ACY1, BSP, GHR, and OMD—that replicated across a single-site Discovery Cohort and a multisite Replication Cohort, were robust to common sources of preanalytical variability, and did not differ in paired samples from PD participants on versus off dopaminergic medication. In analyses of longitudinal data, we showed that baseline levels of ACY1 and GHR—and, to a lesser extent, OMD—associated with subsequent rates of cognitive decline in our Discovery Cohort, with baseline GHR predicting subsequent cognitive course in the Replication Cohort as well.

The PD biomarkers found here have not, to our knowledge, been previously reported in the neurodegenerative disease literature. However, unbiased screens—most commonly exemplified by the genome-wide association study in human genetics—often yield unexpected new directions for investigation [6]. We note, however, that GHR and insulin-like growth factor (IGF-1, a well-known effector produced in response to growth hormone [GH]-GHR signaling), are expressed in the brain [32,33] and have been implicated in both physiological and pathological events in the brain. GH-GHR-IGF-1 signaling has been implicated in neural stem cell differentiation and proliferation during embryonic development [33–35], adult neurogenesis in rodents [36,37], age-related cognitive decline [38,39], and neuroprotection against neurological insults such as hypoxic-ischemic injury [40,41], pointing toward potential links between this pathway and protection from neurodegeneration. Future studies using mendelian randomization techniques [42] or manipulation of biomarker levels in model systems are needed, however, to truly elucidate potential mechanisms leading to the biomarker signatures described here.

Strengths of this study include attention to reproducibility, as well as consideration of real-world factors that influence downstream translational potential. With respect to reproducibility, we highlight four aspects. First, the ranking of top candidate proteins from the Discovery Cohort by Stability Selection, rather than strict ordering by p-value, guards against concerns regarding overfitting. Second, biomarkers described here had FDR-corrected p-values < 0.05 in both the single-site Discovery and multisite PDBP Replication Cohorts, attesting to the robustness of our findings. Third, the analysis strategy in the Replication Cohort was prespecified to mirror that of the Discovery Cohort, with only two differences: (1) the inclusion of site and batch as additional covariates, justified by the move from single-site/single-batch to multisite/multibatch phases of analysis, and (2) the removal of LEDD as a covariate, necessitated by lack of these data uniformly across all Replication Cohort participants. Fourth, we note that baseline levels of GHR, ACY1, and, to a lesser extent, OMD predicted future cognitive decline in individuals with PD from the Discovery Cohort. Moreover, despite differences in cognitive scale and clinical site used, as well as stage of PD assessed (all factors known to affect measures of cognition over time), lower levels of GHR also predicted faster cognitive decline in the Replication Cohort. Aside from meeting a clear need for biomarkers predicting PD progression [7], the association of the same proteins with disease class as well as disease progression increases confidence in these biomarker candidates, since the gradation of levels within PD according to one measure of pathophysiological severity (rate of cognitive decline) suggests that the differences between PD and NC are not due to a hidden confounding variable differentiating these two groups. 
With respect to downstream translational considerations, we investigated aspects of real-world variability, demonstrating that all four top biomarkers reported here are not substantially affected by dopaminergic medication state or common sources of noise related to sample handling. We also emphasize the fact that our biomarker candidates are measured from the blood plasma, allowing for collection in any routine phlebotomy setting.
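
The replication criterion noted above (FDR-corrected p-values < 0.05 in both cohorts) can be sketched with a plain Benjamini-Hochberg adjustment [20]. The function names, protein labels, and p-values below are hypothetical, and the sketch does not reproduce the covariate-adjusted regressions that generated the study's p-values.

```python
def benjamini_hochberg(p_values: dict) -> dict:
    """Return Benjamini-Hochberg adjusted p-values keyed by protein name."""
    m = len(p_values)
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    adjusted = {}
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        name, p = ranked[rank - 1]
        running_min = min(running_min, p * m / rank)
        adjusted[name] = running_min
    return adjusted


def replicated(discovery: dict, replication: dict, alpha: float = 0.05) -> list:
    """Proteins with adjusted p < alpha in both cohorts."""
    disc_adj = benjamini_hochberg(discovery)
    rep_adj = benjamini_hochberg(replication)
    return sorted(
        name
        for name in disc_adj
        if name in rep_adj and disc_adj[name] < alpha and rep_adj[name] < alpha
    )


# Hypothetical raw p-values, invented purely for illustration.
discovery_p = {"PROTEIN_A": 0.001, "PROTEIN_B": 0.01, "PROTEIN_C": 0.04, "PROTEIN_D": 0.5}
replication_p = {"PROTEIN_A": 0.002, "PROTEIN_B": 0.3, "PROTEIN_C": 0.01, "PROTEIN_D": 0.8}

hits = replicated(discovery_p, replication_p)
```

Requiring significance in both cohorts is stricter than a single-cohort threshold: in this toy example `PROTEIN_C` passes in the replication set but narrowly misses after adjustment in the discovery set, so it is excluded.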

Our study also has limitations. First, our study relies on an aptamer-based platform [9] for plasma protein measures. Although this is a powerful approach for large-scale screening, downstream translation will likely require development of alternative protein assays that (1) yield absolute protein quantities rather than the RFUs analyzed here and (2) confirm assay specificity. Second, although we have adjusted for dopaminergic medication effects where possible and directly analyzed the effect of dopaminergic medication state on protein measures, our study cannot rule out small effects of dopaminergic medication on candidate protein measures, since overnight washout of dopaminergic medication does not fully mitigate medication effects. Thus, evaluation of candidate protein biomarkers in unmedicated early symptomatic or even presymptomatic, high-risk cohorts is a fruitful future avenue. Third, although we have demonstrated that plasma levels of ACY1, GHR, and OMD are to some extent specific to PD, in that they are not similarly changed in ALS, it is still possible that some of these protein biomarkers may show similar changes in other neurodegenerative diseases that were not tested. We note, however, that increasing appreciation for the overlap of pathology across various neurodegenerative diseases—individuals with PD, for example, are highly likely to have concomitant Alzheimer’s disease (AD) neuropathology at autopsy [43]—suggests that overlap in biomarkers across current clinical categories may reflect overlap in pathophysiological mechanism, rather than a poor biomarker. Fourth, for our longitudinal analyses, we assessed cognitive change in order to understand whether candidate biomarkers predicted disease progression. We chose to investigate cognitive decline both because of the major morbidity associated with this aspect of disease progression and because cognition is not as affected by dopaminergic medication as motor performance. 
Because the majority of PD participants studied here were assessed while taking dopaminergic medication, motor performance would be expected to reflect not only underlying disease state over time (what we aim to measure) but also medication response and timing of most recent dose of medication, adding considerable noise. Thus, although our study found that several of these candidate biomarkers may predict disease progression along one axis (cognitive decline), whether they also predict motor progression is an open question—one that might also be answered by future study in early symptomatic PD individuals not yet taking dopaminergic medication.

In summary, we present our findings from unbiased screening of >1,000 plasma proteins in multiple PD cohorts (a single-site Discovery Cohort, the multicenter PDBP Replication Cohort, and the multicenter BioFIND Cohort), as well as disease and normal controls. In particular, we have identified four plasma proteins—BSP, OMD, ACY1, and GHR—with consistent alterations in PD, one of which (GHR) also predicted subsequent cognitive decline in multiple cohorts, across multiple cognitive testing instruments. Our results open up new avenues for mechanistic investigation, suggesting that "near-proteomic" profiling of blood from individuals with PD may be a powerful approach both for the development of clinical tools and for insight into the pathophysiology of this currently incurable disease.

Supporting information

S1 Fig [a]
QC measures.

S2 Fig [tif]
Effect of levodopa therapy on plasma protein levels tested by paired permutation test.

S3 Fig [g]
Plasma levels of GHR and ACY1 predict cognitive decline in individuals with PD from Discovery Cohort.

S4 Fig [utsw]
Plasma levels of GHR predict cognitive decline in individuals with PD from Replication Cohort.

S1 Table [xlsx]
SOMAScan levels (log of RFU) of 1,129 proteins, and demographic data for 141 participants (96 PD and 45 NC) from Discovery (Udall) cohort.

S2 Table [xlsx]
Candidate PD biomarkers from Discovery Cohort analysis (140 proteins).

S3 Table [xlsx]
SOMAScan levels (log of RFU) of 1,305 proteins and demographic data for 376 research participants (59 ALS, 215 PD, and 102 NC) from Replication Cohort.

S4 Table [docx]
Multiple regression FDR-adjusted p-values for top 10 proteins in Discovery and Replication Cohort.

S5 Table [rfu]
Change in protein levels in ON versus OFF dopaminergic state, and after systematic perturbation of samples of extra freeze–thaw and prolonged (30 minutes) room temperature exposure.

S6 Table [docx]
Results from mixed-effects linear models in Discovery Cohort.

S1 STROBE Checklist [docx]
STROBE, Strengthening the Reporting of Observational Studies in Epidemiology.

S1 Analysis Plan [docx]

References

1. Hughes AJ, Daniel SE, Blankson S, Lees AJ. A Clinicopathologic Study of 100 Cases of Parkinson’s Disease. Arch Neurol. 1993;50(2):140–8. doi: 10.1001/archneur.1993.00540020018011 8431132

2. Fearnley JM, Lees AJ. Ageing and Parkinson’s Disease: Substantia Nigra Regional Selectivity. Brain. 1991;114(5):2283–301.

3. Ravina B, Eidelberg D, Ahlskog JE, Albin RL, Brooks DJ, Carbon M, et al. The role of radiotracer imaging in Parkinson disease. Neurology. 2005;64(2):208–15.

4. Tropea TF, Chen-Plotkin AS. Unlocking the mystery of biomarkers: A brief introduction, challenges and opportunities in Parkinson Disease. Park Relat Disord. 2018;46(Suppl 1):S15–8.

5. Lewczuk P, Riederer P, O’Bryant SE, Verbeek MM, Dubois B, Visser PJ, et al. Cerebrospinal fluid and blood biomarkers for neurodegenerative dementias: An update of the Consensus of the Task Force on Biological Markers in Psychiatry of the World Federation of Societies of Biological Psychiatry. World J Biol Psychiatry. 2018;19:244–328. doi: 10.1080/15622975.2017.1375556 29076399

6. Chen-Plotkin AS. Unbiased approaches to biomarker discovery in neurodegenerative diseases. Neuron. 2014;84(3):594–607. doi: 10.1016/j.neuron.2014.10.031 25442938

7. Chen-Plotkin AS, Albin R, Alcalay R, Babcock D, Bajaj V, Bowman D, et al. Finding useful biomarkers for Parkinson’s disease. Sci Transl Med. 2018;10(454):eaam6003. doi: 10.1126/scitranslmed.aam6003 30111645

8. Kang J-H, Irwin DJ, Chen-Plotkin AS, Siderowf A, Caspell C, Coffey CS, et al. Association of Cerebrospinal Fluid β-Amyloid 1–42, T-tau, P-tau 181, and α-Synuclein Levels With Clinical Features of Drug-Naive Patients With Early Parkinson Disease. JAMA Neurol. 2013;70(10):1277–87. doi: 10.1001/jamaneurol.2013.3861 23979011

9. Gold L, Ayers D, Bertino J, Bock C, Bock A, Brody EN, et al. Aptamer-Based Multiplexed Proteomic Technology for Biomarker Discovery. PLoS ONE. 2010;5(12):e15004. doi: 10.1371/journal.pone.0015004 21165148

10. R Core Team. A language and environment for statistical computing, R Foundation for Statistical Computing [Internet]. 2016. p. 1. https://www.r-project.org/ [cited 2018 Jul 15].

11. Tropea TF, Xie SX, Rick J, Chahine LM, Dahodwala N, Doshi J, et al. APOE, thought disorder, and SPARE-AD predict cognitive decline in established Parkinson’s disease. Mov Disord. 2018;33(2):289–97. doi: 10.1002/mds.27204 29168904

12. Chen-Plotkin AS, Hu WT, Siderowf A, Weintraub D, Gross G, Hurtig HI, et al. Plasma EGF levels predict cognitive decline in Parkinson’s disease. Ann Neurol. 2011;69(4):655–63.

13. Rosenthal LS, Drake D, Alcalay RN, Babcock D, Bowman FD, Chen-Plotkin A, et al. The NINDS Parkinson’s disease biomarkers program. Mov Disord. 2016;31(6):915–23. doi: 10.1002/mds.26438 26442452

14. Kang UJ, Goldman JG, Alcalay RN, Xie T, Tuite P, Henchcliffe C, et al. The BioFIND study: Characteristics of a clinically typical Parkinson’s disease biomarker cohort. Mov Disord. 2016;31(6):924–32. doi: 10.1002/mds.26613 27113479

15. Davies DR, Gelinas AD, Zhang C, Rohloff JC, Carter JD, O’Connell D, et al. Unique motifs and hydrophobic interactions shape the binding of modified DNA ligands to protein targets. Proc Natl Acad Sci U S A. 2012;109(49):19971–6. doi: 10.1073/pnas.1213933109 23139410

16. Dewey TM, Zyzniewski MC, Mundt AA, Crouch GJ, Eaton BE. New Uridine Derivatives for Systematic Evolution of RNA Ligands by Exponential Enrichment. J Am Chem Soc. 1995;117(32):8474–5.

17. Kraemer S, Vaught JD, Bock C, Gold L, Katilius E, Keeney TR, et al. From SOMAmer-based biomarker discovery to diagnostic and clinical applications: A SOMAmer-based, streamlined multiplex proteomic assay. PLoS ONE. 2011;6(10):e26332. doi: 10.1371/journal.pone.0026332 22022604

18. Lollo B, Steele F, Gold L. Beyond antibodies: New affinity reagents to unlock the proteome. Proteomics. 2014;14(6):638–44. doi: 10.1002/pmic.201300187 24395722

19. SomaLogic. SOMAscan Proteomic Assay Technical White Paper. 2015;1–14.

20. Benjamini Y, Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57(1):289–300.

21. Warnes GR, Bolker B, Bonebakker L, Gentleman R, Huber W, Liaw A, et al. gplots: Various R Programming Tools for Plotting Data [Internet]. 2016. https://cran.r-project.org/package=gplots [cited 2018 Jul 14].

22. Meinshausen N, Bühlmann P. Stability selection. J R Stat Soc Ser B Stat Methodol. 2010;72(4):417–73.

23. Tibshirani R. Regression shrinkage and selection via the lasso: A retrospective. J R Stat Soc Ser B Stat Methodol. 2011;73(3):273–82.

24. Wehrens R, Franceschi P. Meta-Statistics for Variable Selection: The R Package BioMark. J Stat Softw. 2012;51(10):1–18.

25. Pinheiro J, Bates D, DebRoy S, Sarkar D, R Core Team. nlme: Linear and Nonlinear Mixed Effects Models. R package version 3.1–137. 2018. https://CRAN.R-project.org/package=nlme [cited 2018 Oct 14].

26. Dalrymple-Alford JC, MacAskill MR, Nakas CT, Livingston L, Graham C, Crucian GP, et al. The MoCA. Neurology. 2010;75(19):1717–1725. doi: 10.1212/WNL.0b013e3181fc29c9 21060094

27. Therneau T. A Package for Survival Analysis in S. version 2.38. 2015. https://CRAN.R-project.org/package=survival [cited 2018 Oct 14].

28. Kassambara A. survminer: Drawing Survival Curves using “ggplot2”. R package version 0.4.2. 2018. https://CRAN.R-project.org/package=survminer [cited 2018 Sep 16].

29. Tomlinson CL, Stowe R, Patel S, Rick C, Gray R, Clarke CE. Systematic review of levodopa dose equivalency reporting in Parkinson’s disease. Mov Disord. 2010;25(15):2649–53. doi: 10.1002/mds.23429 21069833

30. Reid WGJ, Hely MA, Morris JGL, Loy C, Halliday GM. Dementia in Parkinson’s disease: A 20-year neuropsychological study (Sydney multicentre study). J Neurol Neurosurg Psychiatry. 2011;82(9):1033–7. doi: 10.1136/jnnp.2010.232678 21335570

31. Llebaria G, Pagonabarraga J, Kulisevsky J, García-Sánchez C, Pascual-Sedano B, Gironell A, et al. Cut-off score of the Mattis Dementia Rating Scale for screening dementia in Parkinson’s disease. Mov Disord. 2008;23(11):1546–50. doi: 10.1002/mds.22173 18546326

32. Castro J, Costoya J, Señarís R, Arce V, Prieto A, Gallego R. Expression of growth hormone receptor in the human brain. Neurosci Lett. 2002;281(2–3):147–50.

33. Ajo R, Sánchez-Franco F, Navarro C, Cacicedo L. Growth Hormone Action on Proliferation and Differentiation of Cerebral Cortical Cells from Fetal Rat. Endocrinology. 2003;144(3):1086–97. doi: 10.1210/en.2002-220667 12586785

34. Turnley AM, Faux CH, Rietze RL, Coonan JR, Bartlett PF. Suppressor of cytokine signaling 2 regulates neuronal differentiation by inhibiting growth hormone signaling. Nat Neurosci. 2002;5:1155–62. doi: 10.1038/nn954 12368809

35. McLenachan S, Lum MG, Waters MJ, Turnley AM. Growth hormone promotes proliferation of adult neurosphere cultures. Growth Horm IGF Res. 2009;19(3):212–8. doi: 10.1016/j.ghir.2008.09.003 18976947

36. Åberg ND, Johansson I, Åberg MAI, Lind J, Johansson UE, Cooper-Kuhn CM, et al. Peripheral administration of GH induces cell proliferation in the brain of adult hypophysectomized rats. J Endocrinol. 2009;201(1):141–50. doi: 10.1677/JOE-08-0495 19171566

37. Åberg ND, Lind J, Isgaard J, Kuhn HG. Peripheral growth hormone induces cell proliferation in the intact adult rat brain. Growth Horm IGF Res. 2010;20(3):264–9. doi: 10.1016/j.ghir.2009.12.003 20106687

38. Frater J, Lie D, Bartlett P, McGrath JJ. Insulin-like Growth Factor 1 (IGF-1) as a marker of cognitive decline in normal ageing: A review. Ageing Res Rev. 2018;42:14–27. doi: 10.1016/j.arr.2017.12.002 29233786

39. Muller AP, Fernandez AM, Haas C, Zimmer E, Portela LV, Torres-Aleman I. Reduced brain insulin-like growth factor I function during aging. Mol Cell Neurosci. 2012;49(1):9–12. doi: 10.1016/j.mcn.2011.07.008 21807098

40. Christophidis LJ, Gorba T, Gustavsson M, Williams CE, Werther GA, Russo VC, et al. Growth hormone receptor immunoreactivity is increased in the subventricular zone of juvenile rat brain after focal ischemia: A potential role for growth hormone in injury-induced neurogenesis. Growth Horm IGF Res. 2009;19(6):497–506. doi: 10.1016/j.ghir.2009.05.001 19524466

41. Guo SZ, Raccurt M, Brittian KR, Moudilou E, Li RC, Morel G, et al. Exogenous growth hormone attenuates cognitive deficits induced by intermittent hypoxia in rats. Neuroscience. 2011;196:237–50. doi: 10.1016/j.neuroscience.2011.08.029 21888951

42. Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014;23(R1):R89–98. doi: 10.1093/hmg/ddu328 25064373

43. Robinson JL, Lee EB, Xie SX, Rennert L, Suh E, Bredenberg C, et al. Neurodegenerative disease concomitant proteinopathies are prevalent, age-related and APOE4-associated. Brain. 2018;141(7):2181–93. doi: 10.1093/brain/awy146 29878075