The predictive utility of polygenic risk scores for chronic kidney disease in Africans
Abstract
Background: Genome-wide association studies (GWAS) have significantly expanded our understanding of the genetic basis of kidney function, but most findings have been reported in individuals of European ancestry. This sampling bias has resulted in a general portability problem, in which findings made in one ancestry cannot be accurately transferred to individuals of another ancestry. The shortfall has been especially evident in polygenic risk scores (PRS) and in the discovery of variants associated with disease traits in Africa. To address this disparity, we aimed to: (1) identify susceptibility loci associated with estimated glomerular filtration rate (eGFR) in 80027 individuals of African ancestry (AFR) from the UK Biobank (UKBB), Million Veteran Program (MVP), and Chronic Kidney Disease Genetics (CKDGen) consortium; (2) explore the utility of PRS for serum creatinine-based eGFR (eGFRcrea) in continental Africans in the Uganda Genome Resource (UGR); (3) assess the causal association between genetically proxied lipid traits and eGFR; and (4) conduct an in silico study to differentiate potentially harmful single-nucleotide polymorphisms (SNPs) from neutral ones in the UMOD gene, which is causally linked to chronic kidney disease.
Methods: We applied a multi-faceted approach combining GWAS meta-analysis, PRS methods, Mendelian randomization (MR), and in silico analyses. For the first objective, we meta-analyzed eGFRcrea GWAS summary statistics from 80027 individuals of African ancestry and identified the most likely causal SNPs using a Bayesian fine-mapping approach. We then examined the association between the lead variants and other traits or phenotypes through a phenome-wide association study (PheWAS). For the second objective, we computed a PRS from a large discovery dataset of African-ancestry individuals, then trained and validated it in continental Africans from the Uganda Genome Resource (UGR). For the third objective, we performed a two-sample MR analysis to estimate the causal effect of lipid traits on eGFR. Lastly, we used multiple computational methods to determine the effect of deleterious SNPs on the structure and function of UMOD.
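The abstract does not state which meta-analysis scheme was used. As an illustration only, the sketch below assumes a fixed-effect, inverse-variance weighted combination of per-cohort effect estimates for a single SNP, which is one common choice for GWAS summary statistics. The function name `ivw_meta` and the input numbers are hypothetical and are not taken from the study.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis for one SNP.

    betas: per-cohort effect estimates on the trait (e.g. eGFRcrea)
    ses:   matching per-cohort standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical effect estimates for one SNP in three cohorts.
beta, se, z, p = ivw_meta([0.012, 0.020, 0.015], [0.004, 0.006, 0.010])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

Under this scheme, cohorts with smaller standard errors (typically the larger ones) dominate the pooled estimate. A random-effects model would instead be needed if effects were expected to differ between cohorts.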
Results: We identified eight lead SNPs, one of which was a novel variant, rs77408001 in the ELN gene. Through fine-mapping, SNPs rs77121243 and rs201602445 emerged as likely causal variants. Our PRS improved the prediction of eGFR in East Africans, explaining 0.22% of eGFR trait variance with the clumping and thresholding approach and nearly twice that (0.42%) with the PRS-CS approach. As anticipated, the PRS derived from a European-ancestry dataset did not accurately predict eGFR in continental Africans. Our analysis also revealed causal associations between lipid traits and eGFR. Univariable MR analysis showed that genetically predicted low-density lipoprotein (LDL) and total cholesterol (TC) had positive causal effects on eGFR, with effect sizes of 1.1 and 1.619, respectively. The multivariable inverse-variance weighted (MVIVW) analysis confirmed the causal association between LDL and eGFR, with an effect size of 1.228. Triglycerides (TG) also showed a significant causal effect on eGFR, with an effect size of -1.283. However, genetically predicted high-density lipoprotein cholesterol (HDL-C) did not show a significant causal link in either the univariable or the multivariable analysis. Furthermore, in silico analysis of the UMOD gene uncovered two non-synonymous single-nucleotide polymorphisms (nsSNPs), rs28934582 and rs28934583, which were associated with deleterious point mutations resulting in changes to residue size and hydrophobicity, as predicted by the HOPE tool. Notably, mutation C181Y brought about a shift in charge from neutral to positive. Three-dimensional structure modelling with I-TASSER and SWISS-MODEL yielded consistent results, with a model C-score of -0.80, an estimated TM-score of 0.61±0.14, and an estimated RMSD of 9.7±4.6 Å.
Additionally, the UMOD gene exhibited interactions with 18 other genes, including genes associated with kidney function and innate immunity, such as IL2, TNF, IL1B, and ALB.
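To make the inverse-variance weighted MR estimates reported above more concrete, the following is a minimal sketch of the univariable, fixed-effect IVW estimator for two-sample MR. It assumes independent instruments and first-order weights (the outcome standard errors only). The function name `ivw_mr` and the per-SNP numbers are hypothetical. The multivariable (MVIVW) extension, which adjusts each lipid trait for the others, is not shown.

```python
import math


def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Univariable fixed-effect IVW estimate for two-sample MR.

    Equivalent to a weighted regression through the origin of the
    SNP-outcome effects on the SNP-exposure effects, with weights
    1 / se_outcome**2. Returns (causal_estimate, standard_error).
    """
    n = len(beta_exposure)
    if n == 0 or n != len(beta_outcome) or n != len(se_outcome):
        raise ValueError("inputs must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(
        w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome)
    )
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Hypothetical per-SNP effects on a lipid trait (exposure) and on eGFR (outcome).
bx = [0.08, 0.12, 0.05, 0.10]
by = [0.09, 0.15, 0.04, 0.13]
se_y = [0.03, 0.04, 0.02, 0.05]
estimate, std_error = ivw_mr(bx, by, se_y)
print(f"IVW estimate={estimate:.3f} (SE {std_error:.3f})")
```

The estimate is the change in the outcome per unit change in the genetically predicted exposure, so its sign and size depend on the units of both sets of summary statistics.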
Conclusion: Our findings provide valuable insights into genetic associations with eGFR in East African populations. We demonstrate that larger datasets of individuals of African ancestry can yield new insights into chronic kidney disease, reveal genetic variants unique to this population, produce PRSs with better utility, and point to a potential impact of lipid traits on kidney function. Additionally, the identification of deleterious UMOD mutants provides a foundation for understanding the genetic basis of disease in this gene. These findings contribute to a more comprehensive understanding of kidney function and disease susceptibility and emphasize the importance of diversity in genetic studies.