Performance evaluation of computational programs to predict Human leucocyte antigen (HLA) genotypes on African whole exome sequence data

dc.contributor.author Agaba, Gerald Muzorah
dc.date.accessioned 2023-10-18T09:43:12Z
dc.date.available 2023-10-18T09:43:12Z
dc.date.issued 2023
dc.description Dissertation submitted to the Directorate of Research and Graduate Training in partial fulfillment of the requirements for the award of Master of Immunology and Clinical Microbiology of Makerere University. en_US
dc.description.abstract Introduction: Accurate Human Leucocyte Antigen (HLA) genotyping is critical in studies involving the immune system and in transplantation medicine. HLA genes encode proteins that play a central role in immune responses, including those associated with rejection during organ transplantation. HLA genotyping techniques have evolved from simple serologic methods to molecular analysis and, more recently, to computational programs that type HLA genotypes from next-generation sequencing (NGS) data. Methodology: This was a cross-sectional study nested in the Collaborative African Genomics Network (CAfGEN). HLA genotype calls were made using six computational tools on 204 samples (124 from Baylor Uganda and 80 from Baylor Botswana). Objective: The general objective of the study was to evaluate the technical performance of six computational programs/tools in predicting HLA genotypes from African whole exome sequence data. Results: Regardless of the type of tool, each was able to call the three HLA class I genotypes (A, B and C) and alleles to the second field (four-digit resolution). HLA-HD and Polysolver made calls up to the third field (six-digit resolution), calling 98.2% and 79.74% of the samples, respectively. HISAT-genotype and HLA-VBSeq, on the other hand, made calls up to the fourth field (eight-digit resolution), calling 90.6% and 52.82% of alleles, respectively. 
The highest percentage agreement between the tools at the genotype level was observed for HLA-C, where HISAT-genotype had the highest agreement and HLAminer the lowest (89.71% vs 80.15%, respectively). HLA-A had the poorest overall agreement between the tools, with the highest value observed for Polysolver (81.91%) and the lowest for HLAminer (55.25%) compared to the other tools. HLA-B calls had, on average, slightly lower percentage agreement than HLA-C, with the highest agreement observed for both Polysolver and HLA-HD at 88.53% and the lowest for HLA-VBSeq at 69.75%. In terms of run time, Optitype had the shortest execution time at 7.37 (±1.82) minutes, followed by HLAminer at 19.31 (±4.61) minutes, HISAT-genotype at 31.48 (±6.64) minutes, HLA-VBSeq at 59.52 (±12.6) minutes and HLA-HD at 85.92 (±31.41) minutes. Polysolver had the longest overall execution time at 97.10 (±7.91) minutes. Conclusion: This evaluation of six publicly available HLA typing tools can help researchers decide on the most appropriate HLA typing method for their NGS data and promote further development of high-performance tools. Based on this study, we recommend combining different tools to achieve better results. en_US
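Illustrative note: the abstract reports percentage agreement between tools at different field resolutions but does not state how agreement was computed. The sketch below shows one plausible way to do it, under assumptions that are not taken from the dissertation: allele calls are strings in standard HLA nomenclature (for example `A*02:01:01`), each tool's output for one locus is a mapping from sample ID to a pair of alleles, and samples that a tool cannot resolve to the requested field are excluded from the comparison. The function names and the sample data are hypothetical.

```python
from typing import Dict, Optional, Tuple

Calls = Dict[str, Tuple[str, str]]  # sample ID -> the two alleles called at one locus


def truncate_allele(allele: str, fields: int) -> Optional[str]:
    """Trim an allele name such as 'A*02:01:01:01' to its first `fields` fields.

    Returns None when the call carries fewer fields than requested.
    """
    gene, _, rest = allele.partition("*")
    parts = rest.split(":") if rest else []
    if len(parts) < fields:
        return None
    return f"{gene}*{':'.join(parts[:fields])}"


def genotype(alleles: Tuple[str, str], fields: int) -> Optional[Tuple[str, ...]]:
    """Return the unordered genotype at the requested resolution, or None."""
    trimmed = [truncate_allele(a, fields) for a in alleles]
    if any(t is None for t in trimmed):
        return None
    return tuple(sorted(trimmed))


def percent_agreement(tool_a: Calls, tool_b: Calls, fields: int = 2) -> float:
    """Percentage of comparable samples on which both tools give the same genotype.

    Assumption: a sample is comparable only if both tools called it at
    `fields` resolution or finer. Returns NaN if no sample is comparable.
    """
    compared = matches = 0
    for sample in tool_a.keys() & tool_b.keys():
        ga = genotype(tool_a[sample], fields)
        gb = genotype(tool_b[sample], fields)
        if ga is None or gb is None:
            continue
        compared += 1
        matches += ga == gb
    return 100.0 * matches / compared if compared else float("nan")


# Hypothetical HLA-A calls from two tools, for illustration only.
tool_x = {"S1": ("A*02:01:01", "A*30:01:01"), "S2": ("A*68:02:01", "A*23:01:01")}
tool_y = {"S1": ("A*30:01", "A*02:01"), "S2": ("A*68:01", "A*23:01")}

print(percent_agreement(tool_x, tool_y, fields=2))  # 50.0
```

Sorting the two alleles makes the comparison independent of the order in which a tool reports them. Truncating to a common field lets a third-field caller be compared with a second-field caller on equal terms.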
dc.identifier.citation Agaba, G.M. (2023). Performance evaluation of computational programs to predict Human leucocyte antigen (HLA) genotypes on African whole exome sequence data. (Unpublished master's dissertation). Makerere University, Kampala, Uganda. en_US
dc.identifier.uri http://hdl.handle.net/10570/12236
dc.language.iso en en_US
dc.publisher Makerere University en_US
dc.subject Performance Evaluation en_US
dc.subject Computational Programs en_US
dc.subject Genotypes en_US
dc.subject Human leucocyte antigen en_US
dc.subject HLA en_US
dc.title Performance evaluation of computational programs to predict Human leucocyte antigen (HLA) genotypes on African whole exome sequence data en_US
dc.type Thesis en_US