Submitted on 2024-10-29, 10:09 and posted on 2024-10-30, 07:27, authored by Ayah Mohamad Ahmed Ziyada
Efficient genotyping of many individuals has allowed us to perform genome-wide association studies (GWAS) for a variety of traits. Genotyping was first achieved using SNP arrays, but arrays can only genotype the fewer than one million variants present on them. Whole-genome sequencing allows us to assess all variants in the genome; however, despite the decline in sequencing costs in recent years, it is still not an affordable approach for GWAS. Instead, imputation strategies were developed, which increased the number of SNPs tested in array-based GWAS to 7-8 million, thereby increasing the power to detect genetic associations. Another viable alternative to SNP arrays is low-coverage whole-genome sequencing (lcWGS). It is advantageous for genotyping individuals from ethnicities whose variants are underrepresented on genotyping arrays and has been shown to be more effective than arrays in GWAS. Imputation strategies were originally tailored for SNP arrays, but most software can handle data from lcWGS. We are planning to perform a GWAS of TNFα inhibitor response for a patient cohort in Qatar using lcWGS; therefore, it is important to know which imputation software more accurately imputes genotypes of individuals from underrepresented backgrounds. Here, we compare the performance of two imputation programs, Beagle and Glimpse, on two Caucasian genomes and three genomes from ethnicities that are underrepresented in reference panels. We found that Glimpse imputes around three times more high-confidence variants than Beagle, with comparable accuracy. Therefore, we recommend using Glimpse for imputation of lcWGS data from individuals of underrepresented populations.