VIETNAMESE-SPECIFIC DISEASE-ASSOCIATED VARIANTS AND SNP CHIP

INTRODUCTION

Vietnam-specific disease-associated variants

Since its inception 13 years ago, the field of genome-wide association studies (GWAS) has shown that it can identify significant genetic markers. Most GWAS, however, have focused on populations of European ancestry, which account for more than 78% of the UK Biobank and over 85% of the GTEx (Genotype-Tissue Expression) project, two of the largest population genetics projects as of 2018. Estimates of the effect of genetics on human traits, such as disease, are therefore most precise for the groups that have been studied most. This significant bias in the use of human genome information across populations in medical research is raising growing concerns about inequality in global pharmacology and personalized medicine. Initiatives like 1KVG are pioneering efforts to reduce this inequality. A noteworthy outcome of this project is its ability to find disease-related variants unique to Vietnamese people, variants that have not been identified in earlier studies, which have traditionally favored people of European descent.

SNP chip

Genomic prediction, an important field of quantitative genetics, uses thousands to millions of DNA variants in a model to predict phenotypes [11]. Genomic prediction has greatly benefited the livestock industry by combining the effects of many variants [12]. In humans, Visscher et al. have made significant contributions to population genetics by using genome-wide variants to estimate the heritability of complex traits such as height and neurological illnesses.
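As a rough illustration of how many variants can enter a single predictive model, the sketch below computes an additive score as the sum of effect size times allele dosage over all variants. It is a minimal sketch only: the function name, the effect sizes, and the genotype values are hypothetical and are not taken from the cited studies.

```python
def additive_score(dosages, effects):
    """Return the sum of effect_size * allele_dosage over all variants.

    dosages: allele counts (0, 1 or 2) for one individual, one per variant.
    effects: per-variant effect sizes, in the same order (hypothetical values).
    """
    if len(dosages) != len(effects):
        raise ValueError("dosages and effects must have the same length")
    return sum(beta * g for g, beta in zip(dosages, effects))


# Hypothetical example: three variants and one individual.
effects = [0.5, -0.25, 1.0]
individual = [2, 1, 0]
score = additive_score(individual, effects)
```

Real genomic prediction models estimate the effect sizes jointly from training data; the sketch only shows how a fitted additive model would be applied to one genotype.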

For Mendelian hereditary disorders, in which only one or a few genes are involved in the development of the disease phenotype, a number of commercial SNP chips have been developed. However, most common diseases are thought to be complex genetic traits.

Overview of LmTag. (i) Imputation accuracy modeling: the imputation accuracy metric is modeled as a function of LD, MAF, and genomic distance. (ii) Functional scoring: the functional scores of SNPs are weighted based on public databases. (iii) Functional tag SNP selection: the imputation capability of each SNP is represented as a triangle, and functional scores are shown in the lower rectangles. When K = 1, the beam search algorithm reduces to best-first search, which selects the SNP with the highest estimated imputation performance (bold red triangle). When K > 1, the algorithm selects the top K SNPs with the highest estimated imputation performance (light pink triangles); the functional scores of these SNPs (light green) are then weighted to select the most functional SNPs as tag SNPs (bold red triangles).
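The selection step in (iii) can be sketched as follows. This is a simplified illustration under stated assumptions, not the LmTag implementation: the function name, the tuple layout, and the example scores are hypothetical, and the caption does not specify how functional scores are weighted, so the sketch simply takes the highest functional score among the top K candidates.

```python
def pick_tag_snp(candidates, k):
    """Pick one tag SNP from a list of candidates.

    candidates: list of (snp_id, imputation_score, functional_score) tuples.
    k: beam width. With k == 1 this reduces to best-first selection on the
       estimated imputation score alone.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Keep the K candidates with the highest estimated imputation performance.
    top_k = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    # Among those, choose the candidate with the highest functional score.
    return max(top_k, key=lambda c: c[2])[0]


# Hypothetical candidates: (id, imputation score, functional score).
candidates = [
    ("rsA", 0.90, 0.1),
    ("rsB", 0.85, 0.9),
    ("rsC", 0.60, 1.0),
]
best_first = pick_tag_snp(candidates, k=1)   # imputation score only
functional = pick_tag_snp(candidates, k=2)   # functional score among the top 2
```

A larger K trades some estimated imputation performance for functional relevance: in the example, K = 2 prefers the functionally stronger rsB over rsA, while the weakly imputing rsC is only reachable once K = 3.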


PRODUCT DESCRIPTION

Despite the rapid development of next-generation sequencing technologies, microarray-based genotyping combined with imputation of untyped variants remains a cost-effective way to interrogate genetic variation across the human genome. This technology is widely used in genome-wide association studies (GWAS) at biobank scale and, more recently, in polygenic score (PGS) analysis to predict and stratify disease risk. In principle, SNP genotyping arrays are designed by selecting a set of SNPs, commonly referred to as “tag SNPs”, that maximizes coverage of ungenotyped DNA variants through the associations between alleles in the population (known as linkage disequilibrium, LD). Because the majority of human genetic variants are rare and population-specific, SNP arrays need to be customized to improve on those designed for global populations or super-populations. In this project, we aim to use the 1000 Vietnamese genome project and its extensive LD map to design a novel SNP array for Vietnamese people with higher imputation performance. We also leverage the functional variants identified in the VGP project to increase the number of likely causal variants in the array content. The new SNP array is expected to provide an accurate platform for genotyping Vietnamese genomes at an affordable cost.
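To make the notion of LD-based coverage concrete, the sketch below estimates pairwise LD as the squared Pearson correlation of genotype dosages and reports the fraction of SNPs that are either tags themselves or in strong LD with at least one tag. It is a minimal sketch under stated assumptions: the function names and the toy genotypes are hypothetical, the dosage-based r² is a common approximation to haplotype-based r², and the 0.8 threshold is a conventional choice rather than a value taken from this project.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (sxy * sxy) / (sxx * syy)


def coverage(genotypes, tags, threshold=0.8):
    """Fraction of SNPs that are tags or have r^2 >= threshold with some tag.

    genotypes: dict mapping SNP id -> list of dosages (0, 1, 2) per sample.
    tags: set of SNP ids placed on the array (hypothetical selection).
    """
    covered = 0
    for snp, dosages in genotypes.items():
        if snp in tags or any(
            r_squared(dosages, genotypes[t]) >= threshold for t in tags
        ):
            covered += 1
    return covered / len(genotypes)


# Toy data: six samples, three SNPs (hypothetical values).
genotypes = {
    "snp1": [0, 1, 2, 1, 0, 2],
    "snp2": [0, 1, 2, 1, 0, 2],  # identical to snp1, so r^2 = 1
    "snp3": [2, 0, 1, 0, 2, 1],  # only weakly correlated with snp1
}
cov = coverage(genotypes, tags={"snp1"})
```

In the toy data, tagging snp1 also covers snp2, while snp3 stays uncovered, so adding a tag in LD with snp3 would raise coverage. A population-specific LD map matters for exactly this reason: which SNPs a tag covers depends on the correlations observed in that population.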