Abstract
Background: Rare variant detection in genome-wide association studies (GWAS) and sequencing analyses is critical for understanding complex diseases, yet accuracy is compromised in diverse populations due to inadequate representation in reference panels. Standard panels overrepresent European ancestry, leading to imputation errors and reduced power for rare variants in non-European groups.

Methods: We developed ethnicity-aware reference panels by aggregating sequencing data from multiple ancestries (African, East Asian, South Asian, European, and Hispanic/Latino) and evaluated imputation accuracy using leave-one-out cross-validation and simulated rare variants. We compared performance against cosmopolitan panels (e.g., 1000 Genomes Phase 3) and population-specific panels using metrics including imputation quality score (R²), allelic correlation (r), and rare variant detection sensitivity.

Results: Ethnicity-aware panels improved imputation accuracy for rare variants (minor allele frequency […]

Conclusions: Ethnicity-aware reference panels substantially enhance rare variant imputation in diverse populations, reducing bias in downstream association studies. Our findings advocate for the construction of large, ancestrally diverse reference panels to ensure equitable genomic discovery.
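The evaluation metrics named in the Methods can be illustrated with a minimal sketch. The abstract does not state how they were computed, so everything below is an assumption: r is taken as the Pearson correlation between masked true genotypes (coded as 0/1/2 alternate-allele counts) and imputed dosages, R² as its square, and sensitivity as the fraction of true carriers whose imputed dosage reaches a hypothetical call threshold of 0.5. The function names and the example data are illustrative only and do not come from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined when either vector has no variation,
        # e.g. a variant that is monomorphic in the evaluated samples.
        return float("nan")
    return cov / sqrt(var_x * var_y)


def dosage_r2(true_genotypes, imputed_dosages):
    """Squared correlation between true genotypes and imputed dosages."""
    r = pearson_r(true_genotypes, imputed_dosages)
    return r * r


def detection_sensitivity(true_genotypes, imputed_dosages, call_threshold=0.5):
    """Fraction of true carriers whose imputed dosage reaches the threshold.

    call_threshold is a hypothetical cutoff chosen for illustration.
    """
    carrier_dosages = [
        d for g, d in zip(true_genotypes, imputed_dosages) if g > 0
    ]
    if not carrier_dosages:
        return float("nan")
    detected = sum(1 for d in carrier_dosages if d >= call_threshold)
    return detected / len(carrier_dosages)


# Made-up example: one rare variant, ten masked samples.
true_gt = [0, 0, 0, 1, 0, 0, 2, 0, 0, 1]
imputed = [0.0, 0.1, 0.0, 0.9, 0.0, 0.2, 1.7, 0.0, 0.0, 0.3]

print("allelic r:", pearson_r(true_gt, imputed))
print("dosage R^2:", dosage_r2(true_gt, imputed))
print("sensitivity:", detection_sensitivity(true_gt, imputed))
```

In a leave-one-out design, such per-variant values would typically be computed for each held-out sample set and then summarized within minor allele frequency bins, so that rare variants can be compared across panels separately from common ones.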