Identification and Analysis of Error Types in High-Throughput Genotyping
Ewen, KR and Bahlo, M and Treloar, SA and Levinson, DF and Mowry, B and Barlow, JW and Foote, SJ, Identification and Analysis of Error Types in High-Throughput Genotyping, American Journal of Human Genetics, 67, (3) pp. 727-736. ISSN 0002-9297 (2000) [Refereed Article]
Although it is clear that errors in genotyping data can lead to severe errors in linkage analysis, there is as yet no consensus strategy for identification of genotyping errors. Strategies include comparison of duplicate samples, independent calling of alleles, and Mendelian-inheritance-error checking. This study aimed to develop a better understanding of error types associated with microsatellite genotyping, as a first step toward development of a rational error-detection strategy. Two microsatellite marker sets (a commercial genomewide set and a custom-designed fine-resolution mapping set) were used to generate 118,420 and 22,500 initial genotypes and 10,088 and 8,328 duplicates, respectively. Mendelian-inheritance errors were identified by PedManager software, and concordance was determined for the duplicate samples. Concordance checking identifies only human errors, whereas Mendelian-inheritance-error checking is capable of detection of additional errors, such as mutations and null alleles. Neither strategy is able to detect all errors. Inheritance checking of the commercial marker data identified that the results contained 0.13% human errors and 0.12% other errors (0.25% total error), whereas concordance checking found 0.16% human errors. Similarly, Mendelian-inheritance-error checking of the custom-set data identified 1.37% errors, compared with 2.38% human errors identified by concordance checking. A greater variety of error types were detected by Mendelian-inheritance-error checking than by duplication of samples or by independent reanalysis of gels. These data suggest that Mendelian-inheritance-error checking is a worthwhile strategy for both types of genotyping data, whereas fine-mapping studies benefit more from concordance checking than do studies using commercial marker data. Maximization of error identification increases the likelihood of linkage when complex diseases are analyzed.
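The two checks compared in the abstract can be illustrated with a minimal Python sketch. It is not the PedManager algorithm or the authors' pipeline. The function names are hypothetical, and it assumes each genotype is an unordered pair of allele labels (for example, microsatellite fragment sizes) at a single marker, with both parents typed.

```python
from itertools import product


def normalize(genotype):
    """Return a genotype as an order-independent tuple of two alleles."""
    return tuple(sorted(genotype))


def discordance_rate(duplicate_pairs):
    """Fraction of duplicate pairs whose two genotype calls disagree.

    duplicate_pairs is a list of (first_call, second_call) tuples for the
    same sample and marker (a hypothetical input layout).
    """
    if not duplicate_pairs:
        raise ValueError("no duplicate pairs supplied")
    mismatches = sum(
        1 for first, second in duplicate_pairs
        if normalize(first) != normalize(second)
    )
    return mismatches / len(duplicate_pairs)


def is_mendelian_consistent(father, mother, child):
    """True if the child could carry one allele from each parent."""
    child_alleles = normalize(child)
    return any(
        normalize((paternal, maternal)) == child_alleles
        for paternal, maternal in product(father, mother)
    )


# Illustrative values only, not data from the study.
duplicates = [
    ((120, 124), (124, 120)),  # same call, alleles listed in reverse order
    ((118, 122), (118, 122)),
    ((120, 120), (120, 124)),  # discordant duplicate
    ((116, 118), (116, 118)),
]
rate = discordance_rate(duplicates)

consistent = is_mendelian_consistent((120, 124), (118, 122), (120, 122))
inconsistent = is_mendelian_consistent((120, 124), (118, 122), (120, 120))
```

In this sketch the second trio fails because the child appears homozygous for an allele the mother does not carry. A flag of that kind can reflect a calling error, but also a null allele or a mutation, which is why inheritance checking can surface error types that duplicate concordance cannot.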