As DNA tests for ancestry explode in popularity, a fundamental problem remains: the tests deliver more detailed results for people of European descent, as the ethnicities and data that major DNA testing companies cover make clear. This bias should recede as more people take the tests and add their DNA data to the mix, but the companies have work to do before their kits perform reasonably well for a worldwide population.
In 2017, more people took DNA tests than in all the previous years combined, according to the MIT Technology Review, and that number keeps climbing. According to the International Society of Genetic Genealogy (ISOGG), more than 18 million people have tested their DNA to learn about their ethnic identity or to find relatives. DNA testing companies like AncestryDNA and 23andMe have become household names as a result, while new tests claiming more specialized results crop up every few years.
It’s easy to see the appeal. For $99, 23andMe and AncestryDNA ask only that you spit in a cup and send it off to a lab for testing; a matter of weeks later, you learn the ethnic breakdown of your genes by region. (See our comparison of these two popular kits.)
The data problem
The risk of racial bias starts with the data that DNA tests use. AncestryDNA, for instance, bases its ethnicity estimate on a reference panel built from the DNA of 16,638 people representing 43 different populations. The people in the reference panel are screened to ensure they strongly represent a particular ethnicity: “people with a long family history in one place or within one group,” the company explains. The screening includes controls, such as removing close relatives, to avoid skewing the ethnicity profile.
While this pre-screened data can identify ethnicity on a broad level, more detail comes only with more data. Every DNA test kit sent in adds to the company’s database. That’s why leading contenders AncestryDNA and 23andMe have some of the best estimates available—they have more customers, and therefore more data.
Because DNA tests like AncestryDNA and 23andMe were at first available only in the United States and have since expanded mostly to European countries or former colonies, the customer base remains fairly uniform. ISOGG estimates that four-fifths of the people who have taken DNA tests are U.S. citizens, which means the data reflects a population with majority European ancestry.