“A world where people are slotted according to their inborn ability – well, that is Gattaca. That is eugenics.”
This was the assessment of Dr. Catherine Bliss, a sociologist who wrote a new book on social science genetics, when asked by MIT Technology Review about polygenic scores that can predict a person’s intelligence or performance in school. Like a credit score, a polygenic score is a statistical tool that combines a lot of information about a person’s genome into a single number. Fears about using polygenic scores for genetic discrimination are understandable, given this country’s ugly history of using the science of heredity to justify atrocities like forcible sterilization. But polygenic scores are not the new eugenics. And rushing to discuss polygenic scores in dystopian terms only contributes to widespread public misunderstanding about genetics.
Let’s begin with some background on how polygenic scores are developed. In a genome-wide association study, researchers conduct millions of statistical tests to identify small differences in people’s DNA sequence that are correlated with differences in a target outcome (beyond what can be attributed to chance or ancestry differences). Successful studies of this sort require enormous sample sizes, but companies like 23andMe are now contributing genetic data from their consumers to research studies, and national biorepositories like U.K. Biobank have put genetic information from hundreds of thousands of people online. When applied to studying blood lipids or myopia, this kind of study strikes people as a straightforward and uncontroversial scientific tool. But it can also be conducted for cognitive and behavioral outcomes, like how many years of school a person has completed. When researchers have finished a genome-wide association study, they are left with a dataset with millions of rows (one for each genetic variant analyzed) and one column with the correlations between each variant and the outcome being studied.
The trick to polygenic scoring is to use these results and apply them to people who weren’t participants in the original study. Measure the genes of a new person, weight each one of her millions of genetic variants by its correlation with educational attainment from a genome-wide association study, and then simply add everything up into a single number. Voila! You’ve created a polygenic score for educational attainment. On its face, the idea of “scoring” a person’s genotype does immediately suggest Gattaca-type applications. Can we now start screening embryos for their “inborn ability,” as Bliss called it? Can we start genotyping toddlers to identify the budding geniuses among them?
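For readers who like to see the arithmetic, the "weight each variant and add everything up" recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variant names, the weights, and the `polygenic_score` function are all made up for this example, a real score would sum over millions of variants rather than three, and real pipelines add steps (such as quality control and ancestry adjustment) that are not shown here.

```python
from typing import Dict


def polygenic_score(genotype: Dict[str, int], weights: Dict[str, float]) -> float:
    """Return the weighted sum of allele counts.

    genotype: variant ID -> number of copies of the counted allele (0, 1, or 2).
    weights:  variant ID -> weight taken from a genome-wide association study.
    Variants that have no weight are skipped.
    """
    return sum(weights[v] * count for v, count in genotype.items() if v in weights)


# Hypothetical study results: one weight per variant (invented values).
gwas_weights = {"variant_a": 0.25, "variant_b": -0.5, "variant_c": 0.125}

# Hypothetical new person who was not part of the original study.
new_person = {"variant_a": 2, "variant_b": 1, "variant_c": 2}

score = polygenic_score(new_person, gwas_weights)
print(score)  # 2*0.25 + 1*(-0.5) + 2*0.125 = 0.25
```

The point of the sketch is how little the score itself "knows": it is a sum of small weighted counts, and everything it can say about a person comes from the correlations found in the original study.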
The short answer is no. Here are four reasons why dystopian projections about polygenic scores are out of touch with the current science:
First, a polygenic score currently predicts the life outcomes of an individual child with a great deal of uncertainty. The amount of uncertainty around polygenic predictions will decrease in the future, as genetic discovery samples get bigger and genetic studies include more of the variation in the genome, including rare variants that are particular to a few families. But for now, knowing a child’s polygenic score predicts his ultimate educational attainment about as well as knowing his family’s income, and slightly worse than knowing how far his mother went in school. These pieces of information are also readily available about children before they are born, but no one is writing breathless think-pieces about the dystopian outcomes that will result from knowing whether a pregnant woman graduated from college.
Second, using polygenic scoring for embryo selection requires parents to create embryos using reproductive technology, rather than conceiving them by having sex. The prediction that many women will endure medically unnecessary IVF in order to select the embryo with the highest polygenic score glosses over the invasiveness, indignity, pain, and heartbreak that these hormonal and surgical procedures can entail.
Third, and counterintuitively, a polygenic score might be using DNA to measure aspects of the child’s environment. Remember, a child inherits her DNA from her parents, who typically also shape the environment she grows up in. And children’s environments respond to their unique personalities and temperaments. One Icelandic study found that parents’ polygenic scores predicted their children’s educational attainment, even if the score was constructed using only the half of the parental genome that the child didn’t inherit. For example, imagine mom has genetic variant X that makes her more likely to smoke during her pregnancy. Prenatal exposure to nicotine, in turn, affects the child’s neurodevelopment, leading to behavior problems in school. The school responds to his behavior problems with suspension, causing him to miss out on instructional content. A genome-wide association study will collapse this long and winding causal path into a simple correlation: “genetic variant X is correlated with academic achievement.” But a child’s polygenic score, which includes variant X, will partly reflect his likelihood of being exposed to adverse prenatal and school environments.
Finally, the phrase “DNA tests for IQ” makes for an attention-grabbing headline, but it’s scientifically meaningless. As I’ve written previously, it makes sense to talk about a bacterial test for strep throat, because strep throat is a medical condition defined as having streptococcal bacteria growing in the back of your throat. If your strep test is positive, you have strep throat, no matter how serious your symptoms are. But a polygenic score is not a test “for” IQ, because intelligence is not defined at the level of someone’s DNA. It doesn’t matter how high your polygenic score is if you can’t reason abstractly or learn from experience. Equating your intelligence, a cognitive capacity that is tested behaviorally, with your polygenic score, a number that is a weighted sum of genetic variants discovered to be statistically associated with educational attainment in a hypothesis-free data mining exercise, is misleading about what intelligence is and is not.
So, if we’re not going to build a Gattaca-style genetic hierarchy, what are polygenic scores good for? They are not useless. In fact, they give scientists a valuable new tool for studying how to improve children’s lives. The task for many scientists like me, who are interested in understanding why some children do better in school than other children, is to disentangle correlation from causation. The best way to do that is to run an experiment where children are randomized to environments, but often a true experiment is unethical or impractical. You can’t randomize children to be born to a teenage mother or to go to school with inexperienced teachers. By statistically controlling for some of the relevant genetic differences between people using a polygenic score, scientists are better able to identify potential environmental causes of differences in children’s life outcomes. As we have seen with other methods from genetics, like twin studies, understanding genes illuminates the environment.
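To make "statistically controlling for" a polygenic score concrete, here is a toy simulation in Python. Everything in it is an assumption made for illustration: the data are randomly generated rather than real, the effect sizes (0.3, 0.5, 0.6) are invented, and the variable names are hypothetical. It is a sketch of the general idea of regression adjustment, not a description of any particular study's method.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


random.seed(0)
n = 5000
TRUE_ENV_EFFECT = 0.3  # assumed causal effect of the environment on the outcome

# Simulated children: the polygenic score is correlated with the environment
# (e.g. via the parents) and also with the outcome, so it confounds the two.
pgs = [random.gauss(0, 1) for _ in range(n)]
env = [0.5 * g + random.gauss(0, 1) for g in pgs]
outcome = [TRUE_ENV_EFFECT * e + 0.6 * g + random.gauss(0, 1)
           for e, g in zip(env, pgs)]

# Naive estimate: regress the outcome on the environment alone.
naive = cov(env, outcome) / cov(env, env)

# Adjusted estimate: the environment's coefficient from a least-squares
# regression of the outcome on both the environment and the polygenic score.
v_e, v_g, c_eg = cov(env, env), cov(pgs, pgs), cov(env, pgs)
adjusted = (cov(env, outcome) * v_g - cov(pgs, outcome) * c_eg) / (v_e * v_g - c_eg ** 2)

print("naive estimate:   ", round(naive, 2))
print("adjusted estimate:", round(adjusted, 2))
print("true effect:      ", TRUE_ENV_EFFECT)
```

In this simulated world, the naive estimate overstates the environment's effect because part of the correlation runs through genetics, while the adjusted estimate lands closer to the true value. Real polygenic scores capture only some of the relevant genetic differences, so in practice the adjustment is partial.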
Research that examines genetics in relation to social inequality, such as differences in higher education outcomes, will obviously remind people of the horrors of the eugenics movement. Wariness regarding how genetic science will be applied is certainly warranted. But polygenic scores are not pure measures of “inborn ability,” and genome-wide association studies of human intelligence and educational attainment are not inevitably ushering in a new eugenics age.