Code Acts in Education: Polygenic Scores as Political Technologies in Educational Genomics
Polygenic scores are summary statistics invented in biomedical genetics research to estimate a person’s risk of developing a disease or medical condition, and are often envisaged as the basis for “personalized” or “stratified medicine”. In recent years, social and behavioural genetics researchers have begun suggesting polygenic scores could be used in education too, raising significant concerns along scientific, ethical and political lines.
The publication in June 2024 of a research article titled “Exploring the genetic prediction of academic underachievement and overachievement” shows that polygenic scoring remains a popular methodology in studies of genetics and education. Its authors argue that school achievement can be “genomically predicted” using “genome-wide polygenic scores”. The paper is part of a long-running series of studies by a team mostly associated with the Twins Early Development Study (TEDS, established in 1994 as a longitudinal study of around 15,000 pairs of twins in the UK). Over the past decade, the team has increasingly used polygenic scores (as an earlier paper is titled) for “Predicting educational achievement from DNA”.
In this post I approach polygenic scores for predicting educational achievement as technologies with political properties. Part of our ongoing BioEduDataSci research project funded by the Leverhulme Trust, it follows up from a previous post outlining how “educational genomics” research may be paving the way for the use of genetic samples in educational policy and practice, and another highlighting the severe ethical problems and scientific controversies associated with educational genomics.1 Here I use the new predictive genetic achievement paper to foreground some of the political implications of educational genomics.
Biomarker methodologies
Understanding the political aspects of polygenic scores2 requires some engagement with their construction as methodological technologies. Polygenic scores are artefacts of a complex technoscientific infrastructure of statistical genetics, molecular genomic databases, bioinformatics instruments, analytics algorithms, and the institutions that orchestrate them, which together function to make valued social outcomes—such as educational outcomes—appear legible at a molecular level of analysis.
To construct polygenic scores, researchers require genotyped DNA data, which they then analyze through genome-wide association study methods and technologies. These identify minute genetic differences, the genetic biomarkers known as single nucleotide polymorphisms or SNPs, that are associated with a phenotype (an observable behaviour, trait, or social outcome). One aim of such studies is to identify the “genetic architecture” of a trait or outcome, such as the genetic architecture of educational attainment.
The SNPs associated with the phenotype can then also be added up into a genetic composite known as a polygenic score. In education, the most common polygenic scores are for educational attainment (years of school), said to predict around 11% of the variance. Individuals can ultimately be ranked on a scale of the genetic probability of success at school.
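To make the "adding up" concrete, here is a minimal sketch in Python. It assumes the common textbook form of a polygenic score: each SNP carries an effect weight estimated by a genome-wide association study, each person has 0, 1 or 2 copies of the effect allele, and the score is the weighted sum. The SNP names, weights and genotypes are invented for illustration and do not come from any study discussed here; real pipelines also adjust the weights in ways this sketch ignores.

```python
from statistics import mean, pstdev

# Hypothetical GWAS effect weights per SNP (invented values).
weights = {"snp_a": 0.02, "snp_b": -0.01, "snp_c": 0.03}

# Hypothetical genotypes: copies (0, 1 or 2) of the effect allele at each SNP.
genotypes = {
    "child_1": {"snp_a": 2, "snp_b": 0, "snp_c": 1},
    "child_2": {"snp_a": 0, "snp_b": 2, "snp_c": 0},
    "child_3": {"snp_a": 2, "snp_b": 1, "snp_c": 2},
}


def polygenic_score(genotype, weights):
    """Weighted sum of effect-allele counts across all weighted SNPs."""
    return sum(weights[snp] * genotype[snp] for snp in weights)


# Raw score for each person in the sample.
raw = {pid: polygenic_score(g, weights) for pid, g in genotypes.items()}

# Standardize within the sample, so each score is a position on a scale
# relative to the other people in the same dataset.
mu, sd = mean(raw.values()), pstdev(raw.values())
z = {pid: (score - mu) / sd for pid, score in raw.items()}

# Rank individuals from highest to lowest score.
ranking = sorted(z, key=z.get, reverse=True)
print(ranking)
```

Even in this toy form, the ranking depends on which SNPs are included, how they are weighted, and which sample the scores are standardized against.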
The use of polygenic scores and associated methods and measures represents the data-centric “biomarkerization” of education, where biological signals are taken as objective evidence of the embodied substrates of academic attainment and achievement. This has only become possible with the development of an infrastructure of biobanks of genetic information and bioinformatics technologies, which can be used to generate and analyze genetic data for markers associated with educational outcomes.
In the latest genetic prediction of academic achievement paper, for example, the authors claim a “DNA revolution has made it possible to predict individual differences in educational achievement from DNA rather than from measures of ability or previous achievement”.3 Their basic claim is that technologies to calculate polygenic scores can operate as “early warning systems” to predict school achievement from infancy. The latest study design used TEDS data collected from children at age 7 to construct polygenic scores, drawing on a previous study of educational attainment with a sample of 3 million people (which I’ve discussed before).
The paper introduces “the concept of genomically predicted achievement delta (GPAΔ), which reflects the difference between children’s observed academic achievement and their expected achievement”, where the former are standardized test achievements and the latter are polygenic predictions. So, the methodological invention of the paper—the measure of genomically predicted achievement—is ultimately a way of comparing a child’s observed academic achievement, as assessed by school test results, with a polygenic score predicting “genomically expected achievement” from DNA samples collected in childhood.
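The logic of an achievement "delta" can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's actual procedure: it assumes "expected" achievement is the fitted value from a simple least-squares line relating test scores to polygenic scores, and that the delta is observed minus expected. The data values, the function name `fit_line` and the labelling rule are all hypothetical.

```python
from statistics import mean

# Hypothetical standardized polygenic scores and test scores (invented values).
gps = [-1.0, 0.0, 1.0, 2.0]
observed = [0.5, -0.5, 0.0, 2.0]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


slope, intercept = fit_line(gps, observed)

# "Genomically expected" achievement: what the fitted line predicts from the score.
expected = [intercept + slope * g for g in gps]

# The delta: observed achievement minus expected achievement.
delta = [o - e for o, e in zip(observed, expected)]

# Assumed labelling rule: anyone below their prediction is an "underachiever".
labels = ["overachiever" if d > 0 else "underachiever" for d in delta]
print(list(zip(delta, labels)))
```

In this sketch the labels are entirely a product of the fitted model: change the scores, the sample or the regression, and the same child can move from one category to the other.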
Biosocial sorting
These conceptual inventions and large-scale statistics certainly lend the study the quality of digital objectivity. But the critical point here is that the polygenic scores used in the study, and the genomically predicted achievement measures, are the results of social, technical and scientific practices, each of which can affect the results. As Callie Burt has noted in a detailed critical examination of how polygenic scores are made (and their limitations), there are multiple ways to create polygenic scores, each involving different assumptions and goals, measurement instruments, technical adjustments, calculation methods, and analysis specifications, which can introduce further technical biases.
Detailed analysis of the shortcomings of the methodology and findings of the genomic achievement study was posted on Twitter by statistical geneticist Sasha Gusev, questioning its causal claims and predictive accuracy. He also showed how methodological choices and limitations in the research (particularly insufficient acknowledgement of social factors) meant that the “underachievers” it identified were actually individuals with high socioeconomic status and high early years achievement, who subsequently underperform at school.4
The study risks labelling and lowering expectations of “underachievers” as having lower education-related “genetic propensity” (as the TEDS team terms it) for achievement, while also privileging well-off kids by directing additional resources their way. And as Gusev points out, any allocation of resources from the study findings would therefore be targeted at “students from high-SES/high-edu backgrounds, while telling ‘overachievers’ (poor kids with good grades) that they’re swimming upstream”, seemingly against the genetic currents determining their achievement prospects.
The implication, then, is that polygenic methods could be used to classify children into groups defined and labelled in terms of genomically predicted levels of achievement. This would amount to a strategy of biosocial classification of children, that is, the categorization of social groupings as defined by biological measures. In this case, it means sorting children into polygenic biosocial categories through the analysis of SNP biomarkers corresponding with school achievement, in ways that appear to reproduce and reinforce socioeconomic categories and biases.
What this indicates, then, is that despite the seeming objectivity imputed to genomic technologies, polygenic scores and associated measures remain methodologically problematic and potentially skewed in their results. Such studies can harden social biases and inequalities even as major claims are made that they could inform decisions about the just allocation of resources in schools.
Promissory politics
Beyond its biosocial sorting, this kind of polygenic scoring project can also exert other kinds of political effects. The political allure of genetic objectivity and biological authority in polygenic scoring studies appears to be growing, supported by promissory claims of the future potential of genomic technologies to further reveal genetic insights at even larger scale.
As already noted, one political implication of educational genomics research is that the results—predictions of educational outcomes from DNA—could be used as the basis for political interventions targeting children genomically predicted as at risk of underachievement. As discussed elsewhere, some authors of the study were involved in a report for the Early Intervention Foundation (a UK government “what works” centre), which made the case for genetic “screen and intervene” programs in the early years.
The collection of TEDS data from 7-year-olds in the 1990s has given these researchers tremendous bioinformational advantage to make claims to policy relevance. A main claim of the latest genetic achievement paper is that “screening for GPAΔ could eventually be a valuable early warning system, helping educators identify underachievers early in development”. From such genetic early warning signals, it seems, should flow early interventions “targeting students underachieving genomically”.
The seeming relevance of this work to policy and practice needs to be understood as deriving from political interest in the potential and promise of data-driven science, supported by the development of genomics technologies by major biotech firms. The methods section of the genomically predicted achievement paper, for example, details how “DNA for 12,500 individuals in the TEDS sample was extracted from saliva and buccal cheek swab samples and hybridized to one of two SNP microarrays (Affymetrix GeneChip 6.0 or Illumina HumanOmniExpressExome chips)”. It also involved use of the application LDPred2 to “compute GPS for all genotyped participants”, and “training” a “model to maximize prediction”.
This existing apparatus of technologies, however, is presented as just the first step necessary to fully compute genomically expected achievement across the whole population of children, which will only become possible with increased DNA data.
GPAΔ seems impractical now because it requires DNA, genotyping, and the creation of GPS. However, the rise in direct-to-consumer DNA testing suggests a future where GPAΔ becomes more accessible. At least 27 million people have paid direct-to-consumer DNA testing companies for this service, and these companies are increasingly marketing their product to encourage parents to test their children. … Once genotyping is available by whatever means, it will be possible to create GPS for educationally relevant traits, a process that is becoming routinized.
Educational genomics articles like this one routinely invoke promissory claims of future potential, once the existing infrastructure of mass biodata storage, genotyping platforms and polygenic scoring software has been sufficiently upgraded. As this excerpt indicates, the biological authority of educational genomics depends to a significant degree on biotech firms and consumer genetics companies.
It is this promissory quality associated with technological advances that enables researchers involved in educational genomics studies to claim moral and political authority to not only understand but to improve social institutions like schooling—and likewise to criticize forms of social science and policy that do not incorporate genetic measures as ideologically irresponsible.
In other words, genomic technologies are invoked to support the political project of advancing the power and authority of genetic sciences in social policy areas like education. A recent report by the UK Government Office for Science, for example, asked “What could genomics mean for wider government?” It highlighted how existing infrastructures of medical genomics could be capitalized on for other social policy areas, and proposed education as one key area of potential application.
Educational genomics studies, enabled by new genetic technologies, therefore support visions of future policy possibilities. The idea is that genetic testing and screening could become policy technologies, if only the necessary infrastructure upgrades are put in place.
Genoeconomic policy
The idea of genetic testing as a policy approach is obviously controversial, given the history of eugenic interventions in education. It does, however, appear to link neatly with current mainstream policy approaches. Critics have pointed out that educational genomics proposals often reinforce “technocratic” or “neoliberal” policy models that treat education as a kind of laboratory for boosting economic outcomes and social mobility, and which promise to reduce costs and save money for government agencies and taxpayers. Such promises may reduce the seeming controversy associated with the science by appealing to political expedience.
Along these lines, in the genomic achievement paper, the authors claim that “Targeting GPAΔ might also prove cost-effective because such interventions seem more likely to succeed by going with the genetic flow rather than swimming upstream, helping GPAΔ underachievers to reach their genetic potential”. Later in the paper, they add that the “findings suggest that GPAΔ can help identify underachievers in the early school years, with the rationale of maximizing their achievement by personalizing their education”.
So the policy relevance of the paper appears again to be “cost-effective” interventions in early school years, driven by the aim to increase individual achievement through “personalized” learning. Such proposals certainly look like biomedicalized neoliberal policy, where measurable individual achievement might be bumped up through the efficient genomically-targeted allocation of resources. The cost-saving argument for using genetic data for decision-making in education has also been made in the popular science book The Genetic Lottery.
As the opening sentence of the paper reads, “Underachievement in school is costly to society and to the children who fail to maximize their potential”—with a citation to a paper about the “economics of investing in disadvantaged children” by economist James Heckman. Heckman is well known for his work calculating the economic payoffs of investment in early years child development – the “economization of early life” as Zach Griffen describes it – which is central to the model of “human capital development” he promotes to policymakers.
Other papers by the same TEDS team and their collaborators invoke studies by the OECD similarly citing the importance of education to economic outcomes, in ways that appear to amount to a program of hunting for biological signals of human capital in the genome. Many other educational genomics studies are, in fact, led by economists (or self-described “genoeconomists”) who first latched on to the idea that genetic data about educational outcomes could be used to understand the genetic basis of other downstream socioeconomic outcomes. Ultimately, this work frames political investment in genetic testing as an investment in economic outcomes, potentially diverting resources from other forms of intervention based on non-genetic analyses.
Educational genomics research and advocacy therefore suggest the emergence of genoeconometric education policy, buttressing and fortifying existing econometric tendencies in international education policy with seemingly objective data about the genetic substrates of outcomes. Whether there is genuine political (or public) appetite for this remains to be seen, but clearly the data and the proposals are being presented and circulated in ways that are intended to promote genoeconometric solutions, such as early years screen and intervene programs, to address the relationship between children’s outcomes, human capital development and economic prospects.
Biopolitical technologies
There are several reasons to question the assertion that genomic or genoeconometric education policy based on polygenic scores would be a good idea socially, politically or ethically. They include risks that the use of genetic information may lead to forms of biological reductionism, discrimination, stigmatization, racism, self-fulfilling prophecies, or distract from other forms of intervention.
Even if a genomic prediction of achievement outcomes can be made reliably, as the TEDS paper claims, it remains unclear exactly what causal biological mechanisms are associated with it. Although educational genomics research studies are increasingly high-powered in computational and data processing terms, they have very partial explanatory power and remain far from specifying the genetic mechanisms that underpin educational outcomes like achievement or attainment. Statistically speaking, the “genetic architecture” of educational outcomes may have become legible–as thousands of SNP associations–but the actual biology remains unknown.
Another major problem is the thorny issue of race and ethnicity in social and behavioural genetics research, and the eugenic legacy underpinning such science. As the TEDS authors themselves acknowledge, polygenic scores are affected by “cultural bias” because existing datasets over-represent healthy, white, well-educated, and wealthier than average individuals of European ancestry. Any intervention based on genomic data would necessarily exclude all other groups, since the data do not exist to support polygenic prediction beyond European population groups, and would therefore be politically untenable on equity grounds. The findings from such studies can also be appropriated to support racist assertions of biological superiority and inferiority in intelligence, or “function normatively to reinforce conceptions of race as an innate and immutable fact that produces racial inequalities”.
A final issue, for now, is that educational genomics studies persistently obscure the social and environmental factors that shape educational achievement, while overplaying the influence of genetic transmission. Even where social and environmental factors are considered, they may be simplified into reductive measures of socioeconomic status or family factors, rather than taking account of complex social and political structures, dynamics and their impacts. As in other studies of gene-environment interactions, social factors may even be “re-defined in terms of their molecular components”, shifting away “from efforts to understand social and environmental exposures outside the body, to quantifying their effects inside the body”.
Given these issues (unknown biology, non-representativeness, the spectre of race science, and the obscuring of social factors), it is hard to see how the genomically predicted educational achievement findings could translate into genomically targeted educational interventions.
The study does, though, show how polygenic scores and associated genomic methods and measures can function as political technologies. They enable social and behavioural genomics scientists to claim objective, data-based biological authority, despite methodological limitations, while criticizing other forms of non-genetic investigation into the social determinants of school achievement as morally and ideologically irresponsible. The use of genomic technologies also supports particular kinds of political interventions that prioritize cost efficiency and achievement maximization according to economic “human capital” conceptions of educational purpose.
Polygenic scores support a biomarkerized model of schooling that centres the idea of genetic testing and predicting academic achievement in order to target interventions on genetic groupings of students to boost economic metrics, rather than alternative kinds of reform. They help support the solidification of economic models of schooling that have dominated education policy and politics for decades, albeit with a genetic twist that treats societal progress and human capital as embodied in the human genome.
Perhaps it is more accurate, therefore, to call polygenic scores “biopolitical” technologies–that is, techniques that enable knowledge about living processes to be produced and used as the basis for governing interventions. As biopolitical technologies used in educational genomics research, polygenic scores now support the production of knowledge about the genetic correlates of learning achievements and the potential biosocial sorting of children.
That genetic knowledge is now being promoted as the basis for proposing genetically-informed education policy interventions targeting children’s school achievement. But there remain many important reasons to question whether biopolitical technologies of early years mass genetic testing and screening should ever make the leap from the lab to school systems.
Notes
1. To be clear, “educational genomics” is not a unique scientific field, but our name for a body of research on the genetic underpinnings of educational outcomes–and gene-environment interactions–largely carried out by scientists in fields of behaviour genetics and social science genomics (sociogenomics). Different groups and individuals do not always agree about findings, and there is particular controversy among them about the policy relevance (or not) of such work.
2. Polygenic scores (PGS) are also sometimes referred to as genome-wide polygenic scores (GPS), polygenic risk scores (PRS), or more recently polygenic indices (PGI). Callie Burt critically discusses the recent proposal to term them PGIs, convincingly noting that ‘the shift to index potentially obscures the fact these are “rankings” (i.e., positions on a scale) of genetic associations with socially valued outcomes, whether we call them scores or indices’.
3. A distinction is often made between “prediction” in the biostatistical sense–that a genetic measure is strongly correlated with an outcome or trait–and prediction as a way of making forecasts about the future. In the study discussed here, and elsewhere, that distinction dissolves, and genetic prediction through polygenic scores becomes “fortune telling”.
4. Gusev has also written a thorough technical analysis of the heritability of educational attainment, where he argues that “Cultural transmission and environment is much more important than genetic transmission”, though this is often under-reported in published studies and particularly in press coverage.
This blog post has been shared by permission from the author.
The views expressed by the blogger are not necessarily those of NEPC.