International Test Score Comparisons and Educational Policy: A Review of the Critiques

Carnoy, Martin

Martin Carnoy

October 30, 2015

Publication Announcement

Stanford education professor Martin Carnoy examines four main critiques of how international test results are used in policymaking. Of particular interest are critiques of the policy analyses published by the Program for International Student Assessment (PISA).

Using average PISA scores as a comparative measure of student achievement is misleading for a number of reasons, Carnoy maintains:

Students in different countries have different levels of family academic resources.
The larger gains reported on the TIMSS, which is adjusted for different levels of family academic resources, raise questions about the validity of the PISA results when used for international comparisons.
PISA test score error terms are “considerably larger” than the testing agencies acknowledge, making the country rankings unstable.
The Shanghai educational system is held up as a model for the rest of the world on the basis of non-representative data.

Of further concern is the conflict of interest that arises when the Organization for Economic Cooperation and Development (OECD), which administers the PISA, and its member governments act as a testing agency while simultaneously serving as data analyst and interpreter of results for policy purposes.

Carnoy considers the critiques within a discussion of the underlying social meaning and education policy value of international comparisons in general. He describes why using average national math scores as predictors of future economic growth is problematic, and points out that test score data used in this manner are of limited value for establishing education policy because causal inferences cannot be meaningfully drawn.

Finally, Carnoy explores the relevance of nation-level test score comparisons for countries, such as the United States, that have diverse and complex education systems. The differences between U.S. states are so large, for example, that using state-level test results over time to examine the impact of education policies would be more useful and interesting than using combined U.S. data.

Despite valid critiques of international test result comparisons, Carnoy argues that the comparisons will neither go away nor stop being inappropriately used to shape educational policy. He concludes with five policy recommendations to reduce the misuse of testing data.