President's Column: Rethinking the Role of the GRE
C. Urry, Yale University
An Open Letter to Chairs of Departments That Grant Degrees in the Astronomical Sciences:
I am writing about an issue of concern to the American Astronomical Society (AAS), namely, graduate admissions. In January, the AAS Council will discuss and vote on whether to issue a statement on behalf of the Society (appended at the end of this letter) that makes a case for why the Graduate Record Exam (GRE) and the Physics GRE (PGRE) should be optional; or, if they are used, why there should be no fixed cutoff score; and why the demographics of the applicants may need to be taken into account explicitly. I write in advance of that action because the season of graduate admissions is upon us. I hope you will read this letter and the draft statement and circulate them to your graduate admissions committee. If you have any comments or concerns, I hope you will send them to me and/or [email protected].
Many departments require applicants to report (P)GRE scores. However:
- Available data show only a weak correlation between (P)GRE test scores and success in graduate school or thereafter (e.g., the correlations between test score and long-term outcomes have linear correlation coefficients <0.2; see draft statement for details). According to a recent survey, the winners of our national prize postdoctoral fellowships (Hubble, Einstein, Sagan, Jansky, NSF) have been astronomers with PGRE scores spanning the entire range, consistent with the low correlation of these exams with long-term success. Non-cognitive skills, i.e., personal characteristics like determination, motivation, and "grit," appear to be at least as important as the (P)GRE. If so, it is important to incorporate non-cognitive skills in our admissions decision-making processes.
- The data also show disturbing and much stronger correlations of score with gender and race or ethnicity. Correcting (P)GRE scores for these dominant systematics allows the weak correlation noted above to emerge. Applying a fixed cutoff score for the (P)GRE dramatically lowers the fraction of women and students of color who are admitted to our graduate programs relative to the fraction of white and Asian men who are admitted. Can we justify this disparity if there is little association between score and successful completion of a PhD?
Our concern ought to be with what we want the test to measure, namely, likelihood of success long after admission, and we should strive for a standard consistent with that aim. If there are gender (or other) systematics in the correlation of test score and performance, then we need to take those systematics into account. We should think about using measures that are less affected by nuisance systematics, present less of a cost barrier to students, and are more predictive of outcomes.
The AAS Committee on the Status of Minorities in Astronomy, in consultation with the AAS Committee on the Status of Women in Astronomy and the AAS Committee for Sexual-Orientation and Gender Minorities in Astronomy, drafted the attached statement about the use of the (P)GRE exams in graduate admissions. We will be discussing this statement with the Council at the AAS meeting in Kissimmee, Florida, in January. I hope you will discuss these issues and this draft statement with your colleagues over the next few months, as you select next year's incoming class. It's a matter of great concern to all of us: failing to draw from the full pool of talent weakens our profession. It's vitally important to train leaders who will help our profession achieve true equity and inclusion, and thus the strongest possible astronomical community.
Thank you very much for thinking about this — I look forward to hearing your thoughts.
President, American Astronomical Society
*** DRAFT *** AAS Statement on Limiting the Use of GRE Scores in Graduate Admissions in the Astronomical Sciences *** DRAFT ***
Each year, roughly 55,000 physical science majors take the Graduate Record Exam (GRE) and 5,000 take the Physics Subject Exam (PGRE). Both the GRE and PGRE are widely used in the astronomical community as a metric to rank graduate talent. Most US graduate programs in the astronomical sciences require the GRE and PGRE to evaluate applicants. In addition, GRE scores are required by several major fellowships (including the National Academies Ford Foundation Fellowship, the National Defense Science and Engineering Graduate Fellowship, and most NASA fellowships). They are also used to rank graduate programs by organizations such as US News and World Report and the National Research Council.
The evidence, however, suggests that GRE and PGRE scores are poor predictors of success in graduate study in the astronomical sciences. Glanz (1996) demonstrated that GRE scores are weakly correlated with average grades in graduate physics courses at Harvard University. Sternberg & Williams (1997) demonstrated that GRE scores fail to correlate with several key skills for graduate study, including analytical thinking, creativity, research acumen and teaching, and correlate only modestly with first-year grade point average. Preliminary research indicates similarly weak predictive power for the PGRE. To be clear, the predictive power of these exams is not zero; longitudinal meta-analytic studies do find statistically significant linear correlation coefficients at the 0.1-0.2 level between test scores and long-term outcomes such as citations and scholarly output decades later. However, these correlations emerge only through multivariate analyses that control for the more dominant correlations of test scores with demographic variables — systematics for which graduate admissions committees rarely correct quantitatively.
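To put the quoted coefficients in perspective: for a simple linear relation, the fraction of variance in an outcome that a predictor accounts for is the square of the correlation coefficient. The sketch below is only an illustration of that arithmetic for the 0.1-0.2 range cited above; the function name is hypothetical and no data from the cited studies are used.

```python
def variance_explained(r: float) -> float:
    """Fraction of outcome variance accounted for by a linear predictor
    with Pearson correlation coefficient r (that is, r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    return r * r


# The statement quotes correlations at the 0.1-0.2 level.
for r in (0.1, 0.2):
    print(f"r = {r:.1f} -> {variance_explained(r):.0%} of variance explained")
# r = 0.1 -> 1% of variance explained
# r = 0.2 -> 4% of variance explained
```

Under this reading, even the upper end of the quoted range leaves roughly 96% of the variation in long-term outcomes unaccounted for by the test score.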
Indeed, because the tests have such strong systematics, the use of GRE and PGRE scores as a measure of potential success has well-documented and powerful effects on the demographics of the resulting graduate cohorts. Halley et al. (1991) showed that GRE performance correlates with whether the undergraduate institution has a graduate program, implicitly penalizing students from many liberal arts colleges. Research by the Educational Testing Service (ETS), and more recently by Miller & Stassun (2014), demonstrates that GRE scores correlate with demographic characteristics unrelated to potential for graduate study, such as gender, race, and socioeconomic status. These correlations persist even in the GRE's recently revised general test. Such demographic correlations are a feature of standardized exams more generally (e.g., Helms 2009) and may well be the result of stereotype threat, the fear of confirming negative stereotypes about one's own group (Steele & Aronson 1995). Miller & Stassun show that misusing GRE scores, particularly by establishing score thresholds, fuels the underrepresentation of white women and minorities in graduate programs. ETS itself states, "A cutoff score [on the GRE] should never be used as the only criterion for denial of admission or awarding of a fellowship."
A third issue with the GRE is the financial burden it places on test takers. Students currently pay $195 to take the GRE and $150 to take the PGRE, as well as $27 for each institution/fellowship they designate to receive an official score beyond an initial four. Considering that students often take these exams multiple times (particularly the PGRE) and apply to 5-10 graduate programs, these tests require a significant investment. While ETS has a Fee Reduction Program that covers 50% of exam costs, it applies to a single test and has stringent eligibility requirements. Fulfilling the GRE requirement is thus beyond the means of many students.
Based on this research, several physics and astronomy graduate programs and fellowships, most notably the NSF Graduate Research Fellowship Program (GRFP), have dropped the GRE and/or PGRE from their admissions or application requirements. The National Society of Hispanic Physicists (NSHP) recently called for a critical reevaluation of the use of the GRE as an admissions metric. Nevertheless, Miller (2013) found that 96% of physics programs retain them, and over half specify cutoffs. As an alternative, some programs have begun to incorporate measures of non-cognitive skills (e.g., structured interviews that specifically assess these skills) as less biased and much stronger predictors of potential for long-term success.
Recommendation: Given the research indicating that the GRE and PGRE are poor predictors of graduate student success, that their use in graduate admissions has a particularly negative impact on underrepresented groups, and that they represent a financial burden for many students in pursuing advanced degrees in the astronomical sciences, the AAS recommends that graduate programs eliminate or make optional the GRE and PGRE as metrics of evaluation for graduate applicants. If GRE or PGRE scores are used, the AAS recommends that admissions criteria account explicitly for the known systematics in scores as a function of gender, race, and socioeconomic status, and that cutoff scores not be used to eliminate candidates from admission, scholarships/fellowships, or financial support, in accordance with ETS recommendations.
Levesque, E., Bezanson, R., & Tremblay, G. (2015). Physics GRE Scores of Prize Postdoctoral Fellows in Astronomy. To be posted to astro-ph on 11 December 2015.
Glanz, J. (1996). How Not to Pick a Physicist? Science 274, 710.
Sternberg, R. & Williams, W. (1997). Does the Graduate Record Examination Predict Meaningful Success in the Graduate Training of Psychologists? American Psychologist 52, 630-641.
Miller, C. (2015). Preliminary analysis presented at Inclusive Astronomy 2015, https://www.youtube.com/watch?v=96vJQCov8Do
Halley, J. W. et al. (1991). The Graduate Record Examination as an indicator of learning of the curriculum taught to physics majors in US institutions. American Journal of Physics 59, 403.
Miller, C. & Stassun, K. G. (2014). A test that fails: A standard test for admission to graduate school misses potential winners. Nature Careers 510, 303.
Helms, J. E. (2009). Defense of tests prevents objective considerations of validity and fairness. American Psychologist 64, 283-284.
Steele, C. M. & Aronson, J. (1995). Stereotype Threat and the Intellectual Test Performance of African Americans. Journal of Personality and Social Psychology 69, 797.
A great resource on stereotype threat is http://www.reducingstereotypethreat.org/
See http://ainsleydiduca.com/grad-schools-dont-require-gre/#Sciences for a subset of these institutions.
Stassun et al. (2011). The Fisk-Vanderbilt Master's-to-Ph.D. Bridge Program: Recognizing, enlisting, and cultivating unrealized or unrecognized potential in underrepresented minority students. American Journal of Physics 79, 374. See also http://fisk-vanderbilt-bridge.org/tool-kit/