I recently read the internal report, “Gifted and Talented Program Review,” posted on the WW-P district website. One thing that caught my eye is its recommendation to “Eliminate the A&E identified program in grades 4 & 5” because “Data collected shows no statistically significant difference in students starting the program in grades 4 & 5 versus students who start the program in middle school.”
As a researcher and engineer whose work involves analyzing statistical results, I found that this report has, unfortunately, misinterpreted the statistics and drawn an irrelevant conclusion.
Some background on the A&E mathematics program: it is designed to meet the needs of students who are talented in mathematics. Currently the program selects students through exams, starting in fourth grade. Though students have a chance to enter the program each year, the majority of them start in fourth or fifth grade. Only a few (as far as I know, fewer than 10 percent of all A&E students) enter the A&E program later.
Here is the statistical problem: when testing for a difference between groups, you want the two groups to have comparable sample sizes. When one group is far larger than the other, as is the case here (about 90 percent versus fewer than 10 percent), the conclusion does not really make sense because the data are so unbalanced. With so few students in the smaller group, a test has little power to detect a difference, so “no statistically significant difference” is not evidence that no difference exists.
Looking deeper, the two groups are not independently selected. Some students who enter the program later may have taken the exam several times, and they may face more competition for fewer openings when they retry. Scientifically, such sequential sampling cannot support the conclusion stated in the report.
Actually, the similar performance found among students entering A&E in different years demonstrates only one thing: the current A&E selection criteria are consistent over the years. This is exactly what a good selection process should do!
If selection is delayed until sixth grade, as recommended, you will find that students selected in seventh and eighth grade perform similarly to those selected in sixth grade. By the same logic, would you delay the selection process indefinitely? Now you see what a ridiculous conclusion this flawed study can lead to.
Of course, we can, and should, debate how to improve A&E. But we should not draw convenient conclusions from a flawed study.
Peng Wang