Simpson’s Paradox and Changing Demographics
Suppose you work at a school with the following percentages of students at or above proficient over three years. Assume that the school has only these two student populations (subgroups).
| Year | Population 1 | Population 2 |
|---|---|---|
| 2006 | 80% | 20% |
| 2007 | 85% | 25% |
| 2008 | 90% | 30% |
Simply looking at the percentages, we might conclude that the school is producing steady gains in percentage of students proficient. On the surface it seems the school has its act together and might be a place we would send our kids.
Suppose then that you were given the totals.
| Year | Population 1 | Population 2 | Total |
|---|---|---|---|
| 2006 | 80% | 20% | 68% |
| 2007 | 85% | 25% | 61% |
| 2008 | 90% | 30% | 54% |
You might look at the table and wonder about the accuracy of the data. How could the percentage of students who are proficient be going down for the entire school if we’re making gains in all student populations?
Well, this happens because of changing demographics and something known to statisticians as Simpson’s Paradox. Let’s take a closer look at the data. Suppose that the school enrolls 1000 students each year but is undergoing a change in its demographics.
| Year | Pop. 1 proficient | Pop. 1 students | Pop. 1 % | Pop. 2 proficient | Pop. 2 students | Pop. 2 % | Total proficient | Total students | Total % |
|---|---|---|---|---|---|---|---|---|---|
| 2006 | 640 | 800 | 80% | 40 | 200 | 20% | 680 | 1000 | 68% |
| 2007 | 510 | 600 | 85% | 100 | 400 | 25% | 610 | 1000 | 61% |
| 2008 | 360 | 400 | 90% | 180 | 600 | 30% | 540 | 1000 | 54% |
Schools that experience changing demographics can produce results that are counter-intuitive. Your conclusions may well be determined by which part of the data is important to you. In this example, if we use the percentage of proficient students in each population as the measure of success, then this school is very successful. If we use the aggregate data, we see a decline. The opposite can also be true: a school can have the percentage of proficient students decrease in each constituent subgroup and yet have an aggregate that increases (just run the data in reverse).
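As a rough check on the arithmetic, the sketch below recomputes the subgroup and aggregate percentages from the counts in the table above. The names `counts`, `rates`, and `summary` are made up for this illustration; the numbers are copied straight from the table.

```python
# (proficient, enrolled) for Population 1 and Population 2, copied from the table.
counts = {
    2006: [(640, 800), (40, 200)],
    2007: [(510, 600), (100, 400)],
    2008: [(360, 400), (180, 600)],
}


def rates(groups):
    """Return (per-group percentages, aggregate percentage) for one year."""
    group_rates = [100 * prof / stu for prof, stu in groups]
    total_prof = sum(prof for prof, _ in groups)
    total_stu = sum(stu for _, stu in groups)
    return group_rates, 100 * total_prof / total_stu


summary = {year: rates(groups) for year, groups in counts.items()}

for year, (group_rates, aggregate) in summary.items():
    print(year, group_rates, aggregate)
# 2006 [80.0, 20.0] 68.0
# 2007 [85.0, 25.0] 61.0
# 2008 [90.0, 30.0] 54.0
```

Each subgroup rate rises by five points a year while the aggregate falls by seven. Reading the same rows from 2008 back to 2006 gives the opposite case: both subgroups decline and the aggregate rises.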
The bigger the gap in achievement between the two groups and the greater the change in demographics, the more likely it is that the subgroup results and the aggregate result point in opposite directions.
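One way to see why, as a sketch using notation introduced here for illustration: let $w$ be Population 1's share of enrollment and $p_1$, $p_2$ the two proficiency rates in the earlier year, with primes marking the later year. The aggregate rate is the enrollment-weighted average of the subgroup rates:

$$P = w\,p_1 + (1 - w)\,p_2$$

The change in the aggregate then splits into the subgroup gains plus a term driven purely by the shift in enrollment:

$$P' - P = w'\,(p_1' - p_1) + (1 - w')\,(p_2' - p_2) + (w' - w)(p_1 - p_2)$$

For 2006 to 2008 in the example, this is $0.4(0.10) + 0.6(0.10) + (0.4 - 0.8)(0.80 - 0.20) = 0.10 - 0.24 = -0.14$, which matches the drop from 68% to 54%. The last term is the product of the demographic shift and the achievement gap, so it can outweigh the subgroup gains when both are large.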
Will we have the patience to comb through the data to figure out what is really going on? What will become of teachers’ efforts to save a school from being labeled a failure if they produce improvements in every subgroup yet, because of shifting demographics, fail to deliver on the aggregate? Will anyone take the time to look at the data, or will we just run through bullet points? Is an aggregate increase in proficient students enough to save our schools from being labeled a failure?
UPDATE (2010/03/20): I found this recently. The late Gerald Bracey wrote about this topic much more eloquently, and with actual data, in THOSE MISLEADING SAT AND NAEP TRENDS: SIMPSON’S PARADOX AT WORK.