How To Recognize Quality Research Regarding Instructional Interventions
When reviewing educational research, there are several characteristics to look for in determining the quality of the work.
Was the study conducted by the publisher’s staff, or by an independent researcher from a recognized college or university? Not surprisingly, many publishers of intervention software report results from studies conducted internally. Independent investigators at leading institutions of higher learning across the country have conducted research on New Century’s products and found statistically significant results.
Did the research design involve a control group that did not receive the intervention at all, or that at least received less of it? Some publishers offer research that simply states that a school or district conducted an intervention and scores rose. Without a control group, however, it is not clear whether scores rose because of the intervention or because of one or more confounding variables. In schools, as in other field research settings, many changes are often occurring at once: in staff, instruction, environment (such as the consolidation of schools), and the scheduling and organization of students. A control group exposed to those same changes helps isolate the effect of the intervention and supports the inference of causation, i.e., that the intervention led to the improvement in scores. All studies of New Century products include control groups.
Were students randomly assigned to control and experimental conditions? The gold standard in research is a “true experiment,” which includes not only a control group but also random assignment of students to the experimental and control groups. True experiments can be difficult to organize at schools. New Century is one of the few publishers whose products are the subject of randomized controlled studies that demonstrate statistically significant differences between students in experimental and control conditions.
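As a rough illustration of what random assignment means in practice, the sketch below shuffles a roster and splits it in half. The function name, the student IDs, and the seed are all hypothetical; a real study would follow the investigator's own protocol, which may randomize by classroom or school rather than by individual student.

```python
import random


def assign_conditions(student_ids, seed=None):
    """Randomly split a roster into experimental and control groups.

    Hypothetical helper for illustration only; it is not taken from any study.
    """
    rng = random.Random(seed)      # seeded so the assignment can be audited later
    shuffled = list(student_ids)   # copy, so the original roster is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}


# Made-up roster of 20 students.
students = [f"S{n:03d}" for n in range(1, 21)]
groups = assign_conditions(students, seed=42)
print(len(groups["experimental"]), len(groups["control"]))
```

Because chance alone decides who lands in each group, any pre-existing differences between students tend to balance out across the two conditions as the sample grows.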
If students were not randomly assigned to condition, were the students in the two conditions closely matched on demographic and educational characteristics? If so, the study would qualify as “quasi-experimental.” In quasi-experiments, although students are not randomly assigned, the groups are typically set up to have similar demographics (gender, ethnic composition, and participation in free and reduced lunch programs) and comparable prior-year scores. Many studies, by contrast, compare the performance of dissimilar student groups. New Century products have been the subject of quasi-experimental studies in which the independent investigator selected the students based on careful analysis of demographic and educational backgrounds.
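One simple way to build comparable groups is to pair each experimental student with the comparison student whose prior-year score is closest. The sketch below is only an assumption about how such matching might be done: the function, the `max_gap` threshold, the IDs, and the scores are invented for illustration, and real quasi-experimental studies usually match on several characteristics at once.

```python
def match_on_prior_score(experimental, comparison_pool, max_gap=5):
    """Greedy one-to-one matching on prior-year score.

    Both arguments map a student ID to a prior-year score (hypothetical data).
    A comparison student is used at most once, and pairs whose scores differ
    by more than `max_gap` points are left unmatched.
    """
    available = dict(comparison_pool)
    pairs = []
    for student, score in sorted(experimental.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining comparison student by prior-year score.
        best = min(available, key=lambda s: abs(available[s] - score))
        if abs(available[best] - score) <= max_gap:
            pairs.append((student, best))
            del available[best]
    return pairs


# Made-up prior-year scale scores.
experimental = {"E1": 410, "E2": 455, "E3": 500}
comparison_pool = {"C1": 412, "C2": 458, "C3": 530, "C4": 499}
pairs = match_on_prior_score(experimental, comparison_pool)
print(pairs)
```

In this toy data, student C3 is left out because no experimental student has a similar prior-year score, which is the point of matching: dissimilar students are not compared.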
Consistent Instructional Time Across Groups
Did the students in the experimental group receive the same amount of instruction in the discipline as the students in the control group? Often, the students in the experimental group have additional instructional time (an extra class period or two per week) using the intervention, whereas the control group has no such extra time with traditional instruction. In these cases, the study design does not address whether students achieved better performance as a result of the intervention, only whether students with more instructional time outperform those with less. The question remains whether students in the control group, if given the same amount of additional instruction (without the intervention), would have matched or outperformed the students in the experimental group.
Actual Usage of the Intervention
Does the study record the amount of time on task using the intervention? Simply because students are supposed to be using an intervention does not mean that they are actually doing so. Other things may be going on in the classroom instead. If the intervention has a means of measuring the actual minutes of time on task or the number of lessons completed, then investigators can calculate the relationship between actual usage and gains on assessment tests. Studies of New Century products all involve the measurement of actual time on task.
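The relationship between usage and gains is commonly summarized with a correlation coefficient. The following is a minimal sketch of a Pearson correlation using only the Python standard library; the minutes and score gains are made-up numbers, not results from any study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: minutes on task per student and their test-score gains.
minutes_on_task = [120, 300, 450, 600, 780, 900]
score_gains = [1, 3, 4, 6, 7, 10]
r = pearson_r(minutes_on_task, score_gains)
print(round(r, 2))
```

A value of r near 1 would indicate that students who logged more time tended to gain more. A correlation by itself does not show that the extra usage caused the gains.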
Independent Assessment Tests Consistently Administered
Was an independent assessment test used as the dependent variable, or did the study use a measurement tool developed by the same publisher as the intervention or a special assessment test developed just for the study? Frequently, an intervention can demonstrate results when the same publisher’s assessments are used, but not when an independent assessment test is used. The studies of New Century products all involve independent assessment tests, typically the state proficiency test and often a second independent assessment tool. In addition, independent investigators typically take care to ensure that both the pre-test and the post-test are administered to the control and experimental groups at the same time.
Far Tests vs. Near Tests
Were students assessed over a short period of time (Near Tests), or over a long period of time (Far Tests)? In some cases, experimental students who are tested before and immediately after a short, intense period of intervention, e.g., eight weeks, will show an improvement in performance over control students. However, such differences often disappear when measured using Far Tests, like the state proficiency test, which is typically administered once every 12 months. Independent studies involving New Century products almost all involve Far Tests.
Large Sample Sizes and Statistical Significance
Is the sample size of students in the study large enough to yield meaningful results? Too often, educational studies are based on 20 or 30 students. Samples that small often lack the statistical power to reach significance, though many studies still make claims based on them. The studies of New Century products generally involve large enough samples, often hundreds of students, to report statistically significant and meaningful results. By convention, if p > .05 the results are not considered statistically significant: the observed differences in student performance cannot be reliably distinguished from chance and should not be attributed to the intervention being studied.
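To make the idea of a p-value concrete, here is a sketch of a permutation test, which estimates how often a difference at least as large as the observed one would arise if group labels were assigned purely by chance. This is one of several ways to compute a p-value and is not necessarily the method used in any particular study; the function name and the scores are invented for illustration.

```python
import random


def permutation_p_value(experimental, control, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    n_exp = len(experimental)

    def mean_gap(values):
        first, second = values[:n_exp], values[n_exp:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    pooled = list(experimental) + list(control)
    observed = mean_gap(pooled)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        # Small tolerance guards against floating-point rounding.
        if mean_gap(pooled) >= observed - 1e-12:
            extreme += 1
    # The +1 terms count the observed arrangement itself, so p is never 0.
    return (extreme + 1) / (n_permutations + 1)


# Made-up post-test scores for two small groups.
experimental_scores = [78, 82, 85, 88, 90, 91, 84, 87]
control_scores = [70, 72, 75, 68, 74, 71, 73, 69]
p_value = permutation_p_value(experimental_scores, control_scores)
print(p_value < 0.05)
```

With a small or noisy sample, many random reshuffles produce a gap as large as the observed one, so the p-value stays above .05 even when a real difference exists. That is why larger samples are needed to detect modest effects.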
Magnitude of Difference
If the results were significant, was the size of the effect of the intervention a large one? In many studies, differences in student performance may be small. A Cohen’s d (a measure of effect size) of .1 (d = .1) is considered small. Most independent studies of New Century products demonstrate an effect size in the .3 to .4 range, which is often regarded as substantial for educational interventions, even though Cohen’s general benchmarks (.2 small, .5 medium, .8 large) would label it small to medium.
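Cohen's d expresses the difference between two group means in units of their pooled standard deviation. The sketch below uses the common pooled-variance form of the formula; the function name and the toy scores are illustrative assumptions, chosen only so the arithmetic is easy to follow.

```python
import math
from statistics import mean, variance


def cohens_d(experimental, control):
    """Cohen's d using the pooled sample standard deviation."""
    n1, n2 = len(experimental), len(control)
    pooled_var = (
        (n1 - 1) * variance(experimental) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(experimental) - mean(control)) / math.sqrt(pooled_var)


# Toy data: group means of 4 and 3, each with a sample standard deviation of 2.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)  # 0.5, i.e. the groups differ by half a standard deviation
```

Unlike a p-value, d does not depend on sample size, so it answers a different question: not whether a difference is likely to be real, but how big it is.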