What does it take to prove that a new technology is better than current best teaching practices?
Before heading to the TopHat text, start with a recent review of the 50-year history of the “true experiment” in education. Beginning in the ’60s, there were some 20 years when we didn’t even THINK about doing a randomized trial in schools; impossible, they said. But since about 2011, we have moved back toward using randomized trials in educational psychology, though only for “study ready” interventions that already have plenty of evidence that the developed program “COULD” be effective. Only then is it worth breaking out the “big guns” of a randomized trial. Read this 11-page overview of the history and where we are today: Styles and Torgerson (2018).
Read Chapter 9 in the TopHat Text.
Chapter 9 gives the background for conducting comparisons between a class using a new technology and a non-tech class that provides a comparison: 2 different classes, perhaps with the same teacher, but entirely different students. One thing seems clear: research in applied settings like real classrooms rarely has a “true” Control group. Thus, it’s best to refer to it as a “COMPARISON” class rather than a control, a term reserved for a group that is identical in every way except the drug or other intervention being studied as the independent variable. As the researcher, your task is to do your best to create the logic of the experiment (the method) so that it controls any possible confounding factors, other than your treatment/intervention, that could also explain any difference you find.
We’re not assigning Chapter 15 on Inferential statistics, but eventually the researcher has to compare the performance of that New Tech class with the Comparison class. Of course no 2 classes are likely to perform identically, so the question is not “are the 2 classes different,” but rather “are they significantly more different than we would expect by chance alone?” In Module 13 we’ll discuss ways to reduce the 28 or so scores in one class to a single number that lets us compare that class to the other. Typically that number is the average score (or statistical Mean), but it could be another measure of central tendency, such as the Mode or Median score.
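To make “reducing a class to a single number” concrete, here is a minimal sketch using Python’s built-in `statistics` module. The scores are made up for illustration (and only 10 per class rather than 28), and the names `new_tech_class`, `comparison_class`, and `summarize` are my own, not from the TopHat text.

```python
import statistics

# Hypothetical quiz scores (out of 100) for two classes; NOT real data.
new_tech_class = [72, 85, 85, 90, 64, 78, 85, 91, 70, 80]
comparison_class = [68, 75, 75, 82, 60, 75, 79, 88, 66, 72]


def summarize(scores):
    """Return the three common measures of central tendency for one class."""
    return {
        "mean": statistics.mean(scores),      # the arithmetic average
        "median": statistics.median(scores),  # the middle score when sorted
        "mode": statistics.mode(scores),      # the most frequent score
    }


new_tech_summary = summarize(new_tech_class)
comparison_summary = summarize(comparison_class)

# The raw difference between class means. Whether a gap this size is more
# than chance would produce is the job of inferential statistics.
mean_difference = new_tech_summary["mean"] - comparison_summary["mean"]

print("New Tech:  ", new_tech_summary)
print("Comparison:", comparison_summary)
print("Difference in means:", mean_difference)
```

Notice that the three measures need not agree within a class, which is one reason the choice of summary number matters.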
Get Going with the To-Do Activity! (there’s a ton of stuff to do this week… more doing, than reading)
I have pushed a ton of stuff into the To-Do Activity for this module. You will conduct a Power Analysis to determine the preferred sample size, conduct a t-test analysis, not to mention several other related activities done as part of a typical experimental method. This is to give you some idea of what might be involved in conducting a careful experimental design study in your classroom.
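If you want a feel for what the t-test is doing under the hood before you start, here is a rough sketch of the classic pooled-variance (Student’s) t statistic for two independent classes, along with Cohen’s d as an effect size. This is an illustration under my own assumptions, not the tool used in the To-Do Activity: the scores are invented, the function name `pooled_t_and_cohens_d` is hypothetical, and the code stops short of a p-value, which you would look up in a t table or get from a statistics package.

```python
import math
import statistics


def pooled_t_and_cohens_d(group_a, group_b):
    """Return (t statistic, degrees of freedom, Cohen's d) for two groups.

    Assumes independent groups with roughly equal variances (Student's t).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    df = n_a + n_b - 2
    # Pooled standard deviation: a weighted blend of the two class spreads.
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / df)

    # t: the mean difference measured in standard-error units.
    t_stat = (mean_a - mean_b) / (pooled_sd * math.sqrt(1 / n_a + 1 / n_b))
    # Cohen's d: the mean difference measured in standard-deviation units.
    cohens_d = (mean_a - mean_b) / pooled_sd
    return t_stat, df, cohens_d


# Hypothetical scores; NOT real data.
new_tech_class = [72, 85, 85, 90, 64, 78, 85, 91, 70, 80]
comparison_class = [68, 75, 75, 82, 60, 75, 79, 88, 66, 72]

t_stat, df, cohens_d = pooled_t_and_cohens_d(new_tech_class, comparison_class)
print(f"t = {t_stat:.2f} on {df} degrees of freedom, Cohen's d = {cohens_d:.2f}")
```

The same mean difference gives a larger t when the classes are bigger or the scores less spread out, which is why sample size matters so much in the power analysis.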
Key issues to consider:
What random selection and random assignment to groups “buys you” as a researcher and what is lost without it.
How designing and trying an intervention is so much better than looking at, and interpreting, correlations.
Getting control of as many confounding variables (alternative simple explanations) as possible is the process of approximating the “true experiment,” and it moves us into quasi-experimental methods. What really constitutes a “CONTROL” group, and can it be achieved in an applied classroom setting?
Determining how many participants you need to study in order to be reasonably confident your analysis can detect a difference, whether tiny or large: the sample size is estimated using a power analysis tool, and requires you to make a “guess” about how much impact your technology intervention may have (as measured by the “effect size”).
Is my study a failure if I don’t find what I expected? Hardly! Consider this work by the Bing development team (Kohavi et al. 2012) describing their program of experiments to select how search results should look or work, to maximize client profits.
Think about what it means to commit (preregister) to your hypothesis, in writing, on the internet, at something like the Center for Open Science, before beginning your experiment. This avoids “shopping for effects” once your data are collected.
Finally, if you want to dig into the process we have discussed, in which education reporters and bloggers discuss research studies while few people read the full text of the study, here is one good example. The topic is Restorative Justice and its value in school disciplinary policy. You can start with how it was reported by the educational media with this short article by Max Eden at the Fordham Institute. Next you can look at the Rand synopsis of the full text research study. Your final alternative would be to read the full text study online, or read the full 132 page RAND 2019 Pitt Restorative Justice pdf here in HuskyCT. These “levels” of reporting and the way learning science and educational research is crafted for peer review all contribute to how useful scholarly research is for forming school policies and making classroom tec
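The power analysis mentioned in the key issues above can be sketched in a few lines. This is a rough approximation under stated assumptions, not the tool you will use in the To-Do Activity: it uses the common normal-approximation formula for comparing two group means with a two-sided test, so it can come out a participant or two lower than an exact t-based calculator. The function name `sample_size_per_group` and the defaults (alpha of .05, power of .80) are my own choices for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed in EACH group (two-sided test).

    effect_size is your "guess" at Cohen's d, the expected mean difference
    in standard-deviation units. Uses a normal approximation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be a positive number")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # significance cutoff
    z_power = standard_normal.inv_cdf(power)          # desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Cohen's conventional benchmarks for small, medium, and large effects.
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label} effect (d = {d}): about {sample_size_per_group(d)} per group")
```

Note how quickly the required sample grows as the guessed effect shrinks: a single class of 28 or so is only enough to detect a fairly large effect.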
Essential Understanding
From this module, you should get some idea of the effort and thought processes involved in designing a close approximation to the “true experiment” in a clinical setting like a real classroom. Things taken for granted in most lab research, such as random selection from the population you want to study and random assignment to condition (treatment or class), can be nearly impossible to achieve in an applied research setting. But these features greatly simplify the design and analysis. Without them, the researcher is left to rely on complex statistical methods or statistical simulations (like Monte Carlo techniques) to “make up for” the lack of experimental control over the study.
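For contrast, here is how little it takes to do random assignment when you actually have the freedom to do it. This is a minimal sketch with made-up student labels; the function name `randomly_assign` and the `seed` parameter (included only so the example is repeatable) are my own.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original roster is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "comparison": shuffled[half:]}


# Hypothetical roster of 28 students.
students = [f"student_{i:02d}" for i in range(1, 29)]
groups = randomly_assign(students, seed=2024)

print(len(groups["treatment"]), "students in the treatment group")
print(len(groups["comparison"]), "students in the comparison group")
```

Because chance alone decides who lands in which group, pre-existing differences between students tend to balance out across groups. That is exactly what is lost when you must take two intact classes as they come.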