Showing posts with label randomized control trials. Show all posts

Sunday, December 9, 2012

Grover Whitehurst Testifies Against Class Size Reduction

Some blog followers might be interested in a recent post “Grover Whitehurst Testifies Against Class Size Reduction” [Hake (2012)]. The abstract reads:

*********************************************
ABSTRACT: Diane Ravitch in her blog entry “When Grover Whitehurst Testified Against Class Size Reduction” at http://bit.ly/QRniWu pointed to Leonie Haimson’s http://huff.to/12fHNyy “Grover Whitehurst's big pay day, testifying class size doesn't matter” at http://bit.ly/Vsp2T2 and asked: “DOES CLASS SIZE MATTER? READ HAIMSON'S ACCOUNT AND REACH YOUR OWN JUDGMENT.”

Haimson pointed out that:

a. According to a report by Will Weissert (2012a) at http://yhoo.it/VxrBCQ, economist Diane Whitmore Schanzenbach http://bit.ly/TVN6hv testified that students in smaller classes “tend to do better on standardized tests and even eventually become better citizens, more likely to own their own homes and save for retirement” and that “study after study shows that smaller classes often mean greater success for students.” Schanzenbach also coauthored: (1) “Experimental Evidence on the Effect of Childhood Investments on Postsecondary Attainment and Degree Completion” [Dynarski et al. (2011)] at http://bit.ly/YRF0h7, showing that smaller classes increased the rate of college attendance, especially among poor students, and improved the probability of earning a college degree, especially in high-earning fields such as science, technology, engineering, and mathematics; and (2) “How Does Your Kindergarten Classroom Affect Your Earnings? Evidence From Project Star” [Chetty et al. (2011)] at http://bit.ly/U7oJNn, showing that these students were also more likely to own their own homes and hold 401(k) accounts more than twenty years later.

b. Chingos & Whitehurst (2011) wrote “Class Size: What Research Says and What it Means for State Policy” at http://bit.ly/VXroeA, which argued that LOWERING CLASS SIZE WAS A WASTE OF MONEY, despite admitting in the report that “very large class-size reductions, on the order of magnitude of 7-10 fewer students per class, can have significant long-term effects on student achievement and other meaningful outcomes.”

c. When Whitehurst was at the US Department of Education from 2002 to 2008, he headed the Institute of Education Sciences http://ies.ed.gov/, which in the report “Identifying and Implementing Educational Practices Supported by Rigorous Evidence: A User Friendly Guide” [USDE (2003)] at http://bit.ly/ds1sRS cited CLASS SIZE REDUCTION AS ONLY ONE OF FOUR EXAMPLES OF EDUCATION REFORMS “FOUND TO BE EFFECTIVE in randomized controlled trials - research’s ‘gold standard.’ ” [Yet] as the lead-off witness for the state on Friday, Whitehurst argued that, contrary to the claims of the plaintiffs, “Texas is doing pretty good” and that these huge budget cuts were immaterial because CLASS SIZE DOESN'T MATTER.

d. Terrence Stutz (2012) at http://bit.ly/VXwA2l reported in the Dallas Morning News: “State attorneys also have been arguing that larger class sizes in Texas - the result of a $5.4 billion funding cut by the Legislature last year - have not hurt students because CLASS SIZES DON'T AFFECT ACHIEVEMENT. Whitehurst testified in support of that position. But again, under cross examination by Dallas lawyer John Turner, Whitehurst had to acknowledge that he wrote an article praising a well-publicized study of lower class sizes in Tennessee that found significant improvement in student achievement. Whitehurst explained that he had changed his mind since writing the article and now has DOUBTS THAT CLASS SIZE HAS MUCH IMPACT ON LEARNING. In later testimony, he said he was being paid $340 an hour by the state to testify in the case, and had already racked up 220 billable hours - for just under $75,000 - before he took the witness stand.”

e. Whitehurst racked up 220 billable hours? That means Whitehurst must have worked nearly thirty 8-hour days on it. Wonder what took him so long?
*********************************************
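The billing arithmetic in item (e) is easy to verify. A minimal check, using only the figures reported in the testimony ($340/hour, 220 billable hours) and a nominal 8-hour working day:

```python
# Sanity check of the billing figures quoted in the testimony.
rate = 340          # dollars per hour, as reported
hours = 220         # billable hours, as reported

total = rate * hours     # total fee
days = hours / 8         # equivalent number of nominal 8-hour days

print(total)  # 74800 -> "just under $75,000"
print(days)   # 27.5  -> "nearly thirty 8-hour days"
```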

To access the complete 23 kB post please click on http://bit.ly/Ufro6y.

Richard Hake, Emeritus Professor of Physics, Indiana University
Links to Articles: http://bit.ly/a6M5y0
Links to Socratic Dialogue Inducing (SDI) Labs: http://bit.ly/9nGd3M
Academia: http://bit.ly/a8ixxm
Blog: http://bit.ly/9yGsXh
GooglePlus: http://bit.ly/KwZ6mE
Twitter: http://bit.ly/juvd52

“Physics educators have led the way in developing and using objective tests to compare student learning gains in different types of courses, and chemists, biologists, and others are now developing similar instruments. These tests provide convincing evidence that students assimilate new knowledge more effectively in courses including active, inquiry-based, and collaborative learning, assisted by information technology, than in traditional courses.”
- Wood & Gentile (2003)

“In science education, there is almost nothing of proven efficacy.”
- Grover Whitehurst, as quoted by Sharon Begley (2004)

“Well-designed and implemented randomized controlled trials are considered the ‘gold standard’ for evaluating an intervention's effectiveness, in fields such as medicine, welfare and employment policy, and psychology.”
- USDE (2003)

“Scientifically rigorous studies - particularly, the ‘gold standard’ of Randomized Controlled Trials (RCT’s) - are a mainstay of medicine, providing conclusive evidence of effectiveness for most major medical advances in recent history. In social spending, by contrast, such studies have only a toehold. Where they have been used, however, they have demonstrated the same ability to produce important, credible evidence about what works - and illuminated a path to major progress.”
- Jon Baron (2012)

“In some quarters, particularly medical ones, the randomized experiment is considered the causal 'gold standard.' It is clearly not that in educational contexts, given the difficulties with implementing and maintaining randomly created groups, with the sometimes incomplete implementation of treatment particulars, with the borrowing of some treatment particulars by control group units, and with the limitations to external validity that often follow from how the random assignment is achieved.”
- Thomas Cook and Monique Payne in Evidence Matters [Mosteller & Boruch (2002)]

“According to the California Class Size Reduction Research Consortium [CCSRRC (2002)], California's attempt to duplicate the results of the vaunted Tennessee RCT study of class-size-reduction benefits yielded no conclusive evidence of increased student achievement. One reason appears to be that there were simply not enough teachers in California to support any substantive class size reduction without deterioration of teaching effectiveness.”
- R.R. Hake (2009)

REFERENCES [URL’s shortened by http://bit.ly/ and accessed on 10 Dec 2012.]
Baron, J. 2012. “Applying Evidence to Social Programs.” New York Times, 29 Nov; online at http://nyti.ms/Um9vVI.

Begley, S. 2004. “To Improve Education, We Need Clinical Trials To Show What Works,” Wall Street Journal, 17 December, page B1; online as a 41 kB pdf at http://bit.ly/SSmaym, thanks to David Klahr.

CCSRRC. 2002. “What We Have Learned About Class Size Reduction in California,” California Class Size Reduction Research Consortium [American Institutes for Research (AIR), RAND, Policy Analysis for California Education (PACE), WestEd, and EdSource]; full report online as a 9.5 MB pdf at http://bit.ly/YRD5ZS. A press release is online at http://bit.ly/V923Ms.

Hake, R.R. 2009. “A Response to ‘It's Not All About Class Size’,” online on the OPEN! AERA-L archives at http://bit.ly/KBzuXV. Post of 6 Feb 2009 09:42:04-0800 to AERA-L and Net-Gold. The abstract and link to the complete post were also distributed to various discussion lists.

Hake, R.R. 2012. “Grover Whitehurst Testifies Against Class Size Reduction,” online on the OPEN! AERA-L archives at http://bit.ly/VYtD1l. Post of 9 Dec 2012 18:34:56-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists.

Mosteller, F. & R. Boruch, eds. 2002. Evidence Matters: Randomized Trials in Education Research. Brookings Institution, publisher's information at http://bit.ly/UoX3sA. Amazon.com information at http://amzn.to/n6T0Uo. An expurgated Google book preview is online at http://bit.ly/RX1k3u.

USDE. 2003. U.S. Department of Education, Identifying and Implementing Educational Practices Supported by Rigorous Evidence: A User Friendly Guide. Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, online as a 140 kB pdf at http://bit.ly/ds1sRS.

Wood, W.B., & J.M. Gentile. 2003. “Teaching in a Research Context,” Science 302: 1510; 28 November; online as a 213 kB pdf at http://bit.ly/SyhOvL thanks to Portland State's “Ecoplexity” site.





Tuesday, December 4, 2012

Randomized Control Trials - The Tarnished Gold Standard

Some blog followers might be interested in a recent post “Randomized Control Trials - The Tarnished Gold Standard” [Hake (2012a)]. The abstract reads:

************************************************
ABSTRACT: In response to “The Randomistas' War On Global Poverty - Erratum & Addendum” at http://bit.ly/YfMESg, Art Burke of the EvalTalk list pointed to an NYT piece “Applying Evidence to Social Programs” by Jon Baron at http://nyti.ms/Um9vVI.

Baron wrote (slightly edited): “Scientifically rigorous studies - particularly, the 'gold standard' of Randomized Controlled Trials (RCT’s) - are a mainstay of medicine, providing conclusive evidence of effectiveness for most major medical advances in recent history. In social spending, by contrast, such studies have only a toehold. Where they have been used, however, they have demonstrated the same ability to produce important, credible evidence about what works - and illuminated a path to major progress.”

In this post I cite arguments that the “gold standard” RCT studies may not be as lustrous as claimed by Baron:

(1) Ever since the pioneering work of Halloun & Hestenes (1985a) at http://bit.ly/fDdJHm, physicists have been engaged in the social science of Physics Education Research (PER), which has made useful, reliable, and nonobvious predictions without resort to RCT’s - e.g., “Why Not Try a Scientific Approach to Science Education?” [Wieman (2007)] at http://bit.ly/anTMfF.

(2) In “A Response to ‘It's Not All About Class Size’ ” [Hake (2009)], I pointed out that, according to the California Class Size Reduction Research Consortium [CCSRRC (2002)] at http://bit.ly/V923Ms, California's attempt to duplicate the results of the vaunted Tennessee RCT study of class-size-reduction benefits yielded no conclusive evidence of increased student achievement.

(3) In “A Summative Evaluation of RCT Methodology: & An Alternative Approach to Causal Research” at http://bit.ly/93VcWD, Scriven (2008) wrote: “In standard scientific usage, experiments are just carefully constrained explorations, and the RCT is simply a special case of these. To call the RCT the only ‘true experiment’ is part of an attempt at redefinition that distorts the original and continuing usage, and excludes experiments designed to test many simple hypotheses about - or simple efforts to find out - what happens if we do this.”

(4) In “Seventeen Statements by Gold-Standard Skeptics #2” [Hake (2010)] at http://bit.ly/TNpTR9 I cite, among others, the comments of the American Education Research Association; the American Evaluation Association; the National Education Association; the European Evaluation Society; Thomas Cook and Monique Payne, Hugh Burkhardt & Alan Schoenfeld; Margaret Eisenhart & Lisa Towne; Burke Johnson; Annette Lareau & Pamela Barnhouse; Joseph Maxwell; Dennis Phillips; Barbara Schneider, Martin Carnoy, Jeremy Kilpatrick, William Schmidt, & Richard Shavelson; Mack Shelley, Larry Yore, & Brian Hand; Deborah Stipek; and Carol Weiss.

(5) In “Why Most Published Research Findings Are False” at http://1.usa.gov/YxUxkL, Ioannidis (2005) states: “. . . . there is strong evidence that selective outcome reporting, with manipulation of the outcomes and analyses reported, is a common problem even for randomized trials [Chan et al. (2004)] at http://1.usa.gov/X8SB1T.”

(6) The present signature quote of Thomas Cook and Monique Payne.
************************************************

To access the complete 18 kB post please click on http://bit.ly/VzVc0K.


Richard Hake, Emeritus Professor of Physics, Indiana University
Links to Articles: http://bit.ly/a6M5y0
Links to Socratic Dialogue Inducing (SDI) Labs: http://bit.ly/9nGd3M
Academia: http://bit.ly/a8ixxm
Blog: http://bit.ly/9yGsXh
GooglePlus: http://bit.ly/KwZ6mE
Twitter: http://bit.ly/juvd52

"In some quarters, particularly medical ones, the randomized experiment is considered the causal 'gold standard.' It is clearly not that in educational contexts, given the difficulties with implementing and maintaining randomly created groups, with the sometimes incomplete implementation of treatment particulars, with the borrowing of some treatment particulars by control group units, and with the limitations to external validity that often follow from how the random assignment is achieved."
- Thomas Cook and Monique Payne in "Evidence Matters" [Mosteller & Boruch (2002)]


REFERENCES [URL’s shortened by http://bit.ly/ and accessed on 04 Dec 2012.]
Hake, R.R. 2012a. "Randomized Control Trials - The Tarnished Gold Standard" online on the OPEN! AERA-L archives at http://bit.ly/VzVc0K. Post of 4 Dec 2012 19:26:48-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists.

Hake, R.R. 2012b. “The Randomistas' War On Global Poverty - ERRATUM & ADDENDUM,” online on the OPEN! AERA-L archives at http://bit.ly/YfMESg. Post of 30 Nov 2012 12:15:33-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists, and are also on my blog “Hake'sEdStuff” at http://bit.ly/11c5w3e with a provision for comments.

Mosteller, F. & R. Boruch, eds. 2002. Evidence Matters: Randomized Trials in Education Research. Brookings Institution, publisher's information at http://bit.ly/UoX3sA. Amazon.com information at http://amzn.to/n6T0Uo. An expurgated Google book preview is online at http://bit.ly/RX1k3u.





Friday, November 30, 2012

The Randomistas' War On Global Poverty - ERRATUM & ADDENDUM

Some blog followers might be interested in a recent post “The Randomistas’ War On Global Poverty - ERRATUM & ADDENDUM” [Hake (2012a)]. The abstract reads:

*******************************************************
ABSTRACT: My post “The Randomistas' War On Global Poverty” [Hake (2012b)] at http://bit.ly/V93tXl:

I. Contains an ERRATUM: I wrote: “. . . . . there's no indication that the ‘Center for Economic and Policy Research’ (CEPR) http://bit.ly/U1DTSU, FOR WHICH DUFLO IS A PROGRAM DIRECTOR FOR DEVELOPMENTAL ECONOMICS http://bit.ly/VgQ3y8, is aware of the overriding influence of poverty on the educational achievement of U.S. children. . . .” Contrary to that statement, Duflo is a program director for the EUROPEAN “Centre for Economic Policy Research” http://www.cepr.org/ & http://bit.ly/TwYuzM, NOT the U.S. “Center for Economic and Policy Research” (CEPR) http://www.cepr.net/ & http://bit.ly/U1DTSU. So Duflo bears ZERO responsibility for the U.S. CEPR’s seeming unawareness of “the overriding influence of poverty on the educational achievement of U.S. children.”

II. Now requires an ADDENDUM: In response to Hake (2012), Guy Brandenberg (2012) of the EDDRA2 list pointed to a PLoS Medicine article by John Ioannidis http://bit.ly/Vb1u70 titled “Why Most Published Research Findings Are False” at http://1.usa.gov/YxUxkL, wherein Ioannidis wrote: “. . . . there is strong evidence that selective outcome reporting, with manipulation of the outcomes and analyses reported, is a common problem even for randomized trials [Chan et al. (2004)] at http://1.usa.gov/X8SB1T.” For a good discussion of the important work of Ioannidis see the Atlantic article “Lies, Damned Lies, and Medical Science” by David Freedman (2010) at http://bit.ly/11aAmt0.
*******************************************************

To access the complete 9 kB post please click on http://bit.ly/YfMESg.


Richard Hake, Emeritus Professor of Physics, Indiana University
Links to Articles: http://bit.ly/a6M5y0
Links to Socratic Dialogue Inducing (SDI) Labs: http://bit.ly/9nGd3M
Academia: http://bit.ly/a8ixxm
Blog: http://bit.ly/9yGsXh
GooglePlus: http://bit.ly/KwZ6mE
Twitter: http://bit.ly/juvd52

REFERENCES [URL’s shortened by http://bit.ly/ and accessed on 30 Nov 2012.]
Hake, R.R. 2012a. “The Randomistas’ War On Global Poverty - ERRATUM & ADDENDUM,” online on the OPEN! AERA-L archives at http://bit.ly/YfMESg. Post of 30 Nov 2012 12:15:33-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists.

Hake, R.R. 2012b. “The Randomistas’ War On Global Poverty (was Chocolate Makes You Smart),” online on the OPEN! AERA-L archives at http://bit.ly/V93tXl. Post of 29 Nov 2012 14:27:16-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists and are also on my blog “Hake'sEdStuff” at http://bit.ly/Rm6Oqw with a provision for comments.




Thursday, November 29, 2012

The Randomistas’ War On Global Poverty (was Chocolate Makes You Smart)

Some blog followers might be interested in a recent post “The Randomistas’ War On Global Poverty (was Chocolate Makes You Smart)” [Hake (2012a)]. The abstract reads:

**********************************************
ABSTRACT: In my post “Chocolate Makes You Smart!” [Hake (2012b)] at http://bit.ly/QheB7E, I wrote: “While awaiting Randomized Control Trials (RCT’s) in which the inhabitants of randomly selected countries are provided with placebos in place of chocolate. . . . .” In response Kevin Laws of the Physoc list pointed to: (a) a movement called the “Randomistas” led by MIT economist Esther Duflo http://bit.ly/TpiswJ, which holds that “RCT's are neither impossible nor immoral in the social sciences, but instead are required”; and (b) the fact that the Randomistas “have been responsible for resolving a number of long-standing philosophical debates with actual RCT data - the effectiveness of mosquito nets, for example.”

Although RCT’s may be the gold standard in medicine and in the Randomistas’ global-poverty-reduction research, they are certainly not that in education research generally - see e.g., (a) “Randomized Control Trials: The Strange Case of the Contradictory Graphs” at http://bit.ly/TQdfhX; (b) “A Response to ‘It's Not All About Class Size’ ” [Hake (2009)] at http://bit.ly/KBzuXV; and (c) the present signature quote. Nevertheless, RCT’s seem to have been used effectively in education research by the Randomistas, according to information at http://bit.ly/TtTRae.

Despite the Randomistas’ concern for education, there’s no indication that the “Center for Economic and Policy Research” (CEPR) http://bit.ly/U1DTSU, for which Duflo is a Program Director for Developmental Economics http://bit.ly/VgQ3y8, is aware of the overriding influence of poverty on the educational achievement of U.S. children, as emphasized in many references in the present post. This despite the fact that, according to information at http://bit.ly/U1DTSU, CEPR is concerned in part with “gaps in the social policy fabric of the U.S. economy.”
**********************************************

To access the complete 21 kB post please click on http://bit.ly/V93tXl.

“In some quarters, particularly medical ones, the randomized experiment is considered the causal ‘gold standard.’ It is clearly not that in educational contexts, given the difficulties with implementing and maintaining randomly created groups, with the sometimes incomplete implementation of treatment particulars, with the borrowing of some treatment particulars by control group units, and with the limitations to external validity that often follow from how the random assignment is achieved.”
- Thomas Cook and Monique Payne in "Evidence Matters"
[Mosteller & Boruch (2002)]


Richard Hake, Emeritus Professor of Physics, Indiana University
Links to Articles: http://bit.ly/a6M5y0
Links to Socratic Dialogue Inducing (SDI) Labs: http://bit.ly/9nGd3M
Academia: http://bit.ly/a8ixxm
Blog: http://bit.ly/9yGsXh
GooglePlus: http://bit.ly/KwZ6mE
Twitter: http://bit.ly/juvd52

REFERENCES [URL’s shortened by http://bit.ly/ and accessed on 29 Nov 2012.]
Hake, R.R. 2012a. “The Randomistas’ War On Global Poverty (was Chocolate Makes You Smart)” online on the OPEN! AERA-L archives at http://bit.ly/V93tXl. Post of 29 Nov 2012 14:27:16-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists.

Hake, R.R. 2012b. “Chocolate Makes You Smart!” online on the OPEN! AERA-L archives at http://bit.ly/QheB7E. Post of 24 Nov 2012 10:34:31-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists and are also on my blog “Hake'sEdStuff” at http://bit.ly/10Nkcoa with a provision for comments.

Mosteller, F. & R. Boruch, eds. 2002. Evidence Matters: Randomized Trials in Education Research. Brookings Institution, publisher’s information at http://bit.ly/UoX3sA. Amazon.com information at http://amzn.to/n6T0Uo. An expurgated Google book preview is online at http://bit.ly/RX1k3u.


Saturday, November 17, 2012

Randomized Control Trials: The Strange Case of the Contradictory Graphs (was In Defense of the NRC's Scientific Research in Education)

Some blog followers might be interested in a recent post “Randomized Control Trials: The Strange Case of the Contradictory Graphs (was In Defense of the NRC's Scientific Research in Education)” [Hake (2012)]. The abstract reads:

**************************************************
ABSTRACT: Susan Skidmore at http://bit.ly/Uov4sU alerted the Math-Teach list to her valuable articles with Bruce Thompson in the June/July 2012 issue of the Educational Researcher: (a) “Propagation of Misinformation About Frequencies of RFTs/RCTs in Education: A Cautionary Tale” [Skidmore & Thompson (2012a)] at http://bit.ly/SPN361, and (b) “Things (We Now Believe) We Know” [S&T (2012b)] at http://bit.ly/ZSQ5v5.

S&T (2012a) discuss the CONTRADICTORY GRAPHS of cumulative numbers of Randomized Control Trials (RCT’s) vs. time for the Criminology, Education, Psychology, and Social fields (variously showing education first, tied for second, and last), presented by influential scholars in prominent settings, which, along with the attendant sequence of events, “may have gratuitously damaged the already fragile reputation of education research as a field.”

After reviewing the history, S&T (2012b) conclude: “We believe that the errors were unintentional. . . .” But the history as recounted by S&T (2012a) [and in the same Educational Researcher issue by Robinson (2012) at http://bit.ly/WHhdiU and Petrosino (2012) at http://bit.ly/SU3K3O] seems to contradict S&T’s conclusion.

Thomas Cook submitted an article titled “A critical appraisal of the case against using experiments to assess school (or community) effects” [Cook (2001a)] at http://bit.ly/Uyd3CY, with NO GRAPH, to the Hoover Institution's Education Next http://educationnext.org/. Evidently without Cook's knowledge, his academic article was heavily edited and published as “Sciencephobia: Why education researchers reject randomized experiments” [Cook (2001b)] at http://bit.ly/SQox50, WITH A GRAPH of cumulative numbers of Randomized Control Trials (RCT’s) vs. time for the Criminology, Education, Psychology, and Social fields showing education LAST, consistent with the provocative new title. The graph was erroneously attributed to Boruch, De Moya, & Snyder (2001) (the date should have been 2002) at http://bit.ly/UoX3sA, despite the fact that the Boruch et al. graph showed education tied for second, not last. Are we to believe that Education Next's degradation of the accurate academic Cook (2001a) to the inaccurate hooverized Cook (2001b) was unintentional?

A side issue: to those who regard RCT’s as the “gold standard” of education research, the higher the curve of cumulative numbers of Randomized Control Trials (RCT’s) vs. time for a field, the higher the merit of research in that field. But not everyone would agree - see e.g., “A Summative Evaluation of RCT Methodology: & An Alternative Approach to Causal Research” [Scriven (2008)] at http://bit.ly/93VcWD, “Seventeen Statements by Gold-Standard Skeptics #2” [Hake (2010)] at http://bit.ly/TNpTR9, and the present signature quote of Thomas Cook and Monique Payne.
**************************************************

To access the complete 46 kB post please click on http://bit.ly/TQdfhX.


Richard Hake, Emeritus Professor of Physics, Indiana University
Links to Articles: http://bit.ly/a6M5y0
Links to Socratic Dialogue Inducing (SDI) Labs: http://bit.ly/9nGd3M
Academia: http://bit.ly/a8ixxm
Blog: http://bit.ly/9yGsXh
GooglePlus: http://bit.ly/KwZ6mE
Twitter: http://bit.ly/juvd52

“In science education, there is almost nothing of proven efficacy.”
- Grover Whitehurst, former director, Institute of Education Sciences, USDE, as quoted by Sharon Begley (2004)

“In some quarters, particularly medical ones, the randomized experiment is considered the causal ‘gold standard.’ It is clearly not that in educational contexts, given the difficulties with implementing and maintaining randomly created groups, with the sometimes incomplete implementation of treatment particulars, with the borrowing of some treatment particulars by control group units, and with the limitations to external validity that often follow from how the random assignment is achieved.”
- Thomas Cook and Monique Payne in Evidence Matters [Mosteller & Boruch (2002)]

“Physics educators have led the way in developing and using objective tests to compare student learning gains in different types of courses, and chemists, biologists, and others are now developing similar instruments. These tests provide convincing evidence that students assimilate new knowledge more effectively in courses including active, inquiry-based, and collaborative learning, assisted by information technology, than in traditional courses.”
- Wood & Gentile (2003)


REFERENCES [URL’s shortened by http://bit.ly/ and accessed on 17 Nov 2012.]
Begley, S. 2004. “To Improve Education, We Need Clinical Trials To Show What Works,” Wall Street Journal, 17 December, page B1; online as a 41 kB pdf at http://bit.ly/SSmaym, thanks to David Klahr.

Hake, R.R. 2012. “Randomized Control Trials: The Strange Case of the Contradictory Graphs (was In Defense of the NRC's Scientific Research in Education)” online on the OPEN! AERA-L archives at http://bit.ly/TQdfhX. Post of 17 Nov 2012 10:45:11-0800 to AERA-L and Net-Gold. The abstract and link to the complete post are being transmitted to several discussion lists.

Mosteller, F. & R. Boruch, eds. 2002. Evidence Matters: Randomized Trials in Education Research. Brookings Institution, publisher's information at http://bit.ly/UoX3sA. Amazon.com information at http://amzn.to/n6T0Uo. An expurgated Google book preview is online at http://bit.ly/RX1k3u.

Wood, W.B., & J.M. Gentile. 2003. “Teaching in a research context,” Science 302: 1510; 28 November; online as a 213 kB pdf at http://bit.ly/SyhOvL thanks to Portland State's “Ecoplexity” site.



Monday, May 21, 2012

Re: How Reliable Are the Social Sciences?

Some blog followers might be interested in a recent discussion-list post “Re: How Reliable Are the Social Sciences?” The abstract reads:

****************************************************
ABSTRACT: Rick Froman of the TIPS discussion list has pointed to a New York Times Opinion Piece “How Reliable Are the Social Sciences?” by Gary Gutting at http://nyti.ms/K0xVQL. Gutting wrote that Obama, in his State of the Union address http://wapo.st/JnuBCO cited “The Long-Term Impacts of Teachers: Teacher Value-Added and Student Outcomes in Adulthood” (Chetty et al., 2011) at http://bit.ly/KkanoU to support his emphasis on evaluating teachers by their students' test scores. That study purportedly shows that students with teachers who raise their standardized test scores are “more likely to attend college, earn higher salaries, live in better neighborhoods, and save more for retirement.”

After comparing the reliability of social-science research unfavorably with that of physical-science research, Gutting wrote [my italics]: “is there any work on the effectiveness of teaching that is solidly enough established to support major policy decisions? the case for a negative answer lies in the [superior] predictive power of the core natural sciences compared with even the most highly developed social sciences.”

Most education experts would probably agree with Gutting's negative answer. Even economist Eric Hanushek http://en.wikipedia.org/wiki/Eric_Hanushek, as reported by Lowery http://nyti.ms/KnRvDh, states: “Very few people suggest that you should use value-added scores alone to make personnel decisions.”

But then Gutting goes on to write (slightly edited): “While the physical sciences produce many detailed and precise predictions, the social sciences do not. The reason is that such predictions almost always require randomized controlled trials (RCT’s), which are seldom possible when people are involved. . . . . . Jim Manzi . . . [[according to Wikipedia http://bit.ly/KqMf1M, a senior fellow at the conservative Manhattan Institute http://bit.ly/JvwKG1]] . . . in his recent book Uncontrolled http://amzn.to/JFalMD offers a careful and informed survey of the problems of research in the social sciences and concludes that non-RCT social science is not capable of making useful, reliable, and nonobvious predictions for the effects of most proposed policy interventions.” BUT:

(1) Randomized controlled trials may be the “gold standard” for medical research, but they are not such for the social science of educational research - see e.g., “Seventeen Statements by Gold-Standard Skeptics #2” (Hake, 2010) at http://bit.ly/oRGnBp.

(2) Unknown to most of academia, and probably to Gutting and Manzi, ever since the pioneering work of Halloun & Hestenes (1985a) at http://bit.ly/fDdJHm, physicists have been engaged in the social science of Physics Education Research, which is “capable of making useful, reliable, and nonobvious predictions” - e.g., that “interactive engagement” courses can achieve average normalized pre-to-posttest gains about two standard deviations above those of “traditional” passive-student lecture courses. This work employs pre/post testing with Concept Inventories http://en.wikipedia.org/wiki/Concept_inventory - see e.g., (a) “The Impact of Concept Inventories on Physics Education and It’s Relevance For Engineering Education” (Hake, 2011) at http://bit.ly/nmPY8F, and (b) “Why Not Try a Scientific Approach to Science Education?” (Wieman, 2007) at http://bit.ly/anTMfF.
****************************************************
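The “average normalized pre-to-posttest gain” mentioned in item (2) of the abstract is conventionally defined as the actual class-average gain divided by the maximum possible gain, <g> = (<%post> - <%pre>) / (100 - <%pre>), with scores expressed in percent. A minimal sketch (the class-average scores below are hypothetical, chosen only to illustrate the calculation):

```python
def normalized_gain(pre_pct, post_pct):
    """Average normalized gain <g> = (<post> - <pre>) / (100 - <pre>),
    where pre_pct and post_pct are class-average scores in percent."""
    return (post_pct - pre_pct) / (100.0 - pre_pct)

# Hypothetical class averages on a concept inventory:
# an "interactive engagement" course vs. a traditional lecture course.
ie_gain = normalized_gain(45.0, 78.0)    # (78 - 45) / 55 = 0.6
trad_gain = normalized_gain(45.0, 57.1)  # (57.1 - 45) / 55 = 0.22

print(round(ie_gain, 2), round(trad_gain, 2))  # 0.6 0.22
```

Note that dividing by the room left for improvement (100 - <pre>) makes gains comparable across classes that start at different pretest levels.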

To access the complete 26 kB post please click on http://bit.ly/K432fC.

Richard Hake, Emeritus Professor of Physics, Indiana University
rrhake@earthlink.net
Links to Articles: http://bit.ly/a6M5y0
Links to SDI Labs: http://bit.ly/9nGd3M
Blog: http://bit.ly/9yGsXh
Academia: http://iub.academia.edu/RichardHake
Twitter: https://twitter.com/#!/rrhake

“In some quarters, particularly medical ones, the randomized experiment is considered the causal ‘gold standard.’ It is clearly not that in educational contexts, given the difficulties with implementing and maintaining randomly created groups, with the sometimes incomplete implementation of treatment particulars, with the borrowing of some treatment particulars by control group units, and with the limitations to external validity that often follow from how the random assignment is achieved.”
- Tom Cook & Monique Payne (2002, p. 174)

“. . .the important distinction. . .[between, e.g., education and physics]. . . is really not between the hard and the soft sciences. Rather, it is between the hard and the easy sciences.”
- David Berliner (2002)

“Physics educators have led the way in developing and using objective tests to compare student learning gains in different types of courses, and chemists, biologists, and others are now developing similar instruments. These tests provide convincing evidence that students assimilate new knowledge more effectively in courses including active, inquiry-based, and collaborative learning, assisted by information technology, than in traditional courses.”
- Wood & Gentile (2003)

REFERENCES [All URLs shortened by http://bit.ly/ and accessed on 21 May 2012.]
Berliner, D. 2002. “Educational research: The hardest science of all,” Educational Researcher 31(8): 18-20; online as a 49 kB pdf at http://bit.ly/GAitqc.

Cook, T.D. & M.R. Payne. 2002. “Objecting to the Objections to Using Random Assignment in Educational Research” in Mosteller & Boruch (2002).

Hake, R.R. 2012. “Re: How Reliable Are the Social Sciences?” online on the OPEN! AERA-L archives at http://bit.ly/K432fC. Post of 20 May 2012 20:08:07-0700 to AERA-L and Net-Gold. The abstract and link to the complete post are also being transmitted to several discussion lists.

Mosteller, F. & R. Boruch, eds. 2002. Evidence Matters: Randomized Trials in Education Research. Brookings Institution. Amazon.com information at http://amzn.to/n6T0Uo. A searchable expurgated Google Book Preview is online at http://bit.ly/mTcPIE.

Wood, W.B. & J.M. Gentile. 2003. “Teaching in a research context,” Science 302: 1510; 28 November; online to subscribers at http://bit.ly/9izfFz. A summary is online to all at http://bit.ly/9qGR6m.


Monday, July 11, 2011

Re: controlled experiments

Some blog followers might be interested in a discussion-list post “Re: controlled experiments” [Hake (2011)].

The abstract reads:

*********************************************
ABSTRACT: PhysLnrR’s Brian Foley wrote (paraphrasing):

“I would love to be doing controlled experiments much of the time - but they are darn near impossible to pull off. . . . . . Once you have your 40 approved classrooms, you RANDOMLY SELECT 20 TEACHERS and train them on your innovation. . . . . Also your 20 control teachers need to be using their ‘traditional’ teaching (and specifically teach like they have never heard of your innovation. . . . . and if you have designed some good assessments of learning, then you will finally have your result. . . . . .and if your innovation makes a difference YOU JUST MIGHT GET THE MAGICAL p less than 0.05 RESULT.”

Brian seems to have succumbed to the siren calls of the Gold Standardistas and the Statistical Significance Cultists. Modesty forbids mention of these possible antidotes:

a. “Should Randomized Control Trials Be the Gold Standard of Educational Research?” at http://bit.ly/qrUfFz ,

b. “Seventeen Statements by Gold-Standard Skeptics #2” at http://bit.ly/oRGnBp ,

c. “The Cult of Statistical Significance” at http://bit.ly/dkTyXP .
*********************************************


To access the complete 10 kB post please click on http://bit.ly/onA7jk .

Richard Hake, Emeritus Professor of Physics, Indiana University
Honorary Member, Curmudgeon Lodge of Deventer, The Netherlands
President, PEdants for Definitive Academic References which Recognize the Invention of the Internet (PEDARRII)

rrhake@earthlink.net
http://www.physics.indiana.edu/~hake
http://www.physics.indiana.edu/~sdi
http://HakesEdStuff.blogspot.com
http://iub.academia.edu/RichardHake

“In some quarters, particularly medical ones, the randomized experiment is considered the causal ‘gold standard.’ IT IS CLEARLY NOT THAT IN EDUCATIONAL CONTEXTS, given the difficulties with implementing and maintaining randomly created groups, with the sometimes incomplete implementation of treatment particulars, with the borrowing of some treatment particulars by control group units, and with the limitations to external validity that often follow from how the random assignment is achieved.”
- Tom Cook & Monique Payne (2002, p. 174)

“After 4 decades of severe criticism, the ritual of null hypothesis significance testing - mechanical dichotomous decisions around a sacred 0.05 criterion - still persists. This article reviews the problems with this practice, including its near-universal misinterpretation of p as the probability that Ho . . . .[[the null hypothesis]]. . . . is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects Ho one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, and emphasis on effect sizes using confidence intervals, and the informed use of available statistical methods is suggested. FOR GENERALIZATION, PSYCHOLOGISTS MUST FINALLY RELY, AS HAS BEEN DONE IN THE OLDER SCIENCES, ON REPLICATION.” [My CAPS.]
- Jacob Cohen (1994) in “The earth is round (p less than 0.05)”
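Cohen's prescription - report effect sizes rather than worship a sacred 0.05 criterion - is easy to follow in practice. A minimal sketch of Cohen's d, the standardized mean difference (the two score lists are invented purely for illustration):

```python
import statistics as st

def cohens_d(a, b):
    """Cohen's d: difference of means divided by the pooled standard
    deviation. Unlike p, it does not shrink or grow with sample size;
    Cohen's rough conventions: 0.2 small, 0.5 medium, 0.8 large."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * st.variance(a) + (nb - 1) * st.variance(b)) / (na + nb - 2)
    return (st.mean(a) - st.mean(b)) / pooled_var ** 0.5

# Hypothetical posttest scores for two small classes:
innovation = [72, 68, 75, 80, 66, 74, 71, 77]
comparison = [65, 60, 70, 63, 68, 62, 66, 64]
d = cohens_d(innovation, comparison)  # a large effect by Cohen's conventions
```

Reporting d (ideally with a confidence interval) tells the reader how big the difference is, not merely whether it cleared an arbitrary p threshold.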


REFERENCES [All URLs shortened by http://bit.ly/ and accessed on 11 July 2011.]

Cook, T.D. & M.R. Payne. 2002. “Objecting to the Objections to Using Random Assignment in Educational Research” in Mosteller & Boruch (2002).

Cohen, J. 1994. “The earth is round (p less than 0.05),” American Psychologist 49: 997-1003; online as a 1.2 MB pdf at http://bit.ly/a45I2t thanks to Christopher Green http://www.yorku.ca/christo/.

Hake, R.R. 2011. “Re: controlled experiments,” online on the OPEN! AERA-L archives at http://bit.ly/onA7jk. Post of 11 Jul 2011 11:15:54-0700 to AERA-L, Net-Gold, and PhysLrnR. The abstract and link to the complete post are being transmitted to various discussion lists.

Mosteller, F. & R. Boruch, eds. 2002. Evidence Matters: Randomized Trials in Education Research. Brookings Institution. Amazon.com information at http://amzn.to/n6T0Uo . A searchable expurgated Google Book Preview is online at http://bit.ly/mTcPIE .