This is the talk page for discussing improvements to the Randomized controlled trial article. This is not a forum for general discussion of the article's subject.
This level-4 vital article is rated B-class on Wikipedia's content assessment scale.
Kudos on the limitations section; it is an accurate and concise enumeration of the consensus issues. Unfortunately, the sections on Randomization and Limitations are disjointed.
It is probably easier to focus on scientific limitations and not conflict of interest biases. To some extent, overcoming the limitations is about effective study design, which is probably too much to summarize here. Nonetheless, the authors of this page have hinted at suggestions with a very well written section on "Randomization".
But again, "Randomization" and "Limitations" don't lead the reader to consider "OK, now what do I do about it?" — Preceding unsigned comment added by 24.5.84.124 (talk) 03:29, 29 December 2012 (UTC)[reply]
Given our understanding of its limitations, isn't it time that we discourage references to RCT as a "gold standard?" It is a very fine tool, among many. It provides the best answer to a very specific question. But to get the right answer, we must ask the right question, and for many significant clinical (as well as social science) questions, RCT may not be the best tool at all. — Preceding unsigned comment added by 75.39.140.235 (talk) 14:59, 13 August 2013 (UTC)[reply]
Why are there empty sub-headings? Guardian 00:15, 14 July 2006 (UTC)[reply]
Which to use in RCT, controlled vs. control, is debatable. The number of Google hits is 10x larger for the former, so we should name it first as it is more widely accepted and understood. —The preceding unsigned comment was added by Mebden (talk • contribs) 10:16, 10 January 2007 (UTC).[reply]
Just noticed that the whole section under Urn randomisation is an exact copy of the section before. DanHertogs 15:22, 19 April 2007 (UTC)[reply]
Challenging content: In the intro paragraphs, I don't believe randomisation ensures equal allocation; you can still have unequal allocation of confounding factors if you are unlucky. Do others agree? (I have made no edits.)
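The point about being "unlucky" can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 30% prevalence of the confounding factor, the 15-percentage-point imbalance threshold, the trial sizes, and the function name are all illustrative choices, not figures from the article.

```python
import random


def chance_imbalance_rate(n_patients, prevalence=0.3, threshold=0.15,
                          n_trials=1000, seed=0):
    """Fraction of simulated trials in which a baseline factor differs
    between two arms by at least `threshold` (absolute proportion).

    All parameter defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    imbalanced = 0
    for _ in range(n_trials):
        arms = ([], [])
        for _ in range(n_patients):
            has_factor = rng.random() < prevalence
            # Simple (unrestricted) randomisation: a fair coin per patient.
            arms[rng.randrange(2)].append(has_factor)
        if not arms[0] or not arms[1]:
            continue  # degenerate allocation with an empty arm; skip it
        gap = abs(sum(arms[0]) / len(arms[0]) - sum(arms[1]) / len(arms[1]))
        if gap >= threshold:
            imbalanced += 1
    return imbalanced / n_trials


small = chance_imbalance_rate(100)    # a small trial
large = chance_imbalance_rate(1000)   # a larger trial
print(f"share of imbalanced trials, n=100:  {small:.3f}")
print(f"share of imbalanced trials, n=1000: {large:.3f}")
```

Under these assumptions, a noticeable share of small trials ends up imbalanced on the factor purely by chance, and the share shrinks as the trial grows. Randomisation makes imbalance unsystematic; it does not guarantee it away in any single trial.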
Should there be a section discussing HOW random the random allocations are? There are big statistical differences between sampling a stochastic process, using a pseudorandom number generator with adequate cycle length, and calling RAND() in Excel ... And what impact on Phase I, II or III trials might these differences have? Is it standard practice for study designers to describe randomization techniques? Daen (talk) 14:17, 19 November 2008 (UTC)[reply]
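To make the question concrete, here is a minimal sketch of how an allocation list can be generated from a seeded pseudorandom number generator so that it can be regenerated and audited later. It uses permuted-block randomisation with Python's `random.Random` (a Mersenne Twister generator); the seed, block size, arm labels, and function name are assumptions for illustration and do not describe any particular trial's procedure.

```python
import random


def permuted_block_allocation(n_participants, block_size=4,
                              arms=("A", "B"), seed=20081119):
    """Return a reproducible allocation list using permuted blocks.

    Each block contains every arm equally often, in a shuffled order.
    The default seed and block size are arbitrary, hypothetical values.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded, so the list can be regenerated
    per_arm = block_size // len(arms)
    allocation = []
    while len(allocation) < n_participants:
        block = list(arms) * per_arm
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]


schedule = permuted_block_allocation(20)
print("".join(schedule))
print("arm A:", schedule.count("A"), "arm B:", schedule.count("B"))
```

Recording the generator, seed, and block size in the protocol is one way a study designer could describe the randomisation technique. Whether a given generator is adequate for a given trial is a separate question that this sketch does not settle.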
This article should mention the difficulties associated with having to randomise clusters of individuals. An example is when a single intervention must be used for all subjects at a particular location for some reason of practicality, e.g. interventions are methods of care which are randomised to clinics or hospitals, therefore patients are clustered. I've just created cluster randomised controlled trial to delve into that topic more deeply, as time permits. Tayste (edits) 22:56, 21 February 2010 (UTC)[reply]
This is actually a challenging topic. I suggest referencing the "status quo" methods with a real-world example for these complex topics (clusters and correlation). ANOVA is often used to ask such questions: "what is the measured variance within a group vs. the measured variance between groups?" Even with these groupwise statistical tests, the problem is at least as hard as figuring out a way to "measure" the similarity between a single patient sample observation and the "expected" observation. The more criteria we add, the more we run into multiple-hypothesis problems (Type I errors). In a nutshell: even the simple problem is hard, and it becomes much harder when you consider the vast number of ways to "measure" the "distance" between two patient "samples".
Correlation is a very broad topic. Are you trying to correlate (and thereby cluster) patient samples, study features, or whole populations? I have had to review these issues as part of an informatics doctoral thesis. The more complex the method, the less likely it is to be adopted in an RCT (or clinical practice).
http://web.psych.unimelb.edu.au/jkanglim/ANOVAandMANOVA.pdf http://en.wikipedia.org/wiki/Anova http://en.wikipedia.org/wiki/F-test http://en.wikipedia.org/wiki/Crossover_study
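The within-group versus between-group comparison mentioned above can be sketched as a one-way ANOVA, from which an intraclass correlation (ICC) and the usual cluster design effect, 1 + (m - 1) × ICC for clusters of equal size m, follow. The clinic measurements below are made-up numbers for illustration only, and the function names are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (between-group mean square, within-group mean square, F)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


def anova_icc(ms_between, ms_within, cluster_size):
    """ANOVA estimator of the ICC, assuming equally sized clusters."""
    return (ms_between - ms_within) / (
        ms_between + (cluster_size - 1) * ms_within)


# Hypothetical outcome measurements from three clinics, four patients each.
clinics = [
    [5.1, 4.9, 5.3, 5.0],
    [6.2, 6.0, 6.4, 6.1],
    [4.0, 4.2, 3.9, 4.1],
]
m = len(clinics[0])
msb, msw, f_stat = one_way_anova(clinics)
icc = anova_icc(msb, msw, m)
design_effect = 1 + (m - 1) * icc
n_patients = sum(len(c) for c in clinics)

print(f"F = {f_stat:.1f}, ICC = {icc:.3f}, design effect = {design_effect:.2f}")
print(f"effective sample size = {n_patients / design_effect:.1f} of {n_patients}")
```

When patients within a clinic resemble each other strongly, as in this contrived data, the effective sample size is much closer to the number of clinics than to the number of patients. That is the practical difficulty with randomising clusters.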
Should the article have a section on how to estimate the sample size that would be necessary in order to detect an effect of a given size? Or is that covered somewhere else in WP? Tayste (edits) 23:14, 21 February 2010 (UTC)[reply]
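For reference, a rough sketch of the usual normal-approximation calculation for comparing two means with equal-sized arms: n per arm = 2 (z_{1-α/2} + z_{power})² / d², where d is the standardised effect size (difference in means divided by the common standard deviation). The function name and the default α and power are illustrative assumptions, and real trials often refine this (for example with the t distribution or allowances for dropout).

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect a standardised
    mean difference `effect_size` with a two-sided test at level `alpha`.

    Normal approximation, equal allocation, equal variances assumed.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


medium = n_per_arm(0.5)  # a "medium" standardised effect
small = n_per_arm(0.2)   # a "small" standardised effect
print("per arm for d = 0.5:", medium)
print("per arm for d = 0.2:", small)
```

The required sample size grows with the inverse square of the effect size, so halving the detectable effect roughly quadruples the number of participants.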
The brackets in the opening paragraph make the introduction to this article read really, really badly. I'm going to try to make it a bit better. Tkenna (talk) 00:22, 6 May 2012 (UTC)[reply]
This is one of the most important medical articles in Wikipedia. People can't understand medicine without understanding randomized, controlled trials. If people can't understand the introduction, they'll never get through the rest of the entry.
And yet this reads like an academic paper, written for people who already know what a randomized controlled trial is, written to show off how many polysyllabic medical terms the writer knows. It defines "randomized controlled trial" with the term "clinical trial." If a reader doesn't know what a randomized controlled trial is, they're not likely to know what a clinical trial is either. Before you use a term like "clinical trial," you have to explain what it is. In fact, most readers who don't know what a randomized controlled trial is won't know what a "scientific experiment" is either. This introduction has to be completely rewritten in plain English, preferably with one or more WP:RSs to support it. I would suggest seeing how professional writers, like New York Times reporters, have done it, rather than trying to create a definition out of your own head.
WP:NOTJOURNAL "Scientific journals and research papers. A Wikipedia article should not be presented on the assumption that the reader is well versed in the topic's field. Introductory language in the lead and initial sections of the article should be written in plain terms and concepts that can be understood by any literate reader of Wikipedia without any knowledge in the given field before advancing to more detailed explanations of the topic. While wikilinks should be provided for advanced terms and concepts in that field, articles should be written on the assumption that the reader will not or cannot follow these links, instead attempting to infer their meaning from the text." --Nbauman (talk) 06:31, 24 March 2014 (UTC)[reply]
RCT may also refer to randomized clinical trials. How can we incorporate this information? Which article would be the better place for it? Is this the right article? --Abhijeet Safai (talk) 06:12, 25 November 2012 (UTC)[reply]
In the article a RCT and an RCT are used concurrently. Which one should be used? --130.229.5.232 (talk) 14:57, 30 April 2015 (UTC)[reply]
Currently this excellent article contains a serious error. It states that "groups receiving the experimental treatment are compared with control groups receiving no treatment (a placebo-controlled study) or a previously tested treatment (a positive-control study)."
A no treatment or wait list group is not a placebo group. A wait list group is a group that is treated (for ethical reasons) after the wait period. By way of contrast, a placebo group is a group that receives an inert substance or treatment, not no treatment. This engages the placebo response, the treatment effect of receiving something that elicits the expectation effect. The placebo effect is substantial, eliciting improvement of over 30% in most studies.
One other minor quibble: in my field (behavioral psychology), what the authors refer to as "a positive-control study" is usually called an "active treatment." Grenheldas (talk) 04:08, 6 May 2016 (UTC)[reply]
Dr. Peters has reviewed this Wikipedia page, and provided us with the following comments to improve its quality:
General comment:
The article mostly deals with medical RCTs, which is of course fine since most RCTs have taken place in the medical sector. Yet in some parts, for example external validity, arguments are made that apply to social science studies but not to medical studies. It would thus be better to structure the article in a way that makes this more obvious. Two solutions are possible: either address the differences between medical and social science RCTs at the beginning and then discuss social and medical RCTs side by side in each section, or split the article in two parts, with medical RCTs at the beginning and a section on social science RCTs at the end (as is done to some extent at the moment, but more clearly).
Update to section 10: Randomized controlled trials in social science: RCTs have recently gained attention in the social sciences. In the field of economics, for example, a shift from theoretical studies to empirical work, particularly experiments, can be noted over the last decades (Hamermesh 2013). While the method is the same as in medical research, conducting RCTs in order to evaluate policy measures differs from medical RCTs when it comes to implementation. Several researchers have discussed these issues, which include, for example, choosing the right level of randomization, data collection, or alternative randomization techniques (see, for example, Glennerster and Takavarasha 2013 or Duflo et al. 2008). Although RCTs have improved the internal validity of studies in the social science disciplines in the last decade by minimizing selection bias, they struggle with external validity, more so than medical RCTs, since issues like general equilibrium effects do not occur in medical RCTs. A recent systematic review by Peters, Langbein and Roberts (2016) analyzed 94 articles published in top economics journals between 2009 and 2014 and found that a majority of studies do not take external validity issues into account properly.
Update to section 10.2: International development section: RCTs have been applied to a number of topics throughout the world. A prominent example is the PROGRESA evaluation in Mexico, where conditional cash transfers were found to be beneficial on a number of levels for rural families and, based on the results of the RCT, the government introduced this as a policy (studies using PROGRESA are, among others, Attanasio et al. (2012) or Gertler (2004)). Other domains with evidence from a large array of interventions in developing countries include, among others, health (for example Miguel and Kremer 2003 or Dupas 2014), the (micro-)finance sector (for example Tarozzi et al. (2014) or Karlan et al. (2014)) or education (for example Das et al. (2013) or Duflo et al. (2011, 2012)).
Update to section 10.4: Education section: One of the first RCTs in social science worldwide was the STAR experiment, which was started in 1985 and designed to determine the effect of small classes on short- and long-term pupil performance (for example Chetty et al. 2011).
Literature:
Attanasio, O., Meghir, C. and Santiago, A. (2012). 'Education Choices in Mexico: Using a Structural Model and a Randomized Experiment to Evaluate PROGRESA', Review of Economic Studies, 79(1): 37-66.
Chetty, R., Friedman, J. N., Hilger, N., Saez, E., Whitmore Schanzenbach, D. and Yagan, D. (2011). 'How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project Star', Quarterly Journal of Economics, 126(4): 1593-1660.
Das, J., Dercon, S., Habyarimana, J., Krishnan, P., Muralidharan, K. and Sundararaman, V. (2013). 'School Inputs, Household Substitution, and Test Scores', American Economic Journal: Applied Economics, 5(2): 29-57.
Duflo, E., Dupas, P. and Kremer, M. (2011). 'Peer Effects, Teacher Incentives, and the Impact of Tracking: Evidence from a Randomized Evaluation in Kenya', American Economic Review, 101(5): 1739-1774.
Duflo, E., Glennerster, R. and Kremer, M. (2008). 'Using randomization in development economics research: a toolkit', in (P. Schultz and J. Strauss, eds.), Handbook of Development Economics: 3895-3962, Amsterdam: North Holland.
Duflo, E., Hanna, R. and Ryan, S. P. (2012). 'Incentives Work: Getting Teachers to Come to School', American Economic Review, 102(4): 1241-1278.
Dupas, P. (2014). 'Short-Run Subsidies and Long-Run Adoption of New Health Products: Evidence from a Field Experiment', Econometrica, 82(1): 197-228.
Gertler, P. (2004). 'Do Conditional Cash Transfers Improve Child Health? Evidence from PROGRESA’s Control Randomized Experiment', American Economic Review, 94(2): 336-341.
Glennerster, R. and Takavarasha, K. (2013). 'Running randomized evaluations – a practical guide', Princeton University Press: Princeton and Oxford.
Hamermesh, D. S. (2013). 'Six Decades of Top Economics Publishing: Who and How?', Journal of Economic Literature, 51(1): 162-172.
Karlan, D., Osei, R., Osei-Akoto, I. and Udry, C. (2014). 'Agricultural Decisions after Relaxing Credit and Risk Constraints', Quarterly Journal of Economics, 129(2): 597-652.
Miguel, E. and Kremer, M. (2003). 'Worms: Identifying Impacts on Education and Health in the Presence of Treatment Externalities', Econometrica, 72(1): 159-217.
Peters, J., Langbein, J. and Roberts, G. (2016). 'Policy Evaluation, Randomized Controlled Trials, and External Validity – A Systematic Review', Economics Letters, forthcoming. Discussion paper version published as: Ruhr Economic Papers 589: RWI.
Tarozzi, A., Mahajan, A., Blackburn, B., Kopf, D., Krishnan, L. and Yoong, J. (2014). 'Micro-loans, Insecticide-Treated Bednets, and Malaria: Evidence from a Randomized Controlled Trial in Orissa, India', American Economic Review, 104(7): 1909-1941.
We hope Wikipedians on this talk page can take advantage of these comments and improve the quality of the article accordingly.
We believe Dr. Peters has expertise on the topic of this article, since he has published relevant scholarly research:
ExpertIdeasBot (talk) 16:05, 24 August 2016 (UTC)[reply]
The article entitled "Why all randomised controlled trials produce biased results" - which has been repeatedly added despite being taken down several times - is riddled with errors and not a wise addition to this page. It has been widely panned by experts in trial methodology for several deeply inaccurate and misleading statements. Specifics include:
- complete misunderstanding of the purpose and necessity of achieving "balance" in trials, ignoring extensive historical literature on the subject from world leaders in this area:
Altman DG. Comparability of randomised groups. The Statistician 1985; 34, 125-136.
Senn SJ. Testing for baseline balance in clinical trials. Statistics in Medicine 1994; 13:1715–1726
Senn SJ. Baseline balance and valid statistical analyses: common misunderstandings. Applied Clinical Trials 2005; 14:24–27.
Senn SJ. Seven myths of randomization in clinical trials. Statistics in Medicine 2013; 32: 1439-1450.
- suggestion of "re-randomisation" in pursuit of greater balance that (putting aside the above statement about misunderstanding the need for balance) would be impossible to implement for the majority of RCTs (most major trials take several years to enroll their full study cohort; it is neither feasible nor desirable to wait until the entire cohort is enrolled to begin treating patients) and also ignores the purpose of "randomisation" as well as better strategies to achieve balance such as covariate-adaptive randomisation and minimisation
- discussion of "simple-treatment-at-the-individual-level" limitation ignores evolutions in trial design, such as I-SPY 2, that allow multiple comparisons of complex treatment combinations across and within specific patient subgroups. Appears to be entirely unaware of advancing literature in this area.
- discussion in "small sample bias" section makes a hilariously wrong statement about probabilities:
"An example is that the stroke trial with 624 participants reports that at 3 months after the stroke, 54 treated patients died compared to 64 placebo patients. This main outcome is the same likelihood as getting 10 more heads than tails by flipping a coin 624 times. "
This is not even close to being correct.
The coin-flip scenario refers to the probability of getting 317 heads in 624 tosses of a fair coin, which is a relatively simple problem to compute based on a series of Bernoulli trials with p=0.5 for a head on a single toss; the probability of getting exactly 317 heads in 624 tosses is about 2.9 percent and the probability of getting 317 or more heads is about 35.9 percent.
The probability of the observed results in the stroke trial is a more complex calculation with several additional parameters to estimate: the probability that we would observe the event rate in one treatment arm (54 deaths in 312 patients) versus the event rate in the other treatment arm (64 deaths in 312 patients) under the null hypothesis that the two arms share an unknown success probability (and that’s before we account for the timing of events as well).
If one must have a simplified analogy, it is somewhat closer to the probability of getting 54 sixes on 312 rolls of the treatment die versus 64 sixes on 312 rolls of the placebo die; although still not quite correct, that would have been a much closer description.
It is highly distressing to see such a misunderstanding of the probabilities used to assess trial results on display in an article by someone concerned with improving trial methodology.
- discussion of the unique time period assessment ignores statistical techniques specifically designed to analyze unequal follow-up time
- discussion of the background-traits-remain-constant assumption ignores substantive literature on mediation analysis in clinical trials to determine whether changes in participant behaviors explain the presence or absence of a treatment effect
- discussion of "average treatment effects limitation" ignores the existence of adaptive-enrichment designs and other innovations in trial design that derive personalized estimates for efficacy from a larger trial's results.
- misleading statement about a trial's results only "generalizing to 3 percent of the US population" - the trial in question was targeted at a specific population of people at high risk for diabetes; there is no concern whether the findings apply to newborn infants, kindergartners or 99-year-olds (all of whom are also in the "US population" but have no need for this trial program). This is like criticizing a breast cancer trial's results for not being generalizable to men. — Preceding unsigned comment added by 128.147.197.37 (talk) 20:30, 14 May 2018 (UTC)[reply]
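The coin-flip figures quoted in the "small sample bias" point above can be checked directly, and set beside a simple comparison of the two trial arms. This is a sketch only: the pooled two-proportion z-test is used here as a simplification that ignores the timing of events mentioned above, and the variable names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

# --- Coin-flip scenario: 10 more heads than tails in 624 fair tosses ---
n_tosses = 624
heads = 317  # 317 heads and 307 tails differ by 10

# Exact binomial probabilities for a fair coin, using integer arithmetic.
p_exact = comb(n_tosses, heads) / 2 ** n_tosses
p_at_least = sum(comb(n_tosses, k)
                 for k in range(heads, n_tosses + 1)) / 2 ** n_tosses

print(f"P(exactly 317 heads)  = {p_exact:.3f}")
print(f"P(317 or more heads)  = {p_at_least:.3f}")

# --- Stroke trial: 54/312 deaths (treated) vs 64/312 deaths (placebo) ---
deaths_treated, deaths_placebo, n_arm = 54, 64, 312

# Pooled two-proportion z-test under the null hypothesis that both arms
# share one unknown death probability (an approximation, see lead-in).
pooled = (deaths_treated + deaths_placebo) / (2 * n_arm)
std_err = sqrt(pooled * (1 - pooled) * (2 / n_arm))
z_stat = (deaths_placebo - deaths_treated) / n_arm / std_err
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"z = {z_stat:.2f}, two-sided p = {p_value:.2f}")
```

The two calculations answer different questions: the first concerns a single sequence of fair coin tosses, the second compares two samples that share an unknown event rate, which is why the coin-flip analogy does not describe the trial result.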
The Krauss article on which text in the Wikipedia article is based is ill-informed and completely invalid. It will not only confuse the readers of the Wikipedia article but will cause harm because readers will be left with the idea that rigorous randomized experiments have shortcomings that they simply do not have. Krauss has no training that qualified him to write the Annals of Medicine article in the first place. For blatant misunderstandings to be published when virtually all of us who have been doing research on clinical trials methodology for decades fundamentally disagree with Krauss is hard to understand. It is wrong for Wikipedia to perpetuate ideas that should never have been accepted in a peer-reviewed journal in the first place. Harrelfe (talk) 13:34, 19 May 2018 (UTC)[reply]
You apparently did not see the reference above (https://www.bmj.com/content/361/bmj.k1561/rr) to the letter to the editor that has been published about this article. The letter is written by individuals who have studied randomized clinical trials in detail. Though one might legitimately disagree about claims against published literature in general, the article in question has been refuted by two of the premier medical statisticians in the world, Altman and Senn, whose credentials are impeccable. The quality of Krauss' article is the same as an article on economics that I as a biostatistician would write. Harrelfe (talk) 19:34, 19 May 2018 (UTC)[reply]
(Note: the following was posted on my talk page by someone unfamiliar with Wikipedia and with where we discuss things. I am moving it here. --Guy Macon (talk) 14:20, 20 May 2018 (UTC))[reply]
A comment for the wiki talk:
I am not familiar with how wiki talks work - I just read wiki articles. A colleague told me about this wiki talk that you have contributed to: https://en.wikipedia.org/wiki/Talk%3ARandomized_controlled_trial#Krauss_article_referenced_under_"Disadvantages". It is worth mentioning that Krauss (the author of the article) responded to the response by Senn and Altman, as seen here: https://www.bmj.com/content/361/bmj.k1561/rr-0 Also, the Krauss article was peer-reviewed and published in the journal Annals of Medicine. The responses on that piece from Senn and Altman (https://www.bmj.com/content/361/bmj.k1561/rr) and in turn from Krauss (https://www.bmj.com/content/361/bmj.k1561/rr-0) and other comments were, however, not peer-reviewed and are not 'articles' but just replies or comments that anyone can submit and that do not go through any peer review process. I think the reference to Krauss's article should remain on that wiki page, because the sentence from the article included on the wiki page is factual, and because the Krauss article has gone through the peer-review process and been published in the journal Annals of Medicine, while none of the other comments or responses have gone through the peer-review process or been published - they are replies that anybody can submit and have uploaded on BMJ's website. I hope you agree. — Preceding unsigned comment added by 2A02:908:1A7:6D40:E999:52BC:6989:33A9 (talk) 13:33, 20 May 2018 (UTC)[reply]
The fact that Krauss answered the letter to the editor is of no consequence. He's still very much mistaken even though he raised a couple of points that are correct. Are you saying that you will trust the research of someone not trained in clinical trials over Senn and Altman who have a combined 80 years of direct experience in researching and writing about clinical trials and have been involved in the conduct of dozens of clinical trials? And the peer-review process at Annals of Medicine was highly defective. Annals of Medicine made a major blunder in accepting a highly inaccurate article in which the author had major misunderstandings about how clinical trials work. That paper would never have been accepted in Annals of Internal Medicine or other major medical journals. I've been directly involved in clinical trials for 40 years myself so that makes 120.
Put another way, the articles written about randomized clinical trials by knowledgeable writers in respected journals number more than 10,000. Why should the Krauss article be cited instead of these? And if you want to know about one of the major mistakes Krauss makes, read my blog article explaining why randomized clinical trials are more generalizable than even optimists believe. My qualifications are listed here. Harrelfe (talk) 00:29, 21 May 2018 (UTC)[reply]
A separate issue is that, if you look at the original posting user's contribution history, you'll find that all of his edits involve inserting citations to articles by Alexander Krauss, including this one. Seems like a pretty obvious case of WP:CITESPAM. The user hasn't engaged in any edit warring, but is inserting what are pretty clearly self-references willy-nilly, so I'm not quite sure how to handle this situation. WeakTrain (talk) 17:02, 15 July 2018 (UTC)[reply]
Is it worth pointing out that the single citation to Krauss has been re-added, under criticisms? Thank you. 77.75.244.17 (talk) 11:32, 9 March 2021 (UTC)[reply]
I'm having a hard time understanding the chart at the beginning of the article.
The left and right sides after "randomized" seem identical. I would have thought that one side would have received the intervention and the other side would not have received it (whatever the intervention was, getting a placebo or getting the drug under test, for example).
So this chart is confusing. Or confused and wrong. At the very least (if it is not wrong) it needs a better explanation. Bill Jefferys (talk) 03:14, 29 May 2019 (UTC)[reply]
All the sources below from "Further reading" are 12 or more years out of date. Archiving here. Zefr (talk) 20:43, 18 January 2021 (UTC)[reply]