The great replication crisis

There’s been a lot of commentary on a recent study by the Replication Project that attempted to replicate 100 published studies in psychology, all of which found statistically significant effects of some kind. The results were pretty dismal. Only about one-third of the replications observed a statistically significant effect, and the average effect size was about half that originally reported.

Unfortunately, most of the discussion of this study I’ve seen, notably in the New York Times, has missed the key point, namely the problem of publication bias. The big problem is that, under standard 20th century procedures, research reports will only be published if the effect observed is “statistically significant”, which, broadly speaking, means that the average value of the observed effect is more than twice as large as the estimated standard error. According to standard classical hypothesis testing theory, the probability that such an effect will be observed by chance, when in reality there is no effect, is less than 5 per cent.

There are two problems here, traditionally called Type I and Type II error. Classical hypothesis testing focuses on holding Type I error, the possibility of finding an effect when none exists in reality, to 5 per cent. Unfortunately, when you do lots of tests, you get 5 per cent of a large number. If all the original studies were Type I errors, we’d expect only 5 per cent to survive replication.

In fact, the outcome observed in the Replication Study is entirely consistent with the possibility that all the failed replications are subject to Type II error, that is, failure to detect an effect that is there in reality.

I’m going to illustrate this with a numerical example[^1].

Suppose each of the 100 studies was looking at a treatment (an intervention or change of some kind) which shifts some variable of interest by 0.1 standard deviations (in the context of IQ test scores, for example, this would be a shift of 1.5 IQ points). Suppose the population parameters in the absence of treatment are known, and we have a sample of 225 treated subjects. We’d expect the sample mean obtained in this way to be, on average, 0.1 standard deviations higher than the value for the population at large. But the sample mean is itself a random variable, with a standard deviation equal to the population standard deviation divided by sqrt(225) = 15. That is, if we normalize the population distribution to have mean zero and standard deviation 1, the sample mean will have mean 0.1 and standard deviation 1/15 ≈ 0.067. That in turn means that about 30 per cent of the observed sample means will exceed twice the standard error, which is roughly the level required for statistical significance.
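The arithmetic in this paragraph can be checked directly; a minimal sketch in Python, using only the numbers assumed above:

```python
# Power of the test in the worked example: true effect 0.1 SD, n = 225,
# "significant" meaning the sample mean exceeds two standard errors.
from statistics import NormalDist

n = 225
se = 1 / n ** 0.5            # standard error of the sample mean = 1/15 ≈ 0.067
true_effect = 0.1            # true shift, in population standard deviations
cutoff = 2 * se              # rough 5 per cent significance threshold

# Probability that a Normal(0.1, se) sample mean clears the cutoff
power = 1 - NormalDist(mu=true_effect, sigma=se).cdf(cutoff)
print(round(power, 2))       # ≈ 0.31, the "about 30 per cent" in the text
```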

Under best-practice 20th century procedure, the experimenters would report the effect if it passes the standard test for statistical significance, and dump the experiment otherwise[^2]. The resulting population of reported results will have an average effect size of around 0.2 population standard deviations[^3].
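The truncated-mean calculation that footnote 3 asks about can in fact be done exactly; a sketch under the same assumed numbers (true mean 0.1, standard error 1/15, cutoff at two standard errors):

```python
# Mean reported effect = E[sample mean | significant], the mean of a normal
# truncated at the cutoff: mu + sigma * phi(a) / (1 - Phi(a)),
# where a = (cutoff - mu) / sigma.
from math import exp, pi, sqrt
from statistics import NormalDist

mu, sigma = 0.1, 1 / 15
cutoff = 2 * sigma
a = (cutoff - mu) / sigma                  # = 0.5
phi = exp(-a * a / 2) / sqrt(2 * pi)       # standard normal density at a
tail = 1 - NormalDist().cdf(a)             # upper-tail probability at a
print(round(mu + sigma * phi / tail, 3))   # ≈ 0.176
```

So the eyeballed figure of 0.2 is a slight overestimate; the exact conditional mean is closer to 0.18 population standard deviations.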

Now think about what happens when a study like this is replicated. There’s only a 30 per cent chance that the original finding of statistical significance will be repeated. Moreover, the average effect size observed in the replications will be close to the true effect size, which is about half the reported effect size.
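The selection-then-replication story can also be simulated end to end; a minimal Monte Carlo sketch, again using only the numbers assumed in the example:

```python
# Simulate many studies with true effect 0.1 SD and n = 225, "publish" only
# the significant ones, then attempt one replication of each published study.
import random

random.seed(42)
se = 1 / 225 ** 0.5
true_effect, cutoff, trials = 0.1, 2 * se, 100_000

def study():
    # One study's sample mean: Normal(true_effect, se)
    return random.gauss(true_effect, se)

published = [m for m in (study() for _ in range(trials)) if m > cutoff]
mean_reported = sum(published) / len(published)
replication_rate = sum(study() > cutoff for _ in published) / len(published)

print(f"share significant:    {len(published) / trials:.2f}")  # ≈ 0.31
print(f"mean reported effect: {mean_reported:.2f}")            # ≈ 0.18, vs true 0.1
print(f"replication rate:     {replication_rate:.2f}")         # ≈ 0.31
```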

I don’t think that all of the failed replications can be explained this way, however. At a rough guess, half of the observed failures were probably Type I errors in the original study, and half were Type II errors in the replication.

The broader problem is that the classical approach to hypothesis testing doesn’t have any real theoretical foundations: that is, there is no question to which the proposal “accept H1 if it would be true by chance only 5 per cent of the time, retain H0 otherwise” represents a generally sensible answer. But, we are stuck with it as a social convention, and we need to make it work better.

Replication is one way to improve things. Another, designed to prevent the kind of tweaking pejoratively referred to as ‘data mining’ or ‘data dredging’, is to require researchers to register the statistical model they plan to use before collecting the data. Finally, and this has been the dominant response in practice, we can disregard the “95 per cent” number associated with classical hypothesis testing theory and treat research findings as a kind of Bayesian update on our beliefs about the issue in question. If we have no prior beliefs one way or the other, a rough estimate is that a finding reported with “95 per cent” confidence is about 50 per cent likely to be right. Turning this around, and adding a little more scepticism, we get the provocative claim of Ioannidis that “most published research findings are false”.
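The “about 50 per cent likely to be right” estimate can be made concrete with an assumed base rate; a sketch in the spirit of Ioannidis (the one-in-seven prior below is my illustrative assumption, not a figure from the post):

```python
# Positive predictive value of a "significant" finding:
#   PPV = power * prior / (power * prior + alpha * (1 - prior))
alpha = 0.05        # Type I error rate of the test
power = 0.30        # power from the worked example above

def ppv(prior):
    """Probability a significant finding is real, given the prior."""
    return power * prior / (power * prior + alpha * (1 - prior))

# If about one in seven hypotheses tested is actually true, a finding
# reported with "95 per cent" confidence is roughly an even-money bet:
print(round(ppv(1 / 7), 2))   # 0.5
```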

[^1]: Which will probably include an error, since I’m prone to them, but a fixable one, since the underlying argument is valid.

[^2]: In reality, a more common response, especially with nearly-significant results, is to tweak the test until it is passed.

[^3]: I eyeballed this because I was too lazy to look up or calculate the truncated mean for the normal, so I’d appreciate it if a commenter would do my work for me.

30 thoughts on “The great replication crisis”

  1. On the other hand there is the approach taken by many in the physical sciences to publish research findings that are ‘interesting’ while leaving ‘interesting’ formally undefined.

  2. Thanks for the article John. Quick comments:

    1. This is after all psychology so maybe you shouldn’t be expecting too much. (Just kidding?) After all, like economics, many of their research directions explore VERY vague semi-qualitative concepts which are not defined based on a complete economics or psychology ‘Atomic Theory’. This is reflected in the plethora of different ‘schools of thought’ – like the Freudians v. the Skinnerian rat psychologists, both of whom claimed to have a solution for Oedipus’ love of his mother – which you see with psychologists for even basic concepts such as ‘mental illness’. While we think we know what that means, and loony bin crazy people are certainly crazy in a commonsense fashion, there are so many grey areas – seen when differentiation of environmentally induced, somatic, external agent created, genetic etc. is attempted – that reductionist analysis is bound to have its limitations.

    2. In regards to stats, if you have enough data it is possible to detect highly statistically significant trends, say at P = 0.000000001. But when you look closer you may also see there is so much data that your factor of interest only accounts for, say, < 5% of the uncertainty or variance seen. So the statistical finding, though highly significant, can in the real world be minor/trivial/irrelevant, and in the case of psychology obscure a more fundamental driver which hasn’t been encountered or considered because you start from a crap theoretical grounding.

    Given this situation, which do you publish or emphasize? The 10^-9 significance of course, because that is impressive. So I will be interested, if I get the chance, to have a look at the studies which they tried to replicate and see if they pass the laugh test.

    All this wouldn’t matter if these studies were only just contributing to an evolving meta-paradigm in bits and pieces. But power lies in the ability to turn the finding into a model and from there market your bull&*^% in these times of constrained budgets and be enshrined as a guru/great teacher. This may sound unethical but a 10^-9 significance is the sort of backing that would allow even me to believe in Bigfoot or Nessie.

    3. One of the great things I am currently finding from immersion in Bayes Nets and Bayesian Inference is that the latter squarely frames the problem of what you do with, i.e. infer from, a statistically significant finding, and its limitations.

    Unlike the Frequentists, the great thing about Bayesian thinking, I’ve come to see, is that it confronts this problem of credible modelling and inference and the associated uncertainties. Bayes Net methods now have a bucket of measures of just how good your model is – cf. Marcot, B. G. (2012). Metrics for evaluating performance and uncertainty of Bayesian network models. Ecological Modelling, 230, 50–62. While some disciplines had similar methods, I can assure you that when I learnt my parametric stats the method and philosophy of inference was not dealt with directly.

    More broadly this uncertainty stuff is scary in that it illustrates just how much what we like to think of as rationality and reasoning is problematic.

    It particularly bothers me as I have seen little evidence to date that my superannuation is being managed by people who have a clue about these issues.

    On the other hand, if you can identify a model whose prediction capacity is useful and high, well, you have a winner you can believe in – given today’s paradigms.

    Doublethink rules ok!

    4. In deference to the psychologists, it must be acknowledged in all this that they have been some of the great promoters of serious statistical analysis, and so deserve credit in that regard at least. So the problem you are highlighting may be more to do with the complexity and difficulty of this field.

  3. The problem is journal bias too — in psychology, many of the high impact journals want whizz-bang effects that generate publicity, look flashy, and are usually entirely theory free (e.g., Psychological Science is a prototype journal for this type of paper, and Social Psychology is the prototype discipline that generates this type of study). Psychology papers in Nature and Science follow this pattern even more (with Marc Hauser being the highest profile academic to go down for simply making up data, but the problem is everywhere and there are well known offenders in many areas who haven’t been caught — people just ignore their work if they know, although too bad for graduate students that fail to replicate it. Of the papers in the list in my area, I was able to pick who would and wouldn’t be replicated with almost 100% accuracy). So if you are expected to produce papers in high impact journals and aren’t very theoretically bent (or just egotistical or want a promotion etc.), meaning you can’t come up with works of substance (and many people can’t), it is no surprise what the outcome is.

  4. I had assumed that publication bias was a big part of the problem, the extreme version of which is:
    * 2000 studies are carried out.
    * 5% of them have a statistically significant result (at random).
    * These are the only studies published.

    That seems a much more common scenario than “the effect is real, but a bit smaller than the arbitrary threshold for significance”.
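The extreme scenario in this comment is easy to make concrete; a minimal simulation (illustrative numbers only):

```python
# 2000 null studies, each "significant" by chance with probability 0.05,
# and only the significant ones published: every published result is then
# a Type I error.
import random

random.seed(0)
studies = 2000
significant = sum(random.random() < 0.05 for _ in range(studies))
print(significant)   # around 100 on average, all of them false positives
```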

  5. A second issue with psychology experiments is that the sample sizes are as small as they can get away with, in part because of the expense of having more subjects, and in part—I suspect—because the supervisors of postgrad students coach them on how to play the statistics game. No-one wants to risk finding a null hypothesis is not rejected by the evidence, and for a lot of experiments that is a real risk; as a consequence, people lower the bar a bit, and they do this by keeping sample sizes small, and performing some level of selection of subjects from a larger pool of candidates. This gives a much more homogeneous sample at the expense of not reflecting the population from which that sample is supposedly drawn. Drug trials are notorious for this.

    Having seen first year statistics as taught to psychology students, there is a considerable difficulty that these students typically lack PEB/PES/HSC mathematics of any kind, and some of them avoided mathematics for the final two years of high school. Under these conditions, even otherwise fairly bright psychology students have a very limited appreciation of the subtleties of statistical procedures, and are generally stuck with following procedures as a protocol. I doubt that the concept of confounding variables would be common knowledge among psychology students, and conditional probability would be a tough one for them as well. Clearly some of the students do master these concepts and more power to them for that, but for the majority, I suspect it is asking too much, given their poor backgrounds in mathematics. Heck, probability and statistics are tricky enough for students with a solid mathematics foundation.

    In an ideal world, such students would shine at mathematics and statistics; certainly some of the early 20th century psychologists knew their statistics theory very well, but they tended to have much more rounded educations than the modern school leaver (who chooses psychology at university) in Australia. Unfortunately, and this has been the case for at least a generation, high school students tend to split into those who favour the mathematical subjects (i.e. the so-called hard sciences), and those who avoid them; of those students going to university, the problem is compounded by the pro-mathematics students staying with engineering, physics, and mathematics, while the anti-mathematics students gravitate towards the subjects with a low—or no—bar with respect to prerequisite mathematical competency. So, in the real world, I believe that the psychology experiments would benefit from having a trained statistician involved from the outset. There shouldn’t be any penalty for posing an interesting question which nonetheless gives a null result. Posing a stupid question, on the other hand, shouldn’t get published.

    If there were some journals which set the condition that experiments must meet some particular standard in how they are run, requiring for instance the partnership with a statistician, then those journals would soon become the “Nature” of their disciplines, attracting the best researchers. It wouldn’t stop the problem of inadequate mathematics training of psychology researchers (some, not all), but it would lift the bar on what was necessary for the experimental aspects of psychology research.

    Finally, perhaps we need at least a couple of independent research groups to replicate a result before getting too excited about it. From the way our modern media carry on, every new result is an opportunity to go into overhype mode; the (now for-profit) universities are always seeking fresh meat to feed the PR machine, and this exacerbates the problem of premature announcement of new research results. With the neoliberal agenda for the university sector almost completed, I see no obvious way of putting the Gini back in that bottle.

  6. @Donald Oats
    Last para, first sentence, should read “…independent research groups to reproduce a result…” instead of “replicate”. Reproducing the result means doing the experiment over again, with a different sample, and drawing the same statistical inference. Replication is doing the same experiment with the same sample subjects, following the same procedure, and getting the same result. That is really just a check on the competence of the original researchers. A reproduction of a positive statistical result indicates the sought after effect is robust enough to survive testing of different sample groups—which you would want to be the case, as that’s the point.

  7. If the publication list on a CV had a threat of big red indelible stamps like “FAILED TO REPLICATE”, “METHODOLOGICAL ERRORS” “INCONSEQUENTIAL” this might improve things. However, if the metric of success is number of publications weighted by journal quality that’s what we’re likely to produce. It won’t matter too much that in several years half a researcher’s papers were duds because things will have moved on, research topics have changed, and they’re probably at a new job. What started out as the wonderful human creative capacity for gaming any system becomes incorporated in the norm.

  8. Historical note: Someone once said that university psychology students were ‘the most researched group in the history of science’ or words to that effect, because in the teaching of research methods and stats to psych students, the students themselves were a readily available and generally compliant (if not also malleable) population from which to draw a sample, whether for the lecturers’ and postgrads’ research or for teaching purposes. I saw something of this myself when doing Psych at UQ in the 1960s. Perhaps the replication problems are related to the fact that samples have to be drawn from elsewhere these days, or the students are different?

  9. @Peter Chapman
    In the 1980’s it was evident where I studied: there would be regular fliers requesting subjects for this or that psychology experiment. I did a few for $5 a time, really just “show up” money, but for a poor student it was five pints at the front bar.

    I used to wonder at the wisdom of using psych students as subject in psych experiments. On the one hand, psych students are going to become increasingly familiar with all the little tricks employed by the experimentalist, and can make an educated guess as to what is really going on; the more experiments they are used as subjects in, the more likely this will be the case. On the other hand, as a psych student being subjected to other experimentalist’s experiments, the student learns more about the whole process and what works, what doesn’t. For honours graduates doing experiments as part of their course requirements, there was always an air of desperation in acquiring sufficiently many guinea pigs to make the experiment viable. I would hope that postgrad researchers had funding adequate to support a less biased selection method than trawling student hangouts and lecture groups, but for honours students it would have been the only way.

  10. Medical researchers have a useful institutional addition to the toolkit, in the form of the rigorous meta-analysis. It’s a sensible starting-point for amateurs to follow the rule: “If a Cochrane meta-analysis recommends a medical innovation, it’s a good idea; otherwise ignore it”. Note that this rule cannot work for all medical practitioners, otherwise they would never innovate and there would be nothing for Cochrane to report on.

    Perhaps psychology – and economics – need their own Cochranes.

  11. Aside: The medical researchers also have to deal with the dilemma that beyond a point it is unethical to pursue certainty about an intervention at the expense of the patients it’s being tested on. You interrupt a drug trial if it’s clearly killing the patients; you also interrupt it if it’s clearly curing them.

  12. The publication bias is even worse with replication studies. My daughter was told her psych honours thesis could become a published journal article if it verified the initial study, but it didn’t so it wasn’t!

  13. As I understand it, a statistical significance of 1/20 means that if there is no difference the sample(s) only had a 1/20 chance of occurring. This tells us about the probability of those samples. Whether the result has any practical importance is not addressed. Actually it is better to think about likelihood where 1/20 becomes an odds of 1/7. And probably better still to be Bayesian except that good prior information is often not available (so it falls back to likelihood) and the mathematics becomes too difficult for most of my generation (there are still far too few people with both a sound knowledge of their area of expertise and the mathematics, both of which are really needed). Thus the 1/20 most often does no more than suggest where further work is needed, with the proviso that the subject has sufficient importance to warrant it.
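One way to arrive at the “1/20 becomes an odds of 1/7” remark above (an assumed reading, via the likelihood approach the comment mentions): at the two-sided 5% cutoff z = 1.96, the likelihood of the null hypothesis relative to the best-supported alternative is exp(-z²/2).

```python
# Maximum likelihood ratio against the null at the 5% significance cutoff.
from math import exp

z = 1.96                          # two-sided 5% critical value
likelihood_ratio = exp(-z * z / 2)
print(round(1 / likelihood_ratio, 1))   # ≈ 6.8, i.e. odds of roughly 1/7
```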

    To be evidence the study needs to be repeated preferably in different areas as well as being important. A good example is inequality that produces problems both in education and health (and probably other areas as well) and this has been shown repeatedly. Another is burning fossil fuel that has been shown repeatedly to damage multiple systems such as health and the environment.

    A further problem is that so much statistics is still based on the Gaussian distribution when we are in an era where most things have some element of complexity. Complex systems have emergent behavior and tend to produce data that have very skewed distributions, even power law distributions. In decision making, complexity often seems to be ignored. A further issue is that conventional statistics requires independence. Complex things are not independent.

  14. These days, students get credit points and not money for being participants in experiments.

    It is very difficult to access participants you know. During my undergrad psych degree, during the ’90’s, the second year research methods unit required each student to find 6 participants – not students – and the data collected was given back as a data set that the group could use.

    That seems to be one way of acquiring a unique data set with enough participants from the real world to be analysed using more than a t-test.

    But finding enough participants from the population is not easy, takes a lot of time and the sort of entrepreneurial attitude that psych students don’t usually have. And seriously, doing experiments on people is ‘fraught’ you know.

    In first year psych I did the same Data Analysis course as Engineering students and other students who would require stats later in their degree. Passing this course was difficult for some of my fellow students but the ‘ability’ to do stats was a point of pride for the dept at this institute that was becoming a uni and psychs wanted to be professional like doctors.

    But those students who couldn’t pass were re-directed to a new course that would provide students with a degree equivalent to a social work degree. There were pre-stats prep courses available also to get the math illiterates, like me, up to speed.

    I was lucky enough to have 2 sons at high school who were math whizzes and I just had to ask them things like what does this equal sign with a slanty line through it mean? They were great tutors but I still had to spend more time on the data analysis unit than I did on the other 3 units put together to get the marks that I wanted.

    I think the most interesting thing I have read about psych was by Noah Smith.

    “…psychology itself has no unified theory, at least not yet. Cognitive and social psychology are basically pre-paradigmatic sciences — they produce a huge amount of experimental results, but they don’t fit together into any coherent whole.

    “Social psychologist Walter Mischel jokes that “psychologists treat other people’s theories like toothbrushes – no self-respecting person wants to use anyone else’s.”

    “Psychology, therefore, will be able to furnish econ with a large grab bag of anomalies, but there’s a good chance it will never provide a grand unified theory that will render the rational maximization of classical economics entirely obsolete.”

  15. “Psychology itself has no unified theory, at least not yet”
    Surely it’s worse than that. Psychology has over the years had a large number of unified theories – Freudianism, for example, and behaviourism – all of which have been proved wrong. Its methodologies – Rorschach testing, IQ scores – have been shown to stand on no foundation. Having an idea adopted as a norm in psychology is, if anything, a prima facie argument against its credibility.
    Furthermore, there’s a reason for this; as a discipline it’s never really got over physics envy and is thus unshakeably committed to reifying every personal trait or social wrinkle it stumbles over as being the outcome of a black box/Hollerith-card “human nature”.

  16. @ChrisB

    Never noticed any physics envy.

    Guess I was looking for other things, including the source of the stupidity of the geology students I encountered in my honours course who refused to understand the sociology of science; and then there was the intractability of the engineering students and their refusal to understand their own motivations, and why they needed to employ psych students to edit their theses so that they were legible.

    If you think that ” Freudianism, for example, and behaviourism” are unified theories you are barking mad or an angry old white man who says stuff to annoy people he doesn’t like.

    Whatever dude. You not even funny.

    And you know what? The principles of behaviourism are very useful and are used in a large number of systems that need to encourage people and animals to respond in stereotypical ways for safety and stuff like that.

    And theories proved wrong? wtf – any evidence that might have ‘proved’ – how do you understand proved? – these unified theories wrong just might be wrong according to the great replication crisis and if the proof that these theories are wrong is wrong then what? They could very well be right? No.

  17. And ChrisB, it was def doctor envy not physics envy that psych students have.

    Who earns more, a specialist doctor or a physicist? Who can help people better, a doctor or a physicist? Who can prescribe medicine and diagnose people? Doctors not physicists. Psych students love that sort of thing.

  18. I recently read an article – I’m trying to remember the title of it so I can link to it – that suggested that there are inherent limits to the applicability and usefulness of quantitative, scientific methods to psychology, and that space needed to be preserved for a more literary approach to psychology (such as that taken by Freud, Jung and William James).

    It seems to me that the whole ‘physics envy’ concept is as much a case of physicists’ hubris as it is of social scientists’ envy, i.e. physicists convincing themselves that their discipline’s greater certainty is due to their own superiority, rather than the fact that their subject matter is fundamentally simpler and more predictable than the subject matter of psychology.

  19. @Tim Macknay

    When I started my degree, I enrolled in an Arts degree at a University College, that had been an Institute of Advanced Education the previous year and was a full University when I left 11 years later. At some time during the transition, the psych dept was moved to the science faculty – something to do with science not getting enough money and psych being the cash cow – and students were asked to change their degree from Arts to App Sci.

    Of course I – and I think there was another tragic – refused and so we graduated with a B.A. The pearl grey of the Arts gown was much nicer than that bright blue that App Sci offered.

    And I agree that physics is so much easier in some ways and the problems that come from the subjects of the experimentation process are human beings who are not only difficult to find and coerce into doing things for you but they also absorb into their lives the things you do to them.

    Not to mention the problem of experimenter effects which will do your head in like quantum mechanics if you try to understand it too concretely.

    I always felt unworthy of the time and effort that the participants in my PhD studies put in for me and tried to make their visit to the Uni less anxiety producing and unpleasant than I could see it was for them. My supervisor had the clout to find a Neurologist who coerced these people into coming and being tested and we all reassured each other that there really was a possibility that our research might really help them sometime in the future so it was worth it to bother them.

    And, at the same time I was feeling bad about bothering these people, I had a friend with cancer who had volunteered to participate in on-going study and had been left by the researchers to sit in the waiting room for 4 hours. He only had a few months to live, was not physically well and did not need to spend time being treated like that. 😦

  20. Psychology is fascinating. Using experiments to observe and analyse is fruitful: after all, what is the alternative? However, good experiments cost serious money, even relatively simple ones. The time and effort to get enough participants, selected carefully so as to avoid bias, and so that the test has adequate power, is far from negligible.

    I think that under the circumstances, psychology suffers from trying (too hard) to look like one of the hard sciences like physics, when it patently isn’t like that. In physics, which also suffers from negative results being difficult to publish, a physicist usually has some theoretical jumping off point for a given experiment, meaning they have considerable chance of getting a positive result. There are experiments which are more of an observe and analyse variety, but even there the theory tends to catch up and lend support in a potent mathematical logic way to the result. Psychology isn’t in that kind of situation, and so should expect a lot more experimental studies to have null results; if the study is a properly composed one, and the hypotheses worth examination, it should really be published, IMO.

    Psychology experiments must be like trying to herd cats, I reckon. All the complexities of sentient beings are present, and somehow this must be corralled into something meaningful. Definitely not easy.

    I think the fusion of psychology with neuroscience and biology is one of the most amazing developments of the late 20th century, and will keep a curious eye on what comes from this. This is one corner of psychology in which a theoretical foundation just might be possible, especially as the detailed behaviour of neurons (and other brain cells) are determined. Psychologists can choose to study anything with a brain, basically, or even just a nervous system. By applying some reductive steps and studying the simplest set-ups (say the nematode), psychologists finally have a real chance at uncovering how nervous systems organise and operate. I hope I live long enough to see how far this program can go 🙂

  21. Very nice. I hadn’t thought about it like this before. I guess that is why one study should not completely sway us. And I guess we apply our own version of a Bayesian approach by looking at new studies in the light of whats gone before, and what we already think.

    If there is one thing I know about statistics, its that you must be extremely careful in your interpretation. I like the idea of generating the data randomly, and then analysing it and seeing what you get.

  22. I prefer the “Monkey Magic” theory of human psychology.

    Monkey represents thought, intellect and rationality. Thought is nimble, agile, irrepressible, mischievous and amoral. It is also arrogant. Sandy represents our murderous-ness and Pigsy our gluttony and lust. Tripitaka stands for spirituality, morality and wisdom. Tripitaka seeks to control the others and lead them together to the correct goal. It’s all there. What can psychology add? 😉

    “In the worlds before Monkey, primal chaos reigned.
    Heaven sought order.
    But the phoenix can fly only when its feathers are grown.
    The four worlds formed again and yet again,
    As endless aeons wheeled and passed.
    Time and the pure essences of Heaven,
    the moisture of the Earth,
    the powers of the sun and the moon
    All worked upon a certain rock, old as creation.
    And it became magically fertile.
    That first egg was named “Thought”.
    Tathagata Buddha, the Father Buddha, said,
    “With our thoughts, we make the world.”
    Elemental forces caused the egg to hatch.
    From it then came a stone monkey.
    The nature of Monkey was irrepressible!” – Monkey Intro.

  23. @Donald Oats

    And at another corner of the things psych can and will investigate, is the science of psychogeography that “looks at how strongly the place we’re in can influence how we think and feel.” From the quantum level to the universal, psych just wants to understand the patterns that are to be found and why these patterns in human behaviour can look like fractals?

    I don’t like to think of the not-theories as toothbrushes though. I like to think of them as glass beads some of which are connected by conceptual/narrative ‘threads’; for example effects like Dunning-Kruger and Motivated Cognition are quite robust and the patterns of human behaviour that can be seen in the description of these cognitive processes can be linked to the patterns that Freud saw in his environment. Some of Freud’s stories about why people do what they do are wrong in our environment because our environment has changed.

    Anyway, about ‘psychogeography’; if one thinks about the sense of place that is so strong in aboriginal cultures and in the other people who live on ‘the land’ – although the neo-liberal impulse to get ahead has overruled that ‘instinct’ in some of these people and they will sell anything if the price is right – this is another way ‘psychology’ can be used to create links – stories – that make it possible to understand what we do.

    It’s on Sunday Extra.

    The most valuable knowledge that I personally gained from my undergrad psych course – and this is so indulgently off topic – was that one could ‘research’ things, almost anything; that was before the internet, which developed while I was at Uni. But mostly it became obvious how complicated, and fundamentally unable to be understood fully, life is, but that it doesn’t matter if you keep aiming for truth/objectivity even if it is ‘just a horizon value’.

    So, that is probably why it seems to me that intro psych and basic stats could be like the Arts degree once was and provide a set of skills that are very useful for negotiating one’s way through the diversity that is human nature/culture.

    Ikon did you watch the TV series The Samurai with Shintaro the ninja?

  24. @Donald Oats

    Psychology experiments must be like trying to herd cats, I reckon. All the complexities of sentient beings are present, and somehow this must be corralled into something meaningful. Definitely not easy.

    Cats are easy. 🙂

    A friend of mine was doing a Ph.D. and wanted to do some research with school kids in school settings (ten-year-olds, IIRC). Nothing to it.

    Get usable idea, sell idea to committee/advisor

    Find some funding

    Get approval from the ethics people

    Prepare testing and communication material

    Negotiate with local school board at staff and board levels. Remember you will need your advisor to sit in on several (most?) meetings

    School boards usually only meet once a month so hope your pitch is good the first time.

    Individually contact parents of all suitable children and get written permission for their participation. Pray enough children are available.

    Run study–remember you probably only have perhaps 4 or 5 months in a year to do the data gathering.

    Work around sports and cultural activities at the several schools.

    Hope that children will cooperate and no flu epidemics strike during the data gathering periods.

  25. The way I remember it, it should be

    The nature of Monkey was —

    But I always wondered:
    How does an eon wheel?
