I’ve been reading Steven Poole’s Unspeak, and he observes that, having introduced a five-level color-coded terror alert, the government has never used the top level (red) or the bottom two levels (blue and green). The obvious reason is that a red alert would require some specific action, while a move to a blue or green level would imply that there was some prospect of the War on Terror actually ending.
I’ve noticed much the same phenomenon with 5-point grading scales for worker performance, such as those used in the Australian Public Service for a while. A top score (a 1) suggests a requirement for some kind of substantial reward, so these are rare, while a score of 4 or 5 implies a need for counselling and a possibility of dismissal. So just about everyone gets a 2 or a 3, yielding, in effect, a two-point scale.
I imagine someone in psychometrics must have studied this kind of thing in general. Any pointers?
Update: James Joyner at Outside the Beltway made the same point a couple of years ago.
Yet further update: One day after I posted this, the Red Alert level has finally been used, but apparently only for commercial flights from Britain to the US, in response to the announcement by British authorities that they have detected a terrorist threat to blow up planes.
I have the same feeling about those 1-5 or 1-10 “How did we do?” ratings you have to fill in after a training course. Zero or one means you have to explain why they did so badly and how they could improve it (and at the end of the course I just want to get out of the place); 5 or 10 means they were perfect (and nobody is) so my answers are always 50% plus or minus one or two.
I studied human resource management at uni and recall the tendency to score in the middle as being an issue in performance appraisal, along with recency (the score is biased towards the most recent interactions rather than balanced over the full appraisal period), and subjectivity (assessment of work performance being clouded by personal like or dislike for the individual). I think “Middling” is mostly attributed to supervisors and employees paying lip service to a process they see as forced on them by the organisation. When there is no real engagement in the process during the appraisal cycle, there is no justification for an unusual result at the end.
I know that such phenomena have been mentioned on multiple occasions during my psychology lectures, and I know I actively wondered whether it influenced my results in my third year psych project last year, but I cannot remember the name for the phenomenon. Something like “edge avoidance”. But I’ve never attended enough psychology lectures.
Some U.S. universities have a 5 level grading system (A, B, C, D, and F; no pluses or minuses) but the vast majority of grades end up being A or B.
Although this 2-level grading seems to be inefficient, if you have enough draws of a dichotomous system you can distinguish students fairly finely. That’s the theory, anyway.
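To put rough numbers on the "enough draws" theory, here is a minimal sketch. It assumes two hypothetical students whose chances of an A in any one course are 0.7 and 0.6 (made-up figures, not from any real grading data), treats courses as independent, and computes exactly how often the stronger student ends up with strictly more A's after n courses.

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of exactly k A's in n independent courses, each with chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_stronger_ahead(n, p_strong=0.7, p_weak=0.6):
    """Chance the stronger student has strictly more A's after n courses.

    p_strong and p_weak are illustrative assumptions, not measured values.
    """
    total = 0.0
    for a in range(n + 1):      # A's earned by the stronger student
        for b in range(a):      # strictly fewer A's for the weaker student
            total += binom_pmf(n, a, p_strong) * binom_pmf(n, b, p_weak)
    return total

for n in (1, 4, 40):
    print(n, round(prob_stronger_ahead(n), 3))
```

With a single course the two are separated only when one gets an A and the other a B; the separation improves as courses accumulate, which is all the theory claims.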
Also, along the same lines as Aidan’s point: students’ evaluations of university teaching/teachers. There must be some analysis of this case somewhere around, but I have no references handy.
I did not keep any of my texts from uni but googled this just now, “Understanding Performance Appraisal: Social, Organizational, and Goal-Based Perspectives” by Kevin Murphy and Jeanette Cleveland and published by Sage. The table of contents sounds about right and I would think there is a lot of literature in the HR field about the weaknesses as well as strengths of various appraisal systems. Not sure about other fields but I agree there would be similar material about student evaluations, training session feedback, etc.
We just had the new report cards for Victorian primary schools arrive – and the bad news is that it is now a 1 point scale. My 6 year old daughter came home with a report card with everything marked ‘C’. Which apparently now means that she is at the right stage of development for a six year old in all subjects (social, literacy, numeracy, art, PE). What a bland report.
My 9 year old son then returned home with an identical report card – everything a C.
I can tell you – my kids are very different. My son is one of those maths types – but hates art and writing. My daughter’s idea of a fun afternoon is getting the paint set out, or drawing all the characters from the latest book she’s read.
My daughter was really upset with her report – couldn’t understand why she didn’t get a good mark for art, and why her mark was the same as her brother’s.
Swapping notes with friends – it now turns out that just about everyone got C for everything at school. Because by definition – C now means development in-line with age group.
What a waste of time that report card is.
Perhaps the higher terror codes are reserved for really difficult elections. At the end of the day preparing for random acts of violence is pretty difficult. I tend to rely on statistics so that I can sleep at night. The flu is a more likely source of death.
Do we need a five point warning system for impending ecological disaster to take our minds off the islamofascists? Or should I just live in fear of pedestrian crossings?
Who benefits from all this fearfulness anyway?
I wouldn’t necessarily knock two-point scales: the binary is the elixir of modern life.
The real trouble occurs when the binary collapses into a de facto one-point scale, as Andrew comments re Victorian primary schools’ new report card “system”.
Another example here is Labor vs Liberal: they kid themselves (and a majority of the electorate, it would seem) that *they* are a 2, and the other is a 3 (out of five, where 1 is best), but they’re actually indistinguishable, IMO.
The real problem with these things is where someone steps out of line. I have been involved in a process where six people had to score candidates on a scale of 1-7 for 16 criteria. There were eight serious candidates and most of the markers gave them 6 or 7 for each criterion – reserving the low marks for some people who really weren’t in the race.
However, one marker had no problems throwing around 3s for a couple of candidates he didn’t like, and it resulted in a deeply skewed outcome.
Having looked at a few different rating scales for satisfaction with services, quality of life and so on, I’ve noticed that the mean (and median) score is almost always around 75% of the scale maximum.
This leads me to the theory that: ‘on average, people are three-quarters happy with everything’. Or put qualitatively ‘people are generally more positive than not, but always think they could do better.’
About performance appraisal in HR: the issue isn’t the scale but its application. Application is subjective and appraisers are not trained in applying the performance criteria. Further to that, the ratings may be adjusted to fit some external constraint. I recall a short period in a public service environment where performance bonuses were based on a 1-5 rating. The bonuses were set at a particular level according to rating. Unfortunately, the total amount of funding was also set, and at a level that didn’t allow for many good performers. The actual results were ‘adjusted’ downwards to fit the funding. Laugh or cry – your choice.
Not long ago I worked on a client satisfaction survey for a non-(english)-literate clientele. This is a different circumstance, but demonstrates the same sort of compression as individual performance rating scales. In previous years, a five-point scale had been used, but we opted for a three point scale (good, ok, bad – using images). It turned out that the results were much the same – most people were ok with the service. On the five-point scale, (5 great, 3 ok, 1 unspeakable) the results were (approximately)
1 = 1%
2 = 26%
3 = 52%
4 = 20%
5 = 1%
(BTW, this translates as 73% of clients were satisfied)
On the three-point scale the results were (approximately)
Bad = 23%
Ok = 51%
Good = 26%
(which translates as 77% of clients were satisfied)
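For anyone checking the arithmetic, a quick sketch: it assumes "satisfied" means a rating of ok or better (3 and up on the five-point scale, Ok or Good on the three-point one), which is the reading that reproduces the two figures quoted above. The function and variable names are just illustrative.

```python
def satisfied_share(results, satisfied_levels):
    """Add up the percentages for the levels counted as 'satisfied'."""
    return sum(pct for level, pct in results.items() if level in satisfied_levels)

# Approximate results as reported above (percent of respondents).
five_point = {1: 1, 2: 26, 3: 52, 4: 20, 5: 1}
three_point = {"Bad": 23, "Ok": 51, "Good": 26}

# Assumption: "satisfied" = ok or better on either scale.
five_satisfied = satisfied_share(five_point, {3, 4, 5})
three_satisfied = satisfied_share(three_point, {"Ok", "Good"})

print(five_satisfied, three_satisfied)  # 73 77
```

Note that the middle category alone takes about half the responses on both scales (52% and 51%).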
The results were in successive years, but the clientele is fairly stable and the timing probably had little impact on the result.
This fits in well with Stephen’s quote above – ‘on average people are 3/4 happy with everything’ and also, most people don’t go out of their way to be hurtful even if it isn’t all perfect.
How strange that you post this today and – bingo! – the first US RED alert is issued! Are you being watched, man…??? 🙂
How strange that this RED alert – and the major commotion in the UK – come at a time when the civilized world is aghast in horror at the seemingly unstoppable Israeli massacre of Lebanese civilians. How strange that Bush and Blair, already so desperately unpopular in their own countries, refuse to even call for a ceasefire! How bizarre that fellow politicians, public figures and the media are not far, far more critical of such an unprecedented, barbaric stance!
I mean, it is enough to drive a man to Conspiracy Theories, isn’t it?
In totally un-related news, Antony Loewenstein is not as stupid as Ted Lakin thinks he is (link).
And Jose Padilla, the so-called ‘dirty bomber’, was framed.
Normal programming will not be resuming anytime soon, folks… We are all in La La Land now.
Pr Q, You can kiss your two-point scales theory goodbye. Oh, I see you already have. Fast work.
Umm, there really are Islamo Fascists out there who want you (us) dead.
You say that wars like Iraq and Afghanistan are creating terrorists. But when those terrorists you refer to get busted trying to kill us, you say that those terrorists suddenly don’t exist. Curious.
You can’t have your cake and eat it too.
The Hawks’ militarist (regime changing) policies overseas are creating a terrorist recruiting pool in Islamic lands. But so are the Wets’ culturalist (diversity hustling) policies at home.
A plague on both their houses.