# Survivor (also at CT)

The question of disciplinary boundaries is a perennial one, and Brian Weatherson’s CT post on Richard Gott’s Copernican principle provides yet another instance. Gott, an astrophysicist, is interested in the question of whether you can infer the future duration of a process from its present age, and this issue seems to have received some discussion in philosophy journals.

It may be beneath the notice of these lofty souls, but statisticians and social scientists have actually spent a fair bit of time worrying about this question of survival analysis (also called duration analysis). For example, my labour economist colleagues at ANU were very interested in the question of how to infer the length of unemployment spells, based on observations of how long currently unemployed people had actually been unemployed. The same question arises in all sorts of contexts (crime and recidivism, working life of equipment, individual life expectancy and so on). Often, the data available is a set of incomplete durations, and you need to work out the implied survival pattern.
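To make the incomplete-durations problem concrete, here is a rough simulation sketch. It assumes (my assumptions for illustration, not anything estimated from real data) a constant entry rate into unemployment and exponentially distributed spell lengths; the function name and all the numbers are hypothetical. It surveys the spells in progress at one date and compares their elapsed durations with their eventual lengths.

```python
import random

def simulate_stock_sample(mean_duration=10.0, n_spells=200_000,
                          window=1000.0, survey_time=900.0, seed=0):
    """Return (mean elapsed, mean eventual) length of spells in progress at survey_time."""
    rng = random.Random(seed)
    elapsed, eventual = [], []
    for _ in range(n_spells):
        start = rng.uniform(0.0, window)                # constant entry rate (assumption)
        length = rng.expovariate(1.0 / mean_duration)   # exponential spell length (assumption)
        if start <= survey_time < start + length:       # spell is in progress at the survey
            elapsed.append(survey_time - start)
            eventual.append(length)
    return sum(elapsed) / len(elapsed), sum(eventual) / len(eventual)

mean_elapsed, mean_eventual = simulate_stock_sample()
print(f"mean elapsed duration at the survey: {mean_elapsed:.1f}")
print(f"mean eventual length of the sampled spells: {mean_eventual:.1f}")
```

Under these particular assumptions the mean elapsed duration should land near the true mean spell length (10 here), while the spells caught by the survey should eventually average about twice that, because long spells are more likely to be in progress on any given date. With a different duration distribution or a changing entry rate neither relationship holds, which is why the assumptions matter.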

Given a suitably large sample (for example, the set of observations of Broadway plays, claimed as a successful application of Gott’s principle) this is a tricky technical problem, and requires some assumptions about entry rates, but raises no fundamental logical difficulties. The problem is to find a distribution that fits the data reasonably well and estimate its parameters. I don’t imagine anyone doing serious work in this field would be much impressed by Gott’s apparent belief that imposing a uniform distribution for each observation is a good way to go.
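For reference, the uniform assumption can be written out in a few lines. This is a minimal sketch of the calculation as I understand it: the fraction of the total lifetime already elapsed is treated as uniform on (0, 1), and the central part of that range is converted into bounds on the remaining duration. The function name `gott_interval` is my own label.

```python
def gott_interval(age, confidence=0.95):
    """Bounds on future duration if the elapsed fraction of the lifetime is uniform on (0, 1)."""
    if age <= 0:
        raise ValueError("age must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Elapsed fraction r lies in ((1 - c) / 2, (1 + c) / 2) with probability c,
    # and the future duration is age * (1 - r) / r.
    shortest = age * (1.0 - confidence) / (1.0 + confidence)
    longest = age * (1.0 + confidence) / (1.0 - confidence)
    return shortest, longest

low, high = gott_interval(39.0)
print(f"future duration between {low:.1f} and {high:.1f}")
```

At the 95 per cent level the bounds are 1/39 of the present age and 39 times the present age, whatever the process is. Nothing in the calculation uses any information about the process beyond its age.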

Of course, social scientists tend not to like working with a sample size of one, so the Copernicans have a bit more room to move in unique cases. Still, if you are willing to assume a functional form for your probability distribution, and there’s only one free parameter, you can calculate a maximum likelihood estimate from one data point. The arbitrary choice you make determines the confidence interval.
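As a hedged illustration of how much the choice of functional form matters, the sketch below fits two one-parameter families to a single observation and reports the maximum likelihood estimate and a 95 per cent interval for the mean duration. Treating the lone observation as a draw from the duration distribution is a simplification, and the two families (exponential, and uniform on zero to some maximum) are my own picks for the example.

```python
import math

def exponential_fit(x, level=0.95):
    """MLE and interval for the mean duration, assuming x ~ Exponential(mean)."""
    alpha = 1.0 - level
    # x / mean follows a standard exponential, so invert its quantiles.
    lower = x / (-math.log(alpha / 2.0))
    upper = x / (-math.log(1.0 - alpha / 2.0))
    return x, (lower, upper)

def uniform_fit(x, level=0.95):
    """MLE and interval for the mean duration, assuming x ~ Uniform(0, theta)."""
    alpha = 1.0 - level
    # x / theta is uniform on (0, 1); the mean duration is theta / 2.
    lower = x / (2.0 * (1.0 - alpha / 2.0))
    upper = x / (2.0 * (alpha / 2.0))
    return x / 2.0, (lower, upper)

for name, fit in (("exponential", exponential_fit), ("uniform", uniform_fit)):
    estimate, (lower, upper) = fit(10.0)
    print(f"{name}: MLE mean {estimate:.1f}, interval {lower:.1f} to {upper:.1f}")
```

The same data point gives a different point estimate and a different interval under each family, and in both cases the interval spans well over an order of magnitude.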

In Bayesian terms, picking the ML estimator is (broadly speaking) the equivalent of assuming a diffuse prior. The big problem in the Copernican approach is this assumption, which is, in effect, that you have no relevant information at all, except for your single sample observation. If the problem is of any interest at all, this assumption is almost certain to be wrong. Take the example of the likely duration of the space program. We can, at the very least, observe that NASA and its competitors have missions scheduled for years ahead, which makes very short durations much less likely than a uniform distribution would imply (Brian’s examples also made this point).

The real lesson from Bayesian inference is that, with little or no sample data, even limited prior information will have a big influence on the posterior distribution. That is, if you are dealing with the kinds of cases Gott is talking about, you’re better off thinking about the problem than relying on an almost valueless statistical inference.
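A small grid calculation shows the effect. This is only a sketch under assumptions I have made up for the purpose: an exponential likelihood for a single observation, a diffuse prior proportional to 1/mean, and an "informed" prior that is identical except that it rules out mean durations below a hypothetical floor of 30. The grid limits and the observed value are likewise arbitrary.

```python
import math

def posterior_median(x, prior, grid):
    """Posterior median of the mean duration on a grid, with an exponential likelihood."""
    # Unnormalised posterior weight: prior(mu) * (1 / mu) * exp(-x / mu).
    weights = [prior(mu) * math.exp(-x / mu) / mu for mu in grid]
    total = sum(weights)
    running = 0.0
    for mu, weight in zip(grid, weights):
        running += weight
        if running >= total / 2.0:
            return mu
    return grid[-1]

def diffuse(mu):
    return 1.0 / mu                            # diffuse prior (assumption)

def informed(mu):
    return 1.0 / mu if mu >= 30.0 else 0.0     # hypothetical floor of 30 on the mean

grid = [float(mu) for mu in range(1, 2001)]    # candidate mean durations (arbitrary range)
observed = 10.0                                # the single data point (arbitrary)

median_diffuse = posterior_median(observed, diffuse, grid)
median_informed = posterior_median(observed, informed, grid)
print(f"posterior median, diffuse prior: {median_diffuse}")
print(f"posterior median, informed prior: {median_informed}")
```

With one observation, the posterior median moves by a large factor as soon as the prior excludes short durations. The data point itself does very little of the work.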

## 8 thoughts on “Survivor (also at CT)”

1. or, a pd of data is worth a ton of guesses?

2. Bill O'Slatter says:

Given a suitable law of large numbers for independent and identically distributed random events, e.g. the Central Limit Theorem, we can say that if we sample the population of interest, the mean of those samples will tend to the population mean at some probabilistic rate, and that the distribution of those sample means is normal. There are more modern laws of large numbers than the Central Limit Theorem. Since in this case the random event is the duration of an event, we have to sample the duration (we can’t infer the duration since we don’t have the distribution of durations).

3. or, a kg of data is worth a tonne of guesses, wrapped in technobabble.

4. Success in logging in, even using my home equipment (though the idiot blog software still doesn’t have a no-images alternate text).

This reminds me of much that I came across doing my maths BA, particularly estimators. It’s the sort of stuff Cheney was fumbling with when he was talking about known and unknown unknowns. For instance, if you get independent proofreaders to check out a book – or to test software, come to that – and then you classify errors according to whether they were discovered by 1, 2, …, n, … tests, you can get a distribution and read off (well, not quite read off) the number of errors found zero times, i.e. not found at all. With the right tests you can also estimate how material that is, too, and determine what to do next – known unknowns. I have somewhere heard that someone once had the bright idea of applying this to crime statistics, and the results were hastily suppressed when it showed just how many people were probably literally getting away with murder, not so much that the crimes weren’t being solved as not even being observed.

Here’s an example from the Second World War of just how beautiful and intriguing this area can be. Bombers were being damaged, and it was decided to armour them. The catch was that full armour was too heavy, so the designers had to choose what to armour. Someone had a brainwave; he superimposed layouts of damage from many returning bombers, to see where they had been hit. Did they armour the damaged areas? No! The inference they drew was that they were only sampling bombers that made it back, that the damage they observed was survivable and that the areas that weren’t damaged were really being damaged in the inferred but not directly observed sample. So they armoured the parts that appeared not to get damaged. The whole approach is a little like working out the shape of a missing dartboard from the pattern of holes in the wall where it used to be.

There’s more, that you can use to see through dirty economic rationalists. Did you know that statistics prove that average wages of hand weavers actually rose all through the nineteenth century? Does this mean that we’ve been listening to propaganda about the plight of the weavers after machine weaving came in? It does no such thing. It’s survivor bias, because the average is taken among current weavers and does not factor in what happens to those who suffer. It’s not usually as obvious as shooting the sick to improve health, but it gets close (like having early releases from hospitals to improve their success). When economic rationalists tell you that (say) New Zealand farmers are better off because of reforms, well, now you know better than to believe them (or to disbelieve automatically – even if the reasoning is wrong the results might still be right on the stopped clock principle, so you still have to sort things out and assess them). Mind you, the people spouting this nonsense do believe it; they just kept asking until some answers pleased them, then they trotted them out.

In fine, statistics and quantum physics have a lot in common: common sense does not work here (did your common sense tell you to armour the parts of the bombers that were damaged?).

5. Bill O'Slatter says:

That was Rumsfeld (not knowing the unknown unknowns). The random variables in the case of bombers would be some proxy of damage per unit area on the bomber, e.g. size and number of holes per unit area, or total area of holes per unit area. This variable would be right censored, as you would not see above a certain level of damage (the plane crashed). As you state, the point at which the random variable is right censored would be the estimator of interest.

6. Bill O'Slatter says:

Addendum: you would have to list the assumptions in the above, i.e. going from the statistic to the optimal armouring of the plane. The first step would be to check the likely Poisson distribution of the damage in the unit areas, which are hypothesized to be independent and identically distributed.

7. mugwump says:

> In Bayesian terms, picking the ML estimator is (broadly speaking) the equivalent of assuming a diffuse prior.

Not really. For any reasonable likelihood function a single sample and diffuse prior will yield a diffuse posterior (as does overeating and lack of exercise (boom boom)).

Picking the ML estimator is more like summarizing the posterior with a delta-function – reasonable from a Bayesian viewpoint if you have a large sample and a unimodal posterior, but otherwise there’s no real Bayesian interpretation.

8. I suspect the bomber armour problem used an even less formal approach, because they started from yet another constraint: they had so much weight of armour to add. They would have chosen the total area according to a priority based on least measured damage, and not bothered to work through the details of cutoffs from estimators at all.