Reinventing the wheel in social network theory
I was thinking idly about Erdős numbers, and it suddenly struck me that I could easily prove the necessity of a couple of ‘stylised facts’[1] about the associated networks. It’s well known that the collaboration network for mathematicians contains one big component, traditionally derived by starting with Paul Erdős. The same is true of the network generated by sexual relationships. Although there is no generally agreed starting point here, it is a sobering thought that a relatively short chain would almost certainly connect most of us with both George Bush and Saddam Hussein.
Anyway, the thought struck me that, given a simple two-parameter model, I could prove (at least in a probabilistic sense) not only the existence of a large component but its uniqueness. One parameter would characterise the distribution of the number of connections made by each person, and the other would characterise the bias in favor of endogamy or exogamy. Provided, in an appropriate sense, that these parameters multiplied to a number greater than 1 for some large segment of the population, a network with a starting point in that segment would expand until it contained a substantial portion of the whole population.
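For anyone who wants to see the threshold at work, here is a minimal simulation sketch. It is a deliberate simplification of the model described above, not the model itself: it drops the endogamy/exogamy parameter entirely and assumes every pair of people is linked independently with the same probability, so the only parameter left is the average number of connections per person (`mean_degree`). The function name, the parameter names and the seed are all my own illustrative choices.

```python
import random
from collections import Counter


def component_sizes(n, mean_degree, seed=0):
    """Return component sizes, largest first, for a simple random network.

    Assumption: each of the n*(n-1)/2 possible pairs is linked
    independently with probability mean_degree / n. There is no
    endogamy or exogamy bias in this simplified version.
    """
    rng = random.Random(seed)
    p = mean_degree / n

    # Union-find: parent[x] points towards the representative of x's component.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj  # merge the two components

    sizes = Counter(find(i) for i in range(n))
    return sorted(sizes.values(), reverse=True)


if __name__ == "__main__":
    for c in (0.5, 3.0):
        sizes = component_sizes(1000, c)
        print(f"mean degree {c}: two largest components {sizes[:2]}")
```

With an average of well under one connection per person the network should stay in small fragments, while an average well above one should produce a single component holding most of the population, with the runner-up far behind.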
It’s easy to see, then, that there can’t be two large components (where large means, say, more than 100 members and more than 10 per cent of the relevant population): two such components would have more than 100 × 100 = 10 000 possible connections between them, and the probability that at least one of these will be made approaches 1.
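The arithmetic behind the uniqueness argument is easy to check. The sketch below assumes, purely for illustration, that each possible cross-component connection is made independently with the same small probability `p`; both that independence assumption and the particular values of `p` are mine, not part of the argument above.

```python
def prob_components_linked(size_a, size_b, p):
    """Probability that at least one link joins two components.

    Assumption: each of the size_a * size_b possible cross links
    is made independently with probability p.
    """
    pairs = size_a * size_b
    return 1.0 - (1.0 - p) ** pairs


if __name__ == "__main__":
    # Two components of 100 members each give 10 000 possible links.
    for p in (0.0001, 0.001, 0.01):
        print(p, prob_components_linked(100, 100, p))
```

Even when any single connection is very unlikely, the number of possible connections grows with the product of the two component sizes, so the chance that the components stay separate shrinks quickly as they grow.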
I’m recording this not because I think it’s a new discovery, but to raise a general point about research strategy in theoretical problems. The recommended strategy in most fields is to acquaint yourself thoroughly with the literature, then work out what new contribution you might be able to make. My preferred strategy is to begin with only a cursory knowledge of the field in question, work out how I would answer a question of interest and only then consult the literature.
The disadvantage of this approach is that you spend a lot of time reinventing wheels, since most questions of interest have already been answered in one way or another. The advantages, though, are substantial. First, it’s easier to understand something you’ve worked out for yourself than something you’ve read by somebody else. Second, in most research topics, the literature bears the marks of its history. What this means is that the substantive theoretical insights are inextricably mixed with accidental effects. If Professor A, the author of the first big paper in the field, thought that axiom X was crucial and axiom Y was uncontroversial, it’s likely that axiom X will continue to get a lot of attention, whether or not it’s justified, and that anyone who questions axiom Y will be regarded as ill-informed. If you come to the problem afresh, you may see it differently (not necessarily a good thing if you want to publish lots of articles in journals edited by the students of Professor A, but if you already have tenure this isn’t such a problem).
fn1. This is economist jargon for things we think are true but for which we have no solid evidence.
Anyway, I’d welcome anyone pointing me to where my results have been anticipated, as well as any thoughts on research strategy.