Data is not the plural of datum
In a recent comments thread, Dave Ricardo quotes my observation
‘Unfortunately the data is only published annually’
and asks
‘Don’t you mean, the data are published only annually?’
This correction rests on a common error, reflecting a confusion between Latin and English.
In Latin, data is the plural of datum (‘something given’). The word ‘datum’ is used in English, but is an archaism, except for a specialised use in surveying. On standard principles of modern English usage, the correct plural of datum is ‘datums’, and a Google search reveals 158 000 occurrences of this term. (For comparison, ‘data’ occurs over 100 million times.)
In English, ‘data’ is a mass noun like water or wheat. Hence it can be used in compounds like “data base”, “data processing” and so on, which would be ungrammatical for a plural like ‘datums’. This simply reflects our everyday experience with data, which is that it is a quantity of information, not a collection of facts. For example, it would be natural to refer to “500Kb of data”, but wrong to refer to “500 data”.
Data is normally dealt with as a mass, but it is sometimes important to refer to discrete units, in which case it is appropriate to use the count nouns ‘data point’ or ‘observation’ (drops of water and grains of wheat provide an analogy for other mass nouns). A collection of data points can be referred to as a ‘data set’.
Although lots of people imagine that ‘data’ is a plural count noun, and some try to treat it that way, hardly any do so consistently. To give just one example, I looked for occurrences of the phrases ‘not much data’ (correct for a mass noun) and ‘not many data’ (correct for a count noun) using Google. There were 10 times as many occurrences of ‘not much data’. Moreover, a large proportion of the ‘not many data’ observations were either written by non-native speakers of English or formed part of grammatically correct phrases such as ‘not many data sets’.
Update: There is nothing new under the sun. Kevin Drum at Calpundit blogged on the same topic a few months ago, reaching the same conclusion.