Magic Town


In the 1947 film Magic Town, a former basketball player starts his own market research firm. New to the business, he doesn’t have the means to compete with his larger, more established rivals, so he’s intrigued to learn of a survey carried out in a small town called Grandview whose results exactly match those of a national survey carried out by one of his wealthier rivals.

In our work at ASI, we are often asked questions along the lines of “what does our typical customer look like?”. We believe that questions like this over-simplify the diversity which can exist in a customer base. Thinking of England as a whole, if you tried to sum up the average person in England, they would be about 42 years old, white and female, because white people are the largest ethnic group and there are slightly more women than men. But of course, this in no way captures the diversity found within England.

Perhaps a better way to think about this problem is to imagine walking down the street and observing that only 1 in every 10 people you passed was male, or that everybody was over the age of 70. Clearly, you would not be in a typical area. In this piece of work, we’ve set out to find a small area of England which reflects the diversity of the country as well as possible. Such an area would be a microcosm of English life, somewhat like the magic town of Grandview.

In terms of how large an area to find, we wanted one small enough that you could feasibly walk around it in a few hours, but large enough to deliver robust results. The smallest statistical unit reported by the Office for National Statistics is an “Output Area” (or OA), which consists of about 4 postcodes. In the end we decided to go for the second smallest type of area, the Lower Super Output Area or LSOA, in order to pick up less noise. There are 32,844 LSOAs in England, each of which comprises on average 44 postcodes. The median population of an LSOA in England is 1,564.

A key concept in our work is the difference between the mean and the median. For example, in the 2011 UK census, 81% of the population identified as White British or Irish. However, because ethnic minorities tend to be concentrated in urban areas, a typical LSOA would not have an 81% white British/Irish population. While there is some subjectivity in how you define “typical”, for the example of white British/Irish we look for the fraction of white British/Irish residents such that half of the people in England live in areas with a higher concentration and half in areas with a lower one. We refer to this number as the half-population median. For census variables with skewed distributions, it can differ substantially from the mean. Here it is 91.3%: if you live in an area in which 91.3% of the population identified as white British/Irish in the 2011 census, half of the population of England lives in areas with a higher fraction and half in areas with a lower one.
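To make the definition concrete, here is a minimal sketch of a half-population median computed as a population-weighted median. The function name and the five toy areas are invented for illustration; they are not the census data or necessarily the exact procedure used in this work.

```python
def half_population_median(values, populations):
    """Return the area-level value v such that roughly half of all people
    live in areas with a value at or below v (a population-weighted median)."""
    if not values or len(values) != len(populations):
        raise ValueError("values and populations must be non-empty and equal length")
    half = sum(populations) / 2
    running = 0
    # Walk through the areas from lowest to highest value, counting residents.
    for value, pop in sorted(zip(values, populations)):
        running += pop
        if running >= half:
            return value


# Hypothetical toy data: five areas, two of them diverse urban areas.
white_fraction = [0.98, 0.95, 0.91, 0.60, 0.30]
population = [1500, 1600, 1400, 1700, 1800]

hpm = half_population_median(white_fraction, population)
mean = sum(v * p for v, p in zip(white_fraction, population)) / sum(population)

print("half-population median:", hpm)
print("population-weighted mean:", round(mean, 3))
```

In this toy example the two diverse areas pull the mean well below the half-population median, which is the same kind of gap as the one between 81% and 91.3% described above.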

We then set out to calculate the half-population median for a number of census variables, specifically: age distributions (i.e. the fractions of people aged 0-4, 5-9 etc.), ethnicity, educational attainment, property ownership, marital status and types of job. All of these are documented in the census. For a variable such as household income (clearly very important when looking for typical areas), no reliable survey tells us what the average household income is in each of the 32,844 LSOAs. The most detailed ONS surveys give estimates of household income for each parliamentary constituency in the UK and for each Local Authority. Both are much larger than LSOAs, so there’s no failsafe way of getting at average incomes for LSOAs. Instead, by examining how differences in demographics between local authorities relate to differences in income, we learn how income is related to census variables and use this knowledge to model the likely average household income for each LSOA. We can perform a similar procedure for other variables which are reported at Local Authority or parliamentary constituency level, such as house prices, Euroscepticism (i.e. the result of the referendum) and political partisanship (the result of the last general election).
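The modelling step can be sketched as follows, under heavy assumptions: the relationship is learned with ordinary least squares on a single census variable at Local Authority level and then applied to an LSOA's own census value. The choice of model, the predictor (fraction of residents with a degree) and all numbers below are hypothetical; the post does not specify the model that was actually used, which presumably drew on many census variables.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical Local Authority data: a census variable and surveyed income.
la_degree_fraction = [0.20, 0.30, 0.40, 0.50]
la_household_income = [30000, 35000, 40000, 45000]

intercept, slope = fit_line(la_degree_fraction, la_household_income)

# Apply the Local Authority relationship to one LSOA's own census value.
lsoa_degree_fraction = 0.25
lsoa_income_estimate = intercept + slope * lsoa_degree_fraction

print("estimated LSOA household income:", round(lsoa_income_estimate))
```

A caveat with any approach of this kind is that a relationship observed between large areas need not hold exactly for the small areas inside them, so the LSOA figures are modelled estimates rather than measurements.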

For each of the variables described, we calculate the half-population median and then, for every LSOA in England, determine how far it lies from that median. Finally, for each LSOA we combine its distances from the half-population median (HPM) across all variables into a single number which, in effect, tells you “how median” it is across the board. There is a touch of subjectivity in how you perform this final step. We chose to penalise an area which scored extremely well on most metrics but very badly on a few more heavily than an area which didn’t score extremely well on any metric but, crucially, did not score particularly badly on any either (for the more mathematically inclined, we worked with something more akin to the average of squares than the linear average).
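The effect of averaging squares rather than plain distances can be seen in a small sketch. It assumes each distance is first divided by a per-variable scale so that variables are comparable; that scaling, the variable names and every number here are assumptions made for illustration, not the scoring actually used.

```python
def scaled_distances(area, hpm, scale):
    """Distance of an area from each variable's HPM, in units of that variable's scale."""
    return [(area[name] - hpm[name]) / scale[name] for name in hpm]


def mean_absolute(distances):
    """Linear average of the distances."""
    return sum(abs(d) for d in distances) / len(distances)


def mean_square(distances):
    """Average of squared distances; large misses are penalised disproportionately."""
    return sum(d * d for d in distances) / len(distances)


# Hypothetical half-population medians and typical spreads, in percent.
hpm = {"white_british_irish": 91.0, "owner_occupied": 64.0, "aged_65_plus": 16.0}
scale = {"white_british_irish": 10.0, "owner_occupied": 8.0, "aged_65_plus": 4.0}

# Area A is spot on for two variables but far off on the third.
area_a = {"white_british_irish": 91.0, "owner_occupied": 64.0, "aged_65_plus": 28.0}
# Area B is modestly off on every variable.
area_b = {"white_british_irish": 81.0, "owner_occupied": 72.0, "aged_65_plus": 20.0}

dist_a = scaled_distances(area_a, hpm, scale)
dist_b = scaled_distances(area_b, hpm, scale)

print("linear average:", mean_absolute(dist_a), mean_absolute(dist_b))
print("average of squares:", mean_square(dist_a), mean_square(dist_b))
```

With these toy numbers the linear average cannot separate the two areas, while the average of squares scores area B as the more typical one (lower is better), which matches the preference described above.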

From our research, the top 5 median areas are:

Wessex Road, Didcot, Oxfordshire

Winslow Ave, Droitwich Spa, Worcestershire

Bath Road, Worcester

Southwick

East Leake, Nottinghamshire

Cover photo by Mark Chatterley (CC BY 2.0)
