Walkability via Census

August 18th, 2014
geography, maps, transit
If you want to know how walkable various parts of the country are, and you only have (1) the census and (2) walk scores for DC, what could you do? Well, you could look for census variables that correlate with walk scores in DC and then extrapolate these correlations to the rest of the country to get a national map:

This is pretty cool! But let's have a look at these correlations:

% 16 and older employed in civilian labor force 0.636261
% 25-34 years old 0.630423
% females 16 and older employed in civilian labor force 0.594394
% 16 and older in civilian labor force 0.578812
Nonrelatives in household 0.561472
% 16 and older in labor force 0.545052
% 16 and older commuting to work by other means 0.542523
% females 16 and older in civilian labor force 0.532800
% 18 years and younger 0.530061
% born in state of residence -0.529451
houses built 1939 or earlier 0.528639
workers 16 and older commuting to work by other means 0.527117
workers 16 and older walking to work 0.525601
% females 16 and older in labor force 0.522900
% with at least a bachelors degree 0.522021
% 10-14 years old -0.519463
% 16 and older driving to work alone -0.507589
population 25-34 years old 0.507211
% 16 and older not in labor force -0.502902
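As a rough illustration of how a ranking like this could be produced, here is a minimal sketch that computes a Pearson correlation between each census variable and the walk score, then sorts by absolute strength. The tract values and the variable names in `census` are made up for the example; the original analysis may well have been done differently.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(columns, target):
    """Return (name, r) pairs sorted from strongest to weakest |r|."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data: five imaginary tracts, NOT real census or walk score values.
walk_score = [90, 75, 60, 40, 20]
census = {
    "% walking to work": [30, 22, 15, 8, 3],
    "% driving to work alone": [20, 35, 50, 65, 80],
    "median household size": [2.1, 2.6, 2.2, 2.5, 2.3],
}

ranking = rank_by_correlation(census, walk_score)
for name, r in ranking:
    print(f"{name} {r:.6f}")
```

Sorting by absolute value matters here: a strongly negative correlation, like driving alone, is just as informative as a strongly positive one.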
The first thing that jumps out is that I really wish they had run a round of principal component analysis (PCA) to cut down the number of variables. Many of these are probably just multiple ways of saying the same thing:
  • employment
    • % 16 and older in labor force
    • % 16 and older in civilian labor force
    • % 16 and older employed in civilian labor force
    • % 16 and older not in labor force
    • % females 16 and older in labor force
    • % females 16 and older in civilian labor force
    • % females 16 and older employed in civilian labor force
  • population (raw counts that add little once you've considered the %-version of the same metric)
    • population 25-34 years old
    • workers 16 and older commuting to work by other means
  • age
    • % 10-14 years old
    • % 18 years and younger
    • % 25-34 years old
    • population 25-34 years old
  • commuting
    • % 16 and older driving to work alone
    • workers 16 and older walking to work
    • % 16 and older commuting to work by other means
    • workers 16 and older commuting to work by other means
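To make the redundancy argument concrete, here is a small sketch of what PCA would show on a group of near-duplicate variables. It uses synthetic data (four noisy copies of one hidden quantity, one of them negated, loosely like "not in labor force") and finds the first principal component of the correlation matrix by power iteration. All the data and names here are assumptions for illustration, not the real census columns.

```python
import random
from math import sqrt


def standardize(column):
    """Rescale a column to mean 0 and (population) standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((x - mean) ** 2 for x in column) / n)
    return [(x - mean) / sd for x in column]


def correlation_matrix(columns):
    """Pairwise Pearson correlations between the given columns."""
    z = [standardize(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_component(matrix, iterations=200):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    k = len(matrix)
    v = [1 / sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    return eigenvalue, v


# Synthetic stand-in for the employment variables: one hidden quantity
# measured four slightly different ways (the last one with the sign flipped).
rng = random.Random(0)
hidden = [rng.gauss(0, 1) for _ in range(200)]


def noisy_copy(sign=1.0):
    return [sign * x + rng.gauss(0, 0.2) for x in hidden]


columns = [noisy_copy(), noisy_copy(), noisy_copy(), noisy_copy(-1.0)]

matrix = correlation_matrix(columns)
eigenvalue, loadings = first_component(matrix)

# The trace of a correlation matrix equals the number of variables, so this
# ratio is the share of total variance captured by the first component.
share = eigenvalue / len(matrix)
print(f"share of variance in first component: {share:.3f}")
```

When one component captures nearly all the variance like this, the remaining variables in the group are telling you almost nothing new, which is the situation I suspect holds for the categories above.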
What I'm saying is that if you ran PCA on these variables, I strongly suspect you'd find that once you had new variables representing employment, population, age, and commuting, the remaining related variables would provide very little additional information. But I don't have easy access to the raw data, so I'll be lazy and approximate this by keeping only the variable in each category with the strongest correlation. (We would expect the new super-variables from PCA to correlate a bit better than these, but the overall picture would be about the same.) That leaves:
% 16 and older employed in civilian labor force 0.636261
% 25-34 years old 0.630423
Nonrelatives in household 0.561472
% born in state of residence -0.529451
houses built 1939 or earlier 0.528639
workers 16 and older walking to work 0.525601
% with at least a bachelors degree 0.522021
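The "keep the strongest variable per category" shortcut can be sketched mechanically using the correlations listed earlier. The keyword rules in `KEYWORDS` are my own assumption about how to assign variables to categories; raw-count variables are folded into the age and commuting groups rather than kept as a separate population category, and anything unmatched is treated as its own category.

```python
# Correlations with DC walk scores, copied from the list in the post.
CORRELATIONS = {
    "% 16 and older employed in civilian labor force": 0.636261,
    "% 25-34 years old": 0.630423,
    "% females 16 and older employed in civilian labor force": 0.594394,
    "% 16 and older in civilian labor force": 0.578812,
    "Nonrelatives in household": 0.561472,
    "% 16 and older in labor force": 0.545052,
    "% 16 and older commuting to work by other means": 0.542523,
    "% females 16 and older in civilian labor force": 0.532800,
    "% 18 years and younger": 0.530061,
    "% born in state of residence": -0.529451,
    "houses built 1939 or earlier": 0.528639,
    "workers 16 and older commuting to work by other means": 0.527117,
    "workers 16 and older walking to work": 0.525601,
    "% females 16 and older in labor force": 0.522900,
    "% with at least a bachelors degree": 0.522021,
    "% 10-14 years old": -0.519463,
    "% 16 and older driving to work alone": -0.507589,
    "population 25-34 years old": 0.507211,
    "% 16 and older not in labor force": -0.502902,
}

# Assumed keyword rules for grouping; not part of the original analysis.
KEYWORDS = {
    "employment": ("labor force",),
    "commuting": ("commuting", "walking", "driving"),
    "age": ("years old", "years and younger"),
}


def categorize(name):
    """Return the category a variable belongs to, or None if ungrouped."""
    for category, words in KEYWORDS.items():
        if any(word in name for word in words):
            return category
    return None


def strongest_per_category(correlations):
    """Keep the variable with the largest |r| in each category."""
    kept = {}
    for name, r in correlations.items():
        key = categorize(name) or name  # ungrouped variables stand alone
        if key not in kept or abs(r) > abs(kept[key][1]):
            kept[key] = (name, r)
    return kept


kept = strongest_per_category(CORRELATIONS)
for name, r in sorted(kept.values(), key=lambda item: -abs(item[1])):
    print(f"{name} {r:.6f}")
```

Run as written, this keeps seven variables, six of which match the list above. For commuting, the strict rule selects "% 16 and older commuting to work by other means" (0.542523) rather than "workers 16 and older walking to work" (0.525601), which is the one kept above.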
Some of these variables, like people walking to work or whether houses were built before cars became popular, do seem very likely to correlate with walkability wherever you are in the country. On the other hand, most of the other variables instead look to me like they're measuring "who in DC lives in walkable areas". We see a bunch of employed, college-educated young people from out of state living together. I can totally believe that the more someone fits that description, the more likely they are to value walkability in choosing where to live, and the more likely they are to be able to afford it. But if we're trying to identify walkable areas in the rest of the country, this is going to miss out on a lot of other ways a place can be walkable.

This actually surprises the author somewhat; after taking affordability into account and seeing which places had the best combination of walkability and affordability by their metric, they wrote:

"I was expecting something like a smaller, affordable Midwest town or something, but it the highest scoring areas were usually just outside of major downtown"
It looks to me like what's going on is that the kind of people who want to live in the walkable parts of DC are not the kind of people who want to live in smaller Midwest towns, even if those towns are super walkable.

(On the other hand, this might actually be a better metric for the author's purposes. They're really trying to figure out where they would enjoy living, and they think they want walkable neighborhoods. But having a lot of people 25-34 and a lot of strangers living together, along with high employment and education, are probably also things they care about. A retirement community in rural Tennessee might be very walkable, but it's probably not what they had in mind.)
