{"items": [{"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811292211782", "anchor": "fb-811292211782", "service": "fb", "text": "Brandon Martin-Anderson Ira Winder", "timestamp": "1473796162"}, {"author": "Danner", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811292211782&reply_comment_id=811298194792", "anchor": "fb-811292211782_811298194792", "service": "fb", "text": "&rarr;&nbsp;\"Brendan Martin-Anderson\" in your article, Jeff.", "timestamp": "1473799104"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811292211782&reply_comment_id=811299097982", "anchor": "fb-811292211782_811299097982", "service": "fb", "text": "&rarr;&nbsp;Huh; I got that spelling from https://github.com/meetar/dotmap<br><br>Will fix", "timestamp": "1473799268"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811292211782&reply_comment_id=811299362452", "anchor": "fb-811292211782_811299362452", "service": "fb", "text": "&rarr;&nbsp;Fixed", "timestamp": "1473799334"}, {"author": "Michael", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811293329542", "anchor": "fb-811293329542", "service": "fb", "text": "(This comment probably just comes down to 'situations should not be described with just one number')<br><br>The difference between a dense city in which all the land is densely occupied and a \"dense\" city in which a little bit of the land is densely occupied is that you can easily accommodate a lot more people in the latter without changing the median person's experience much.", "timestamp": "1473796903"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811293329542&reply_comment_id=811293648902", "anchor": "fb-811293329542_811293648902", "service": "fb", "text": "&rarr;&nbsp;Yes, I 
think that if you're interested in land usage the traditional statistic is better.", "timestamp": "1473797183"}, {"author": "Ben", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811293613972", "anchor": "fb-811293613972", "service": "fb", "text": "This is fascinating. I would love to see the differences between dense, northeastern cities and more spread-out midwestern and southern cities and see whether the data show anything different than standard density measures.<br><br>Out of curiosity, is it more computationally difficult to do states with more people, more area, or more census blocks?", "timestamp": "1473797141"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811293613972&reply_comment_id=811298394392", "anchor": "fb-811293613972_811298394392", "service": "fb", "text": "&rarr;&nbsp;It's pretty much linear in everything, with people being the most expensive factor. More area per person makes it cheaper since it can ignore more people.", "timestamp": "1473799130"}, {"author": "Richard", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811298564052", "anchor": "fb-811298564052", "service": "fb", "text": "The census defines a similar metric they call population-weighted density, which is defined as the weighted average of each census tract's density, where each census tract is weighted by its population. In other words, it's what you'd get if you asked each person to report their personal census tract's density, and then averaged those numbers. This article* (http://www.citylab.com/.../americas-truly-densest.../3450/#) discusses some of the findings a bit.<br><br>It does seem a bit different from your metric of anthropic density, but I'm not sure which one is better. In big cities, census tracts are usually a bit under a quarter-mile in radius, so that seems about equal in both. 
The Census' metric may be inaccurate for people who live near the border of two census tracts with significant differences in density, as they'll be lumped in with people on the far edge of their own tract but not people in another tract who may be much closer in absolute distance. On the other hand, sometimes census tracts carry important information that is lost when you look at absolute distance; perhaps a highway, river, or canyon, or just neighborhood culture, makes it such that places nearby by absolute distance are less relevant than those farther away but still part of your neighborhood. Similarly, the census tract method probably works better for those next to a body of water: someone in a skyscraper on the waterfront will be counted by the absolute distance metric to have about half their quarter-mile circle empty, but for most practical purposes I'm interested in the average population per square mile of the landmass, not the land+sea radius around me. Anthropic density would undercount those cases by about half.  (You could solve this by dividing by landmass within a quarter-mile of each person, but I don't think you did that and it sounds challenging to find that data and to process it for each person). 
Seems like there are pros and cons to each.<br><br>The other issue may be ease of computation; this method took 3 days for MA, while the census method seems like a very straightforward weighted-average computation and besides, it's already been computed: there's a download link (xls) for 366 metro areas at http://www.census.gov/.../CBSA%20Report%20Chapter%203...<br><br>* Note that the article mis-defines the metric, stating that it's the average of the census tracts but not mentioning the weighting: the definition box on page 23 of the original report (http://www.census.gov/prod/cen2010/reports/c2010sr-01.pdf) makes it clear that it's population-weighted, as it should be.", "timestamp": "1473799160"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811298564052&reply_comment_id=811321842402", "anchor": "fb-811298564052_811321842402", "service": "fb", "text": "&rarr;&nbsp;* Census tracts are too big for what I want: small towns are often only one tract, for example, so the difference between Chester MA and a similar town with no town center wouldn't be visible<br><br>* A skyscraper next to the waterfront has fewer people near it than an equivalent skyscraper with city on all sides, so I think describing it as less dense fits what I'm going for. See the discussion of Nahant, Hull, and Winthrop in the post.<br><br>* It took three days because I stopped optimizing once I got to something fast enough for my purposes. It could be made much faster with a better implementation, and I may rewrite it in C. Furthermore, this computation time is trivial compared to the time it takes to actually compile the census data.", "timestamp": "1473807092"}, {"author": "Richard", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811298564052&reply_comment_id=811353798362", "anchor": "fb-811298564052_811353798362", "service": "fb", "text": "&rarr;&nbsp;* Makes sense, yeah. 
If census tracts are too big then they don't distinguish between those two cases.<br><br>* Seems like the kind of density metric you want really depends on what you want to know. For something like \"how many people live within walking distance of my store/train station/whatever?\", then yeah, any water should count as a zero and bring down the density. But for questions like \"how tall are the buildings?\", \"how likely am I to run into another person?\", etc. then the water means that  yes, there's half as many people within a certain radius, but there's also half as much land to spread them out on so you probably want to only divide by the land area. And it does get a bit wonky with really thin, curvy peninsulas no matter what.", "timestamp": "1473821820"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811298564052&reply_comment_id=811396991802", "anchor": "fb-811298564052_811396991802", "service": "fb", "text": "&rarr;&nbsp;I think for \"how tall are the buildings\" you want traditional density, since you care about land usage. 
One tall building surrounded by short houses still only counts as one.<br><br>For \"how likely am I to run into someone\", anthropic density with its ignoring water seems right: if instead of waterfront there were more skyscrapers then there would be even more people to run into.", "timestamp": "1473852089"}, {"author": "Michael", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811298564052&reply_comment_id=811408548642", "anchor": "fb-811298564052_811408548642", "service": "fb", "text": "&rarr;&nbsp;Jeff&nbsp;Kaufman --but if you eliminate all gathering places such as waterfront, replacing them with residential buildings, you get something like high-density suburbs, or high density urban housing projects, and don't \"run into people\" because there's no interactive space (not counting mass transit, where you are aware of people but can't interact with more than a couple of them).<br><br>Mind you, few areas are developed to this extreme.  I'm just suggesting that your curve should do something funny at the upper end, when looking at \"number of people you interact with\" as one of the axes.", "timestamp": "1473857689"}, {"author": "Jameson", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811298564052&reply_comment_id=811411512702", "anchor": "fb-811298564052_811411512702", "service": "fb", "text": "&rarr;&nbsp;Michael, streets would still exist in that scenario, and stores and pubs, which are where people tend to run into each other.", "timestamp": "1473859672"}, {"author": "Michael", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811298564052&reply_comment_id=811475489492", "anchor": "fb-811298564052_811475489492", "service": "fb", "text": "&rarr;&nbsp;Jameson I'm interpreting Jeff's \"Waterfront\" as park, and thinking of it as gathering and lingering area, where more interactions occur than on streets and sidewalks.  
Your stores and pubs are absolutely legit, though I *feel* like I run into more people shopping in Harvard Square than I do in the Prudential Center.", "timestamp": "1473885119"}, {"author": "Richard", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811298564052&reply_comment_id=811485913602", "anchor": "fb-811298564052_811485913602", "service": "fb", "text": "&rarr;&nbsp;Jeff When I say \"how likely am I to run into someone\" I'm imagining something like this<br>* I'm in area A, a few blocks wide, with 1000 people per block. Directly to my east is a body of water. I'm imagining something like the San Francisco Ferry Building: right next to skyscrapers on the land side, but about half the area in its radius is water. There are, let's say, five thousand people within a two-block radius of me, but they're not diffuse among the whole circle; they're all in the west half of it. I can't just walk onto the water to get away from people if I wanted, either; the only part that's really relevant to me density-wise is the part I can walk on. When I walk in the street, I experience that 1000 person-per-block density; if 1% of the people around are on the street at a given time, I see 10 people each block I walk. By Anthropic density, this has 500 people per block, since half the circle is empty.<br>Compare to<br>* Same as above, I'm at the SF ferry building at the edge of area A, but due to a massive terraforming project, we've built land and skyscrapers to the east at the same 1000 person/block density, a new region called Area B. (The Ferry building has discontinued service due to now being landlocked). There are twice as many people within a two-block radius of me, now, because the east side is all filled up too, but they're also spread over twice as much land. Now I've got 1000 person/block territory on all sides, and every metric agrees that the density is 1000 people per block. 
When I'm walking on the street, I'll still see 10 people per block. There might be some people who come to Area A from Area B, but there should be roughly as many who come to Area B from Area A, so area A should have the same density in both situations and the same frequency of running into people.<br><br>I think for most purposes, I experience density as people divided by traversable land area and not people divided by indiscriminate surface area. In terms of noise levels or having areas to look out to, the anthropic density calculation makes a lot of sense, but for land use (as you mentioned), frequency with which I run into people, traffic levels, building height, etc, I find dividing by land area to be preferable.", "timestamp": "1473889207"}, {"author": "Jacob", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811312546032", "anchor": "fb-811312546032", "service": "fb", "text": "I went to look up my hometown, and it and most of its neighbors didn't show up in your sheet. Would that be because they're too small or something? 
(https://en.wikipedia.org/wiki/Stow,_Massachusetts#Geography )", "timestamp": "1473803895"}, {"author": "Mike", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811312546032&reply_comment_id=811322406272", "anchor": "fb-811312546032_811322406272", "service": "fb", "text": "&rarr;&nbsp;Look under \"more apples than people\"", "timestamp": "1473807459"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811312546032&reply_comment_id=811323838402", "anchor": "fb-811312546032_811323838402", "service": "fb", "text": "&rarr;&nbsp;Towns like Stow are doable, I think I just need a different set of shapefiles.", "timestamp": "1473807887"}, {"author": "Jan-Willem", "source_link": "https://plus.google.com/100580955183019057735", "anchor": "gp-1473807936295", "service": "gp", "text": "Oddly this is a bit like image similarity metrics where what you care about is similarity at different scales.  A sort of \"fractal density\".\n<br>\n<br>\nI'm now thinking of measures like the standard deviation of pairwise distances between all pairs of residents.  Or, say, between you and the closest 10.", "timestamp": 1473807936}, {"author": "Mac", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811335949132", "anchor": "fb-811335949132", "service": "fb", "text": "When Berlin was voting on its first mall, I wanted a gee-whiz statistic to fight it.  Percentile of town densities didn't do it because there are so many low density towns in west-central Mass.  We were like 34th percentile.  Whoopee.  
What I came up with was \"97% of all Massachusetts citizens live in towns more densely populated than Berlin.\"", "timestamp": "1473814006"}, {"author": "Jameson", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811410414902", "anchor": "fb-811410414902", "service": "fb", "text": "Weighted density is already a thing.", "timestamp": "1473858723"}, {"author": "Nix", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811456332882", "anchor": "fb-811456332882", "service": "fb", "text": "I was wondering if you could use KDEs to do some of the calculation.  Turns out you can!  The code below uses the first two columns of your sheet (lat + long) to compute the third (number of neighbors).  It takes a couple minutes to run, but most of that is an upfront cost -- reading the text file and computing the KDE.  Actually scoring new points on the KDE is pretty fast, and would probably take about 3 hours for the whole data set.<br><br>from sklearn.neighbors import KernelDensity<br>import numpy as np<br>import math<br><br>points = np.loadtxt(\"points-town-density.ma.txt\", usecols = (0,1))<br>radianpoints = points*math.pi/180<br><br>quarter_mile_in_radians = .25*2*math.pi/24901<br><br># Tophat means the KDE will count all elements with distance less than bandwidth.<br># Haversine counts great circle distance based on radian positions on a circle<br>kde = KernelDensity(kernel='tophat',<br>                    metric=\"haversine\",<br>                    bandwidth = quarter_mile_in_radians).fit(radianpoints)<br><br># This will only score the first 10 points right now, but could easily do all of them.<br>raw_results = kde.score_samples(radianpoints[:10])<br><br># The results are returned in log-probability form.<br># To convert them to # of neighbors, first un-log them, and then scale them by<br># what I assume is the inverse of some integral, but I just backwards derived the number<br># The minus one then subtracts out 
the point itself.<br>print(np.exp(raw_results)*0.05769864099076194 - 1)", "timestamp": "1473878378"}, {"author": "Nix", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811456332882&reply_comment_id=811460918692", "anchor": "fb-811456332882_811460918692", "service": "fb", "text": "&rarr;&nbsp;Aside from being more efficient, this also allows you to pretty easily change the criterion that counts neighbors.  For example, you could use a gaussian KDE which counts neighbors more the closer they are.", "timestamp": "1473878537"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811456332882&reply_comment_id=811461397732", "anchor": "fb-811456332882_811461397732", "service": "fb", "text": "&rarr;&nbsp;I don't really know how the KDE is implemented: what makes it a more efficient algorithm?", "timestamp": "1473878759"}, {"author": "Nix", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=811456332882&reply_comment_id=811463323872", "anchor": "fb-811456332882_811463323872", "service": "fb", "text": "&rarr;&nbsp;My (weak) understanding is that it relies on more sophisticated nearest-neighbor algorithms, which allow it to quickly figure out what subset of points it should examine to see if they are &lt; 0.25 miles away from a given point.  Specifically, I think they use KD trees, though I don't understand the math behind those well enough to explain it.", "timestamp": "1473879533"}, {"author": "Alexander", "source_link": "https://www.facebook.com/jefftk/posts/811289103012?comment_id=812598194582", "anchor": "fb-812598194582", "service": "fb", "text": "http://wealoneonearth.blogspot.com/.../portrait-of-city... has some density profiles like your final three charts for larger metro areas (though based on census tracts rather than quarter-mile circles, and with the axes swapped and density log-scale).", "timestamp": "1474394683"}]}