
  • Rent Map Data Sources

    October 29th, 2016
    map, housing
    I just finished updating my rent map to handle the recent Padmapper UI refresh, and someone asked how Padmapper not including Craigslist listings affected the map. This confused me: I had thought Padmapper got its data by buying it from 3Taps, which scraped Google's cache, which in turn came from crawling Craigslist. But it turns out that Padmapper and 3Taps settled the lawsuit, and Padmapper has only gotten its listings from other sources since then.

    One issue, though, is that the cheapest apartments could be listed only on Craigslist [1] and not on the other services that Padmapper pulls from. To get a rough check on this, I took ten random listings from the Boston Craigslist page and tried to figure out which Padmapper listing each one went with.

    Predictions summary:

    Listing  Estimate  Error  In Padmapper
    $1700    $2605     +53%   no
    $1700    $2065     +21%   no
    $1495    $1775     +19%   no
    $3000    $3155     +5%    yes
    $3000    $2995     -0%    no
    $1575    $1590     -1%    yes
    $2900    $2800     -3%    yes
    $6000    $3650     -64%   no (dubious)

    While this is a small sample, it looks like the predictions are pretty good for the ones in Padmapper (which is what you would expect) and consistently too high (0%, 19%, 21%, 53%, avg=23%) for the ones not in Padmapper.
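
    To make the grouping explicit, here is a rough sketch that averages the error column separately for listings that are and aren't in Padmapper, skipping the dubious one. It uses the percentages exactly as stated in the table rather than recomputing them from the dollar amounts. The names `ROWS` and `mean_stated_error` are made up for illustration and are not part of the map's code.

```python
from statistics import mean

# Rows copied from the table above:
# (listing $, estimate $, stated error %, in Padmapper?, dubious?)
ROWS = [
    (1700, 2605, 53, False, False),
    (1700, 2065, 21, False, False),
    (1495, 1775, 19, False, False),
    (3000, 3155, 5, True, False),
    (3000, 2995, 0, False, False),
    (1575, 1590, -1, True, False),
    (2900, 2800, -3, True, False),
    (6000, 3650, -64, False, True),
]


def mean_stated_error(rows, in_padmapper):
    """Average the stated error (%) for one group, ignoring dubious rows."""
    errors = [
        err
        for _listing, _estimate, err, in_pm, dubious in rows
        if in_pm == in_padmapper and not dubious
    ]
    return mean(errors)


not_in_padmapper = mean_stated_error(ROWS, in_padmapper=False)
in_padmapper = mean_stated_error(ROWS, in_padmapper=True)

print(f"not in Padmapper: {not_in_padmapper:.1f}%")
print(f"in Padmapper:     {in_padmapper:.1f}%")
```

    With only a handful of rows per group, these averages are a sanity check more than a measurement.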

    Fixing this is pretty tricky. I could do a larger sample to get a better sense of what the error is, and then adjust my map down by the combination of how much lower the non-Padmapper apartments are and what fraction of listings aren't in Padmapper. In this case, ignoring the dubious listing, 4 of 9 weren't on Padmapper, with an average error of 23%, which would mean adjusting all my estimates down by about 10% (4/9 × 23%). On the other hand, as people's listing behavior changes this could become obsolete pretty quickly, and it's a pain to calculate the first time, let alone on an ongoing basis. Ideas?
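
    As a rough sketch of that arithmetic: the adjustment is the fraction of listings missing from Padmapper times the average overestimate on those listings. The function names here are hypothetical, and scaling an estimate by `1 - adjustment` is one simple way to apply the correction, assumed here for illustration rather than taken from the map's code.

```python
def blended_adjustment(n_missing, n_total, avg_overestimate):
    """Fraction of listings missing from Padmapper times their average overestimate."""
    return (n_missing / n_total) * avg_overestimate


def adjust_estimate(estimate, adjustment):
    """Scale a rent estimate down by the given fractional adjustment (an assumption)."""
    return estimate * (1 - adjustment)


# Figures from the paragraph above: 4 of 9 listings missing, 23% average error.
adjustment = blended_adjustment(n_missing=4, n_total=9, avg_overestimate=0.23)
print(f"adjustment: {adjustment:.1%}")

# Example with a made-up $2000 estimate.
print(f"adjusted estimate: ${adjust_estimate(2000, adjustment):.0f}")
```

    Since both inputs come from a small sample, the resulting figure is only as good as that sample.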


    [1] Or, worse for my map, listed only with signs in windows or something else not available online.



