{"items": [{"author": "Matt", "source_link": "https://plus.google.com/113951350991359002027", "anchor": "gp-1455384145430", "service": "gp", "text": "Images are not loading on the page?", "timestamp": 1455384145}, {"author": "David&nbsp;Chudzicki", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771096369572", "anchor": "fb-771096369572", "service": "fb", "text": "I'm surprised you said \"There's still just as much getting things wrong...\"<br><br>What we found was a systematic error in the predictions. Correcting something like that should always improve predictions, unless something else goes wrong.<br><br>The problem might partly involve using a correction that's not well suited to the error metric: A multiplicative correction would be best at minimizing percentage error, but I think we were using mean absolute error?<br><br>Even considering that point, I'm still a bit surprised.", "timestamp": "1455385392"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771096369572&reply_comment_id=771543508502", "anchor": "fb-771096369572_771543508502", "service": "fb", "text": "&rarr;&nbsp;Sorry, you're right.  
This does improve the predictions.", "timestamp": "1455671186"}, {"author": "Erica", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771099578142", "anchor": "fb-771099578142", "service": "fb", "text": "The images won't load for me (but for the first two where they have the blue links underneath to open in a new window, it worked fine when I did that)", "timestamp": "1455387648"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771099578142&reply_comment_id=771100501292", "anchor": "fb-771099578142_771100501292", "service": "fb", "text": "&rarr;&nbsp;Sorry, fixed!", "timestamp": "1455388122"}, {"author": "Paul", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771099578142&reply_comment_id=771200660572", "anchor": "fb-771099578142_771200660572", "service": "fb", "text": "&rarr;&nbsp;I still have a link to http://www.jefftk.com/apartment-clear-overfitting.png which is a 404.", "timestamp": "1455435512"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771099578142&reply_comment_id=771227821142", "anchor": "fb-771099578142_771227821142", "service": "fb", "text": "&rarr;&nbsp;Gah, sorry!  Fixed all of them now.", "timestamp": "1455463358"}, {"author": "Geoffrey", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771101424442", "anchor": "fb-771101424442", "service": "fb", "text": "This seems like you're using a kernel method for regression and you've chosen a Gaussian kernel. May I suggest looking at the Pearson VII universal kernel (http://www.sciencedirect.com/.../pii/S0169743905001474). It has two variable parameters that allow it to move between a Gaussian and Lorentzian shape. You can set these parameters by a grid search and the same cross validation technique you used to pick sigma for the Gaussian (or leave out more data points per fold) . 
Might improve your results if you're not completely satisfied with the shape of your final Gaussian.", "timestamp": "1455388710"}, {"author": "Peter", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771114608022", "anchor": "fb-771114608022", "service": "fb", "text": "Are the little dark spots the raw data points? I'm trying to understand the cause of the expensive zone on the Medford-Malden line which is not near good public trans. Are there enough data there or could it be an outlier?", "timestamp": "1455397330"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771114608022&reply_comment_id=771116434362", "anchor": "fb-771114608022_771116434362", "service": "fb", "text": "&rarr;&nbsp;I think your hot spot is a bunch of openings at http://www.livelumiere.com/<br><br>The dots are data points, but one data point isn't usually enough to sway things much. Except in this case, where one new luxury apartment complex is filling up.", "timestamp": "1455398611"}, {"author": "George", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771135541072", "anchor": "fb-771135541072", "service": "fb", "text": "You really should try proper Gaussian process regression. And integrate out the kernel hyperparameters. 
You also could make separate price predictions for different unit types this way and also predict the relative fractions of different unit types in each area and get a marginal average price that way.", "timestamp": "1455403612"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771135541072&reply_comment_id=771161913222", "anchor": "fb-771135541072_771161913222", "service": "fb", "text": "&rarr;&nbsp;Unfortunately I don't know what most of those things are.<br><br>One big problem with my current predictions is they don't take into account any data from past months.", "timestamp": "1455412067"}, {"author": "George", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771135541072&reply_comment_id=771167177672", "anchor": "fb-771135541072_771167177672", "service": "fb", "text": "&rarr;&nbsp;Jeff&nbsp;Kaufman http://www.gaussianprocess.org/gpml/", "timestamp": "1455415232"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1455463483742", "service": "gp", "text": "@Matt\n\u00a0Sorry, fixed.\n<br>\n<br>\n(Forgot to generate the 1x resolution versions of the images.)", "timestamp": 1455463483}, {"author": "Rio", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771723787222", "anchor": "fb-771723787222", "service": "fb", "text": "Boulder version: http://rioleo.org/apartmentprices/", "timestamp": "1455789534"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771723787222&reply_comment_id=771727574632", "anchor": "fb-771723787222_771727574632", "service": "fb", "text": "&rarr;&nbsp;Neat! 
That looks like it's still using the old drawing code, though?", "timestamp": "1455797262"}, {"author": "Rio", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771723787222&reply_comment_id=771727879022", "anchor": "fb-771723787222_771727879022", "service": "fb", "text": "&rarr;&nbsp;Where's the new drawing code?", "timestamp": "1455797660"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771723787222&reply_comment_id=771727983812", "anchor": "fb-771723787222_771727983812", "service": "fb", "text": "&rarr;&nbsp;https://github.com/jeffkaufman/apartment_prices", "timestamp": "1455797714"}, {"author": "Rio", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771723787222&reply_comment_id=771730708352", "anchor": "fb-771723787222_771730708352", "service": "fb", "text": "&rarr;&nbsp;(That's what I used; I grepped from the html on one of the live apartment_prices pages; could it be the new code isn't checked in?)", "timestamp": "1455797829"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771723787222&reply_comment_id=771778936702", "anchor": "fb-771723787222_771778936702", "service": "fb", "text": "&rarr;&nbsp;@Rio: it looks to me like you're missing https://github.com/.../77f4f5b4150326091f3ffda5049aeff470... and all the commits after?  Does the draw_heatmap.py you used contain a function called \"gaussian\"?", "timestamp": "1455824888"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771723787222&reply_comment_id=771778981612", "anchor": "fb-771723787222_771778981612", "service": "fb", "text": "&rarr;&nbsp;Actually, sorry, I think you are up to date.  
It's just poor performance in areas with low density of apartment listings.", "timestamp": "1455824963"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771733642472", "anchor": "fb-771733642472", "service": "fb", "text": "Does the code you're running have a gaussian() function defined?<br><br>(The recent commits are pushed: https://github.com/.../apartm.../blob/master/draw_heatmap.py)", "timestamp": "1455800012"}, {"author": "Brandon", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771750648392", "anchor": "fb-771750648392", "service": "fb", "text": "I'm thinking it might be fun to host a little machine learning competition based on the data you've collected. You've collected enough at this point that the training/validation/test sets would all be large enough to objectively rank results. I might could put together a site for it. Interested?", "timestamp": "1455810638"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771750648392&reply_comment_id=771753522632", "anchor": "fb-771750648392_771753522632", "service": "fb", "text": "&rarr;&nbsp;Sure! That sounds great!<br><br>The main problem I see is that I've been making all the data public, so it would be easy to cheat. Is that a worry?", "timestamp": "1455813018"}, {"author": "Brandon", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771750648392&reply_comment_id=771753722232", "anchor": "fb-771750648392_771753722232", "service": "fb", "text": "&rarr;&nbsp;Nah, as long as there are no prizes. All the popular machine learning benchmarks have publicly available test sets and they're still considered valid (though, after a while, they are considered sort of stale, as the _entire body_ of machine learning research overfits to the benchmark. 
That's basically happened to MNIST and ImageNet.)", "timestamp": "1455813145"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771750648392&reply_comment_id=771753777122", "anchor": "fb-771750648392_771753777122", "service": "fb", "text": "&rarr;&nbsp;Here's the full scraped data: http://www.jefftk.com/apartment_prices/data-listing<br><br>(Automatically updates once a month.)", "timestamp": "1455813233"}, {"author": "Brandon", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=771750648392&reply_comment_id=771753851972", "anchor": "fb-771750648392_771753851972", "service": "fb", "text": "&rarr;&nbsp;Neat!", "timestamp": "1455813292"}, {"author": "David&nbsp;Chudzicki", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=774433651632", "anchor": "fb-774433651632", "service": "fb", "text": "Related: http://ouzor.github.io/.../03/08/apartment-price-model.html", "timestamp": "1457537045"}, {"author": "Alice", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=774437873172", "anchor": "fb-774437873172", "service": "fb", "text": "Jeff&nbsp;Kaufman It looks like you are re-inventing Random Effects Models! https://en.wikipedia.org/wiki/Random_effects_model", "timestamp": "1457539754"}, {"author": "Eli", "source_link": "https://www.facebook.com/jefftk/posts/771094333652?comment_id=875777298152", "anchor": "fb-875777298152", "service": "fb", "text": "Maybe of interest: Fixing Zillow's pricing algorithm. https://www.nytimes.com/.../angry-over-zillows-home...", "timestamp": "1495979868"}]}