  • Automatically Updating Apartment Map

    August 19th, 2013
    boston, housing, map, tech
    I've set up my apartment price map to update automatically. What did this entail? (Warning: boring programming stuff.)

    There are two scripts involved:

    $ crontab -l
    ...
    # fetch the data at 2:02am on the 18th of the month; update the maps
    # at 2:02 on the 19th.
    02 02 18 * *         python /home/jefftk/query_padmapper.py
    02 02 19 * *         bash /home/jefftk/draw_heatmaps.sh
    
    The first, query_padmapper.py, pulls apartment data from PadMapper and saves it in a timestamped file like apts-1376823721.txt. Unless Eric changes something, this just does its job. If something goes wrong, cron sends me an email with the error message.

    (I really like this method of doing background tasks. Unless they're critical, don't try to recover from errors. Crash, print something informative, and have it show up in my email.)
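As a minimal sketch of that crash-and-email style, assuming a Python job run from cron: this is not the real query_padmapper.py, and `dump_filename`, `save_dump`, and the `fetch` callable are hypothetical names used only for illustration. The point is that nothing catches exceptions, so a failure ends the run with a traceback on stderr, which cron then mails.

```python
import os
import time


def dump_filename(now=None):
    """Name a dump after the Unix time it was fetched, e.g. apts-1376823721.txt."""
    timestamp = int(time.time() if now is None else now)
    return "apts-%d.txt" % timestamp


def save_dump(fetch, directory, now=None):
    """Fetch listings and write them to a timestamped file.

    `fetch` is a hypothetical stand-in for the real PadMapper query and
    must return the text to save. There is deliberately no try/except:
    if the fetch or the write fails, the exception propagates, the
    process exits non-zero, and the traceback goes to stderr.
    """
    text = fetch()
    path = os.path.join(directory, dump_filename(now))
    with open(path, "w") as out:
        out.write(text)
    return path
```

Because the fetch happens before the file is opened, a failed fetch leaves no empty dump behind for the second script to pick up.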

    The second script is a wrapper around draw_heatmap.py. It's safe to run any time because it only picks up apartment data dumps that haven't been processed yet, but I intentionally only run it when I know it needs to do something. It is:

    set -e # exit on error
    
    WORKING_DIR="/home/jefftk/jtk/apartment_prices/"
    
    function compute_dates_available() {
      echo '<script type="text/javascript">'
      echo 'var dates_available = ['
      ls $WORKING_DIR/*.boston.bedroom*.png \
         | awk -F. '{print $(NF-1)}' \
         | sort \
         | while read line ; do
        echo '  "'$line'",'
      done
      echo ']'
      echo '</script>'
    }
    
    function update_index() {
      INDEX=$WORKING_DIR/index.html
      cp $INDEX $INDEX.pre_$(date +%F)
      cat $INDEX \
        | grep '<!-- begin list of date files -->' -B 10000 \
        > $INDEX.pre
      cat $INDEX \
        | grep '<!-- end list of date files -->' -A 10000 \
        > $INDEX.post
      compute_dates_available > $INDEX.middle
    
      cat $INDEX.pre $INDEX.middle $INDEX.post > $INDEX
    }
    
    for x in $WORKING_DIR/apts-1*.txt ; do
      if [ ! -e $x.started ] ; then
        YYYYMMDD=$(date --date=@$(echo $x | awk -F/ '{print $NF}' | sed \
                                  's/^apts-//' | sed 's/\.txt$//') +%F)
        touch $x.started
        for style in room bedroom ; do
          python /home/jefftk/code/apartment_prices/draw_heatmap.py $x $style
          mv $x.$style.1000.png $WORKING_DIR/apts.boston.$style.$YYYYMMDD.png
        done
    
        update_index
        touch $x.finished
      fi
    done
    
    What's this all doing? The for loop at the bottom walks over every apartment data file and skips any it has already started on, which it tracks by creating a marker file like apts-1376823721.txt.started. For each new file it extracts the timestamp from the filename and converts it to a date like 2013-08-18. Then it runs the real code, draw_heatmap.py, once per style (room and bedroom), which produces an output file like apts-1376823721.txt.room.1000.png. It renames each output to the format the UI expects, like apts.boston.room.2013-08-18.png, then calls update_index and creates a .finished marker.
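The same bookkeeping can be sketched in Python to make the steps concrete. This is an illustration, not code from the actual scripts: `pending_dumps`, `date_for_dump`, and `published_name` are hypothetical names, and the conversion assumes UTC, whereas the shell `date` command uses the machine's local time zone.

```python
import os
from datetime import datetime, timezone


def pending_dumps(names):
    """Return dump files that have no matching .started marker yet."""
    names = set(names)
    return sorted(
        n for n in names
        if n.startswith("apts-") and n.endswith(".txt")
        and n + ".started" not in names
    )


def date_for_dump(path):
    """Turn .../apts-1376823721.txt into a date string like 2013-08-18."""
    name = os.path.basename(path)
    stamp = name[len("apts-"):-len(".txt")]
    # Assumption: interpret the timestamp in UTC.
    moment = datetime.fromtimestamp(int(stamp), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


def published_name(path, style):
    """Build the filename the UI expects for a given style."""
    return "apts.boston.%s.%s.png" % (style, date_for_dump(path))
```

For example, `published_name("apts-1376823721.txt", "room")` gives `apts.boston.room.2013-08-18.png`.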

    The update_index function changes a small piece of index.html which has a list of which dates have data available:

    ...
    <!-- begin list of date files -->
    <script type="text/javascript">
    var dates_available = [
      "2011-06-16",
      "2013-01-29",
      "2013-02-18",
      "2013-03-18",
      "2013-04-18",
      "2013-05-18",
      "2013-06-18",
      "2013-07-18",
      "2013-08-18",
    ]
    </script>
    <!-- end list of date files -->
    ...
    
    It figures out the available dates with compute_dates_available, which lists the bedroom heatmap images, pulls the date out of each filename, and formats the sorted dates as a JavaScript array. It then uses grep to split index.html at the marker comments and replaces the portion between them with the newly calculated dates.
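For comparison, here is a rough Python sketch of the same replace-between-markers idea. It is not what the script does (the script uses grep and cat); `format_dates` and `replace_between_markers` are hypothetical names, and the sketch works on a string rather than on index.html directly.

```python
BEGIN = "<!-- begin list of date files -->"
END = "<!-- end list of date files -->"


def format_dates(dates):
    """Render sorted dates as the script block that sits between the markers."""
    lines = ['<script type="text/javascript">', "var dates_available = ["]
    lines += ['  "%s",' % d for d in sorted(dates)]
    lines += ["]", "</script>"]
    return "\n".join(lines)


def replace_between_markers(html, middle):
    """Keep everything up to BEGIN and from END on; swap what lies between."""
    before, begin, rest = html.partition(BEGIN)
    _old, end, after = rest.partition(END)
    if not begin or not end:
        # Crash loudly rather than write a damaged page.
        raise ValueError("marker comments not found")
    return before + begin + "\n" + middle + "\n" + end + after
```

Raising when a marker is missing fits the crash-and-email approach: a page without markers is left alone and the error ends up in the cron mail.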

    So now the page stays up to date without me doing anything. Or else I wake up to an error in my email and figure out why.
