{"items": [{"author": "Paul", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309928299042481", "anchor": "fb-309928299042481", "service": "fb", "text": "The MBTA determines if a bus is on time by the interval between it and the preceding bus. If, on a route scheduled to have a bus every five minutes, no buses arrive for an hour and then twelve arrive in a bunch, only the first one is considered late; the other eleven arrived within five minutes of the preceding bus. Thus, statistically at least, the on time percentage for that hour is above 90%.", "timestamp": "1327510996"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309934705708507", "anchor": "fb-309934705708507", "service": "fb", "text": "@Paul: that doesn't sound like the right way to do it.", "timestamp": "1327511701"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309953512373293", "anchor": "fb-309953512373293", "service": "fb", "text": "@Ben: \"If B_N is the average wait for bus N, then the mean wait time over some window is the average of all B_N weighted by interval length.\"<br><br>I was using L_i/2 for your B_N and L_i/T for weighting by interval length, but I think we're saying the same thing.<br><br>\"let W_N be the wait from bus N-1 to bus N; B_N = W_N / 2\"<br><br>You're using \"the wait\" to mean both the time between buses (\"the wait from bus N-1 to bus N\") and the time you'd stand there (\"wait for bus N\"), but all you're saying is that if the interval between buses is X then you'd expect to hang around for X/2 before it came, on average.  Which is right.  I'm using L_i for your W_N.<br><br>\"W = sum(W_N/2*W_N)/sum(W_N)\"<br><br>Because sum(W_N) is just the length of the window, I'm calling that T.  
Translating into my notation, this is \"W = sum(L_i/2*L_i for i in 1 to N)/T\" which is equivalent to the \"W = sum((L_i/2)*(L_i/T) for i in 1 to N)\" which I'd written.<br><br>So: I think we understand each other and are saying the same thing.", "timestamp": "1327513842"}, {"author": "Ben", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309954212373223", "anchor": "fb-309954212373223", "service": "fb", "text": "Sorry! I realized and ninja-deleted my comment.", "timestamp": "1327513915"}, {"author": "Ben", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309954902373154", "anchor": "fb-309954902373154", "service": "fb", "text": "I somehow thought you were doing some sort of piecewise integration.", "timestamp": "1327513969"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309956885706289", "anchor": "fb-309956885706289", "service": "fb", "text": "@Ben: there might be some role of integration in dealing with an unwindowed form.  
The scheduled service interval S is no longer constant.", "timestamp": "1327514194"}, {"author": "Chris", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309976295704348", "anchor": "fb-309976295704348", "service": "fb", "text": "Firstly, when you compute B in the final step, it looks like you've negated it, which I don't understand.<br><br>However, negating that at the end, the number you're calculating looks a bit like the deviance of the interval length:<br><br>sum((L_i - T/N)^2) / N<br><br>I don't think this is a coincidence as they would both be 0 in the optimal case.<br><br>Expanding gives:<br><br>sum(L_i^2) - 2sum(L_iT/N) + sum(T^2/N^2)     / N<br><br>simplify the last part:<br><br>sum(L_i^2) - 2 sum (L_i T/N) + T^2/N     / N<br><br>If we multiply this by N^2/T^2 ( = 1/S^2) , we get:<br><br>1 + N sum((L_i/T)^2) - 2 sum(L_i/T)<br><br>Simplify the last bit and you get:<br><br>1 + N sum((L_i/T)^2) - 2<br><br>N sum((L_i/T)^2) - 1<br><br>So, unless I've done something wrong, you've calculated:<br><br>var(L_i) / S^2<br><br>If you take the square root of your number, you get the stddev(L_i) / S, which would be a pretty reasonable number to study, though so would stddev(L_i).<br><br>Specifically, I would prefer to study stddev or variance of L_i directly.  
Which is worse, a bus running every 10 minutes that runs 15 minutes between buses or a bus running every 60 minutes that runs 85 minutes between buses?", "timestamp": "1327516267"}, {"author": "Chris", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309976522370992", "anchor": "fb-309976522370992", "service": "fb", "text": "s/deviance/variance/", "timestamp": "1327516294"}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309977275704250", "anchor": "fb-309977275704250", "service": "fb", "text": "@Chris: \"when you compute B in the final step, it looks like you've negated it, which I don't understand\"<br><br>Whoops; fixed!", "timestamp": "1327516368"}, {"author": "Chris", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=309977975704180", "anchor": "fb-309977975704180", "service": "fb", "text": "Yeah, I looked at it and calculated the variance and got the negative of what you got.  I checked my work 2 or 3 times and then thought, hmm, I should check his work too.  :)", "timestamp": "1327516450"}, {"author": "Sean", "source_link": "https://plus.google.com/107270646379592003271", "anchor": "gp-1327517942206", "service": "gp", "text": "I think everyone else is commenting on facebook, and I'm afraid my math isn't up to par to comment on that.  However, as I'm a rider, the wait time becomes less important than the accuracy of the schedule when the scheduled intervals &gt; 10 minutes.  Even with my \"Where's My T\" app, when the buses are infrequent, what I want is a bus that shows up and leaves exactly when it says it does, because that's the only way I can successfully plan trips around the buses.  Worse, I've frequently seen bus bunching happening at times when there is supposed to be a significant gap between buses; as much as 30 minutes.  
That creates very significant problems for commuters.", "timestamp": 1327517942}, {"author": "Chris", "source_link": "https://www.facebook.com/jefftk/posts/309923625709615?comment_id=310003559034955", "anchor": "fb-310003559034955", "service": "fb", "text": "So, having gone into the math, I can also respond based on my experiences with the T.  The trains do the same thing you're talking about regarding bunching.  My theory is basically the same as yours.  Because the first train took longer, there was more time for the people to get on the platform and thus it takes longer to board and slow down.  Vice versa for the second train.  Thus they get closer and closer together.<br><br>Of course this requires some variance in train times to start with.  In the bus system, traffic is random of course, but on the T, there's no opposing traffic, just the trains in front of and behind you.<br><br>However, there is randomness in the number of riders, but even if there weren't, this pattern could start since there's variation in the number of riders over time and by station.<br><br>I've done the math for this and when the traffic increases smoothly without a smooth change in interval time you get the traffic waves we're discussing, but there's no room in this margin for the proof.<br><br>More truthfully, the math is complicated.  I did about half of it and got bogged down in calculations, so I'm going to go to work and then use wolfram to do the calculations and report back with my findings.", "timestamp": "1327518668"}, {"author": "BDan", "source_link": "https://plus.google.com/103775592027106438640", "anchor": "gp-1327521775496", "service": "gp", "text": "There's a partial solution to bus bunching: have the buses do the same thing that the subway does when it gets bunched, which is have the leading one go express for some number of stops, and let people who are going to intermediate stops get off and take one of the following ones.  
I've never seen a bus do this, but unless traffic is \nreally\n bad, it would help significantly to get them back on schedule, I think.", "timestamp": 1327521775}, {"author": "Sean", "source_link": "https://plus.google.com/107270646379592003271", "anchor": "gp-1327521869434", "service": "gp", "text": "The other thing is to have less total stops; most of the bus routes I'm on are in a constant state of slowing down.  If you spread the stops out, you could both enable the lower number of stops and greater consistency of schedule.  I think.  Someone check my intuition with real math!  ;-)", "timestamp": 1327521869}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327522118308", "service": "gp", "text": "@BDan\n Alternatively, wait until the end of the route then have one bus run express several stops.  Telling people to get off a bus is probably pretty unpopular.", "timestamp": 1327522118}, {"author": "BDan", "source_link": "https://plus.google.com/103775592027106438640", "anchor": "gp-1327522731047", "service": "gp", "text": "@Sean\n Yes, that would help reduce the variance and increase average speed.  The problem is that people tend to howl like banshees if you try to take away their closest stop and make them walk an extra two blocks.  For new routes, though, the stops should always be spaced farther apart.\n<br>\n<br>\n@Jeff&nbsp;Kaufman\n People don't like it when they have to get off the red line because it's running express to Harvard or Alewife, but they do it anyway.  If the buses are bunched early in the route and you wait until the first one reaches the end, a lot of people will take much longer to reach their destinations, and the rest won't get there significantly faster.  Part of the problem may be that there isn't a clear way to deal with fares if buses do this, unlike with subways where people just remain inside the faregates.  
(POP would take care of it, or a system that permitted a free transfer to the same route, but neither of those seem particularly likely.)", "timestamp": 1327522731}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327522807736", "service": "gp", "text": "@Sean\n \"that's the only way I can successfully plan trips around the buses\"\n<br>\n<br>\nThere are many ways of using buses, and this is only one of them.  The three main ones I see people use are (1) show up and wait (2) show up when the schedule says to and (3) show up when the predictions say to.  Decreasing bunching helps people who do (1) and (3) while greater schedule adherence only helps with (2).\n<br>\n<br>\nThere's also already a lot of pressure for schedule adherence and it's easy to measure (there's a screen in most MBTA buses saying how many minutes and seconds early/late they are) so I doubt it's easy to improve on much.", "timestamp": 1327522807}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327523513197", "service": "gp", "text": "@BDan\n People may accept it more on the red line because passing isn't possible for trains.  It's also much harder to argue with the driver.\n<br>\n<br>\nFares are also a problem, though maybe you could have something where the buses would talk to each other and know to let the people on free.", "timestamp": 1327523513}, {"author": "BDan", "source_link": "https://plus.google.com/103775592027106438640", "anchor": "gp-1327523906619", "service": "gp", "text": "In practice, buses almost never actually pass each other, because it's pretty difficult for them to find room when there's any other traffic, especially on smaller roads.  
And if they do manage to pass once, it doesn't really help much, because then the one in front is still making more stops, and the one in back is still stuck behind until it finds a chance to pass.", "timestamp": 1327523906}, {"author": "Sean", "source_link": "https://plus.google.com/107270646379592003271", "anchor": "gp-1327524004177", "service": "gp", "text": "Jeff, not if late/ontime is measured as Paul Baker says it is.  And I would argue that both showing up and waiting and showing up when the predictions say to both benefit by greater schedule adherence, because they can still narrow the window of waiting, or, depending on the bus stop, can allow you to safely do other things while you wait (grab a coffee, for instance) that you might be a bit worried to do if you don't trust the schedule adherence.  Plus, by definition, if the scheduling adherence is better, bunching is less of a problem.", "timestamp": 1327524004}, {"author": "Sean", "source_link": "https://plus.google.com/107270646379592003271", "anchor": "gp-1327524093593", "service": "gp", "text": "BDan, I'm an eternal optimist.  My hope is that, at least with some advertising, people would accept the schedule/stop distance trade-off as a win.", "timestamp": 1327524093}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327524191360", "service": "gp", "text": "@BDan\n I was talking about passing in the sense that an out of service or express bus can be sent on ahead in a way that a red line train can't be.", "timestamp": 1327524191}, {"author": "BDan", "source_link": "https://plus.google.com/103775592027106438640", "anchor": "gp-1327524549009", "service": "gp", "text": "Right, but there usually isn't one of those available. I think having spare buses at appropriate locations would be great, but then they'd have to pay drivers to just sit around waiting in case they were needed. 
A few routes (like the 77) sort of do this by having a bus arrive at the terminus just as one is supposed to be leaving, so the new one can leave early if necessary, but in practice that's just the drivers' break time, so it's kind of useless.", "timestamp": 1327524549}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327524848958", "service": "gp", "text": "@BDan\n \"having spare buses at appropriate locations would be great\"\n<br>\n<br>\nThey do this: \"standby buses are distributed strategically throughout the service area so they can get in position on one of many routes quickly. To keep them available for many routes they are not routinely used to provide extra service.\" \nhttp://mbta.com/uploadedfiles/About_the_T/Score_Card/ScoreCard-2009-09.pdf", "timestamp": 1327524848}, {"author": "BDan", "source_link": "https://plus.google.com/103775592027106438640", "anchor": "gp-1327525190240", "service": "gp", "text": "As far as I can tell, they only use these to cover buses that actually break down, which doesn't cover bunching at all.", "timestamp": 1327525190}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327525665000", "service": "gp", "text": "@BDan\n This is why I'd like to get the MBTA paying attention to a metric that better represents what its riders want, like some combination of true on-time-percentage and bunching factor.  Right now the only metric on the scorecard is what percent of scheduled service was actually operated, which isn't that helpful for buses.", "timestamp": 1327525665}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327525948109", "service": "gp", "text": "(The main thing is that I don't know how to solve this problem fully and I doubt anyone does. 
But if we have a good metric we can test things against each other.)", "timestamp": 1327525948}, {"author": "David&nbsp;German", "source_link": "https://plus.google.com/111229345142780712481", "anchor": "gp-1327543792755", "service": "gp", "text": "How about a Monte Carlo simulation for W? People materialize at a stop with some probability each minute. You can instantly assign each person a \"real\" W, and compute average B over all the simulated people for a route or system. \n<br>\n<br>\nIn later iterations of the sim, rider volume could change over the day, and some percentages of the population could take into account schedules or predictions.", "timestamp": 1327543792}, {"author": "David&nbsp;German", "source_link": "https://plus.google.com/111229345142780712481", "anchor": "gp-1327543835447", "service": "gp", "text": "Also, couldn't you infer S(t) from the published schedule with less effort and noise than from the location data?", "timestamp": 1327543835}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327546238359", "service": "gp", "text": "@David&nbsp;German\n I don't understand the advantage of using Monte Carlo for W when we can just compute it.  Just to factor in people paying attention to the published schedule to some extent?\n<br>\n<br>\nYou could get S from the schedule, but if you're already determining arrivals from the data I don't know if it's worth it.", "timestamp": 1327546238}, {"author": "David&nbsp;German", "source_link": "https://plus.google.com/111229345142780712481", "anchor": "gp-1327549831139", "service": "gp", "text": "@Jeff&nbsp;Kaufman\n S makes B measure the transit system's performance relative to its \nintent\n, right?  The schedule is the purest statement of the intent.  Working from the schedule instead of the observed bus rate avoids giving the system credit for buses it fails to run.  
It also eliminates the need to cluster, which could get the boundaries wrong, and miss high-frequency schedule variations altogether.", "timestamp": 1327549831}, {"author": "David&nbsp;German", "source_link": "https://plus.google.com/111229345142780712481", "anchor": "gp-1327550477877", "service": "gp", "text": "I was mostly responding to your distaste for windowing: with empirical W and \na priori\n S, you don't need T.  Besides, I do think rider models are an interesting possibility, and animations of the sim would be cool.", "timestamp": 1327550477}, {"author": "Jeff&nbsp;Kaufman", "source_link": "https://plus.google.com/103013777355236494008", "anchor": "gp-1327583612720", "service": "gp", "text": "@David&nbsp;German\n \"avoids giving the system credit for buses it fails to run\"\n<br>\n<br>\nThat makes sense.", "timestamp": 1327583612}]}