Massachusetts DOT Opens Transit Data for Public Access

The Massachusetts Department of Transportation, Mass DOT, recently decided to offer real time and other transit data to the public on a trial basis. I attended their very first developer conference (official website). My notes follow:

Open is Good - Don't Build Apps like Bridges

The central message was voiced by many presenters: open is good for everyone. The case that open is good for the transit agency was nicely explained by Christopher Dempsey of the Executive Office of Transportation (EOT).

The DOT would build software applications the way they would build a bridge. It can take two years and a full bidding and awarding process, and cost up to 200,000 dollars, and that's to build one app. For this trial, the DOT opened up the data and let the developer community innovate, and they got some amazing applications built in a few months, almost for free apart from the operating cost of serving up the data.

Reduce Travel Time, not just Transit Time

The case that open is good for the people was made by Michael Smith of Next Bus Inc. Michael makes a convincing argument that cities need a working mass transit system, and a working mass transit system needs customers (riders). There are many ways to increase ridership, and cutting down travel time (door to door time) is an important part of it. Having real time data available to riders will often mean getting someone who would normally not ride public transit to use it.

Power of Technology

Keynote speaker Robin Chase, founder of ZipCar, eloquently explained how the right technology platform can open up a business model. She used "beds" as an example. If you have a spare bed in your guest bedroom, you can share that resource only in a very limited way, with friends and family in your personal network. Hotels, on the other hand, fully commercialize the bed sharing concept, with huge investments in physical infrastructure. In return, hotels need great profits to continue operating. Compare that to -- no infrastructure, but with the right technology platform (their website) and community support (their members), it has provided 2.8 million positive experiences in 231 countries and 67,438 cities since 2004.

Open Platform, Let the Market Decide

To summarize: providing open access to transit data allows far quicker development of innovative applications, and the market will drive the good applications to the top.

What data is available?

There are really several different types of data being made available:

  • transit schedule/route information -- Think of these as bus and train schedules. Mass DOT refers to these as the GTFS files, because they are in Google Transit Feed Specification format.
  • real time vehicle location and arrival / departure prediction -- For Mass DOT this is the real-time XML feed, provided by their service vendor Next Bus.
  • informational alerts and schedule events -- these are short term, immediate alerts about equipment problems, elevator outages, or route changes, or longer term announcements about construction.
  • other statistical data -- this includes accident statistics, which are not available from Mass DOT yet.

Schedule / Route data

These are the schedules defined by the transit agencies. They typically change several times a year. A schedule is made up of several components. A route is what we normally think of when we think of mass transit: Bus 77, or the Silver Line. The route shape defines the long/lat location of each stop. The service says which days a schedule applies to: weekday, Saturday, Sunday. The stop times say when the bus (or train or...) is supposed to arrive at each stop, and they are all grouped into trips. A trip is one single "run" of a bus.

Because of Google (you can now get trip planning information from Google Maps by selecting transit instead of driving), there is a "standard" called GTFS that some agencies use to provide the data. Mass DOT uses GTFS files.
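
To make the route / service / trip / stop time relationship concrete, here is a minimal sketch in Python. The file names (trips.txt, stop_times.txt) and column names come from the GTFS format; the rows are invented for illustration and are not Mass DOT data, and the helper names are my own.

```python
import csv
import io

# Tiny in-memory stand-ins for two GTFS files. Column names follow GTFS;
# the rows are made up for this example.
TRIPS_TXT = """route_id,service_id,trip_id
77,weekday,trip_a
77,weekday,trip_b
77,saturday,trip_c
"""

STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
trip_a,08:00:00,08:00:00,s1,1
trip_a,08:07:00,08:07:00,s2,2
trip_b,08:30:00,08:30:00,s1,1
trip_b,08:37:00,08:37:00,s2,2
trip_c,09:00:00,09:00:00,s1,1
"""


def read_table(text):
    """Parse one GTFS file (CSV with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def scheduled_arrivals(route_id, service_id, stop_id):
    """Scheduled arrival times at one stop for a given route and service."""
    # Step 1: find every trip ("run") of this route on this service.
    trip_ids = {
        t["trip_id"]
        for t in read_table(TRIPS_TXT)
        if t["route_id"] == route_id and t["service_id"] == service_id
    }
    # Step 2: collect the stop times of those trips at the requested stop.
    times = [
        st["arrival_time"]
        for st in read_table(STOP_TIMES_TXT)
        if st["trip_id"] in trip_ids and st["stop_id"] == stop_id
    ]
    # Zero-padded HH:MM:SS strings sort correctly as plain text.
    return sorted(times)


print(scheduled_arrivals("77", "weekday", "s2"))  # ['08:07:00', '08:37:00']
```

A real feed has more files (stops, routes, calendar, shapes) and many more columns, but the join is the same idea: service and route pick the trips, and trips pick the stop times.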

Real Time Data

Real time data is the exciting part of the open initiative. Commuters benefit when they know the real time location of a vehicle and its estimated arrival or departure time at each stop along a transit route. Imagine sitting in the comfort of your office, watching for the next bus to arrive at your stop. Only when the bus is about to arrive do you leave, and you get on the bus without waiting at the stop.
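
The "leave only when the bus is about to arrive" idea is simple arithmetic once a predicted arrival time is available. A minimal sketch, assuming the prediction has already been fetched from somewhere; the function names, the walking time, and the one-minute buffer are all my own assumptions:

```python
from datetime import datetime, timedelta


def time_to_leave(predicted_arrival, walk_minutes, buffer_minutes=1):
    """Latest moment to leave and still reach the stop before the bus."""
    return predicted_arrival - timedelta(minutes=walk_minutes + buffer_minutes)


def should_leave_now(now, predicted_arrival, walk_minutes, buffer_minutes=1):
    """True once the clock has reached the latest safe leaving time."""
    return now >= time_to_leave(predicted_arrival, walk_minutes, buffer_minutes)


# Bus predicted at 17:30, four minute walk, one minute buffer.
arrival = datetime(2024, 1, 8, 17, 30)
print(time_to_leave(arrival, walk_minutes=4))  # 2024-01-08 17:25:00
```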

Most transit agencies will use a data vendor like Next Bus to provide the real time data. Note that a data processing vendor like Next Bus is not merely a data aggregator. They actually apply their own algorithms to make "predictions" for real time transit information. For example, consider a bus waiting at the terminal to start its route. What is the arrival time at the first stop? Since the bus has not started moving yet, we need to predict the arrival time using historical data (what's a typical travel time from the terminal to the first stop at this time of day) and published schedule data (when the bus is supposed to begin the trip).
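
To illustrate the terminal example, here is one naive way to combine schedule and history. This is only a sketch of the idea, not Next Bus's actual algorithm, which was not described in detail; the function name and the choice of the median as "typical" travel time are assumptions.

```python
from datetime import datetime, timedelta
from statistics import median


def predict_first_stop_arrival(now, scheduled_departure, past_travel_minutes):
    """Predict arrival at the first stop for a bus still at the terminal.

    past_travel_minutes: historical terminal-to-first-stop travel times
    (in minutes) for this time of day.
    """
    # The bus cannot depart in the past: if it is already late,
    # assume the earliest it can leave is right now.
    expected_departure = max(now, scheduled_departure)
    typical_travel = median(past_travel_minutes)
    return expected_departure + timedelta(minutes=typical_travel)


scheduled = datetime(2024, 1, 8, 8, 0)
history = [6, 7, 9]  # invented sample of past travel times, in minutes

# Bus is on time: 08:00 departure plus a typical 7 minutes.
print(predict_first_stop_arrival(datetime(2024, 1, 8, 7, 55), scheduled, history))
# Bus is already 3 minutes late: 08:03 plus 7 minutes.
print(predict_first_stop_arrival(datetime(2024, 1, 8, 8, 3), scheduled, history))
```

Once the bus starts moving, a real system would presumably switch from the schedule to the reported vehicle location as its starting point.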


Getting real time data to the public is harder than it sounds, for several reasons:

  • data ownership -- in most cases, the transit agencies own the data. We could argue that the citizens own the data, since we pay the taxes and fares that keep the agencies operational. So the agencies have to be persuaded to provide open data access.
  • vendor selection -- agencies, being government entities, have a formal process for vendor selection. Selecting a vendor like Next Bus to manage and provide access to the transit data requires the typical procurement process, which can take time.
  • vehicle location tracking -- this is not simply a GPS issue. GPS information may not be available to a vehicle if it is underground (in a tunnel). Some subways have track segment information available within the subway management system. If the data vendor can access this information, it can be used to locate a train very accurately.
  • vehicle location reporting -- even if a vehicle knows its own location, it has to report it. Some systems use a radio network, some use a cellular network. These are not perfect and not always available.
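
The last two points suggest that any application consuming location data has to cope with multiple sources and with reports that stop arriving. A minimal sketch of that bookkeeping; the Report type, the source labels, and the 90 second freshness limit are all hypothetical and not part of any Mass DOT or Next Bus interface:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Report:
    source: str            # e.g. "gps" or "track_segment" (hypothetical labels)
    position: str          # kept as plain text to stay format-agnostic
    received_at: datetime  # when the report reached the server


def best_location(reports, now, max_age=timedelta(seconds=90)):
    """Return the newest report that is still fresh, or None if all are stale."""
    fresh = [r for r in reports if now - r.received_at <= max_age]
    if not fresh:
        return None  # radio or cellular link down, or vehicle out of coverage
    return max(fresh, key=lambda r: r.received_at)


reports = [
    Report("gps", "surface, before tunnel", datetime(2024, 1, 8, 12, 0, 0)),
    Report("track_segment", "segment 14", datetime(2024, 1, 8, 12, 4, 30)),
]

# The GPS fix is five minutes old, so the track segment report wins.
print(best_location(reports, now=datetime(2024, 1, 8, 12, 5, 0)).source)
# Ten minutes on, nothing is fresh and the location is unknown.
print(best_location(reports, now=datetime(2024, 1, 8, 12, 10, 0)))
```

Returning None rather than the last known position makes the "we don't know" case explicit, so an application can show a warning instead of a stale bus icon.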


Currently Mass DOT has made all their route information available as GTFS files and, on a trial basis, has made real time information available for some of the most heavily used bus routes. Let's hope that they will officially make real time information available for all routes soon. Remember that they have to go through formal governmental processes to make that happen.