
This vignette walks through a realistic analysis of Red Line performance in January 2024. By the end you’ll have three charts:

  1. Daily ridership
  2. Average travel time from Park Street to Davis
  3. Where slow zones were on a specific day

Setup

New to these packages? dplyr is for cleaning and reshaping data; ggplot2 is for charts. Install them with install.packages(c("dplyr", "ggplot2")) if needed.
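The chunks below assume that transitmattr, dplyr, and ggplot2 are attached. A minimal sketch of a setup chunk, assuming all three are installable under those names (transitmattr is the package name used later in this vignette; where it is installed from is not covered here):

```r
# Check which of the packages used in this vignette are installed.
needed  <- c("transitmattr", "dplyr", "ggplot2")
present <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

if (all(present)) {
  # Attach everything the later chunks rely on.
  library(transitmattr)
  library(dplyr)
  library(ggplot2)
} else {
  message("Install first: ", paste(needed[!present], collapse = ", "))
}
```

The ridership chart also calls scales::comma; scales is installed as a dependency of ggplot2, so it does not need a separate install.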

1. Daily ridership

Pull the data

Build a Red Line query, set a date range, and call $ridership(). The result is already a data.frame — no bind_rows() needed.

ridership_df <- tm_rt_query("Red")$
  date_range("2024-01-01", "2024-01-31")$
  ridership()
head(ridership_df)

Plot it

ggplot(ridership_df, aes(x = as.Date(date), y = count)) +
  geom_col(fill = "#DA291C") +
  scale_y_continuous(labels = scales::comma) +
  labs(
    title    = "Red Line Daily Ridership",
    subtitle = "January 2024",
    x        = NULL,
    y        = "Riders"
  ) +
  theme_minimal()

What to notice: Weekends typically show lower ridership. If a weekday is unusually low, there may have been a service disruption — check tm_alerts() for that date.
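To make the weekday and weekend comparison concrete, here is a small sketch on made-up numbers. It assumes the same date and count columns used in the plot above; the toy data frame and the 75% threshold are illustrative choices, not part of transitmattr.

```r
# Made-up counts standing in for ridership_df (same column names as above).
toy <- data.frame(
  date  = seq(as.Date("2024-01-01"), by = "day", length.out = 14),
  count = c(90, 100, 105, 102, 98, 60, 50, 95, 101, 40, 99, 97, 58, 52)
)

# POSIXlt wday: 0 = Sunday, 6 = Saturday (independent of locale).
toy$is_weekend <- as.POSIXlt(toy$date)$wday %in% c(0, 6)

# Average riders by day type ("FALSE" = weekday, "TRUE" = weekend).
day_type_means <- tapply(toy$count, toy$is_weekend, mean)

# Flag weekdays below 75% of the weekday median (arbitrary threshold).
weekday_median <- median(toy$count[!toy$is_weekend])
low_days <- toy[!toy$is_weekend & toy$count < 0.75 * weekday_median, ]
low_days
```

Any dates that end up in low_days are the ones worth checking for a service disruption.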

2. Travel time from Park Street to Davis

This uses the aggregate endpoint, which gives you daily averages across the date range rather than individual trips.

Pull the data

Specify from/to by station name. The library handles stop ID resolution internally — no need to look up "70076" or "70064".

travel_df <- tm_rt_query("Red")$
  from_stop("Park Street")$
  to_stop("Davis")$
  date_range("2024-01-01", "2024-01-31")$
  aggregate_travel_times()
head(travel_df)

Clean and plot

travel_df <- travel_df |>
  mutate(
    date       = as.Date(service_date),
    travel_min = mean / 60   # assumes `mean` is reported in seconds
  )

ggplot(travel_df, aes(x = date, y = travel_min)) +
  geom_line(color = "#DA291C") +
  geom_smooth(method = "loess", se = FALSE, linetype = "dashed",
              color = "grey40") +
  labs(
    title    = "Average Travel Time: Park Street → Davis",
    subtitle = "January 2024 · Red Line",
    x        = NULL,
    y        = "Minutes"
  ) +
  theme_minimal()

The dashed line is a LOESS-smoothed trend. If travel times creep up over the month, that can signal slow zones accumulating, which leads us to section 3.
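One rough way to put a number on "creeping up" is the slope of a straight-line fit. The sketch below uses made-up values and assumes, as in the cleaning step above, that the mean column is in seconds; a linear fit is a simplification chosen for illustration, not something the package provides.

```r
# Made-up daily means in seconds, standing in for travel_df.
toy <- data.frame(
  service_date = as.character(
    seq(as.Date("2024-01-01"), by = "day", length.out = 10)
  ),
  mean = 600 + 6 * (0:9)   # 10 minutes on day 1, then +6 seconds per day
)
toy$date       <- as.Date(toy$service_date)
toy$travel_min <- toy$mean / 60

# Slope of the fitted line = average change in minutes per day.
fit   <- lm(travel_min ~ as.numeric(date), data = toy)
slope <- unname(coef(fit)[2])
round(slope, 3)
```

A clearly positive slope means trips are getting slower over the period; a slope near zero means the day-to-day wiggles in the chart are mostly noise.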

3. Slow zones on a specific day

Speed restrictions (“slow zones”) are places where trains must go slower than normal, usually because of track issues. Unlike most endpoints, this one returns a list (not a data.frame) because its structure is more complex.

slow_raw <- tm_rt_query("Red")$speed_restrictions("2024-01-15")

if (!isTRUE(slow_raw$available)) {
  message("No slow zone data available for this date.")
} else {
  # Flatten the list of zones into one data.frame, one row per zone
  slow_df <- bind_rows(slow_raw$zones)
  print(slow_df)

  # Build the plot only when slow_df exists, then print it explicitly
  slow_plot <- ggplot(slow_df, aes(x = reorder(description, speedMph),
                                   y = speedMph, fill = speedMph)) +
    geom_col() +
    coord_flip() +
    scale_fill_gradient(low = "#DA291C", high = "#f9c0c0") +
    labs(
      title    = "Red Line Slow Zones: January 15, 2024",
      subtitle = "Lower mph = more restricted",
      x        = NULL,
      y        = "Speed limit (mph)"
    ) +
    theme_minimal() +
    theme(legend.position = "none")
  print(slow_plot)
}

Note: If slow_raw$available is not TRUE, no slow zone data came back for that date, which can simply mean no slow zones were active. Try a different date, or pick a period with known track work.
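If you would rather always work with a data.frame, the availability check can be wrapped in a small helper. This is a sketch only: zones_to_df() is a hypothetical name, and the mock responses assume the list has just the fields used above (available, zones, description, speedMph). It uses base R so it runs without dplyr.

```r
# Hypothetical helper: always returns a data.frame, with zero rows when
# the response is not marked available or has no zones.
zones_to_df <- function(raw) {
  empty <- data.frame(description = character(0), speedMph = numeric(0))
  if (!isTRUE(raw$available) || length(raw$zones) == 0) return(empty)
  rows <- lapply(raw$zones, function(z) {
    data.frame(description = z$description, speedMph = z$speedMph)
  })
  out <- do.call(rbind, rows)
  # Most restricted zones first.
  out[order(out$speedMph), , drop = FALSE]
}

# Made-up responses shaped like the fields used above.
with_zones <- list(
  available = TRUE,
  zones = list(
    list(description = "Segment A", speedMph = 25),
    list(description = "Segment B", speedMph = 10)
  )
)
no_zones <- list(available = FALSE)

zones_to_df(with_zones)
nrow(zones_to_df(no_zones))
```

Downstream code can then test nrow(slow_df) > 0 before plotting instead of checking whether slow_df exists.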

4. Putting it all together

Here’s the pattern you’ll use over and over with transitmattr:

tm_rt_query("Red")$            # pick a line
  stop("Harvard")$             # set a station
  date("2024-01-15")$          # set a date
  headways()                   # fetch: returns a data.frame
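Chains like this work when each setter returns the query object itself, so the next call has something to act on. The toy below illustrates that idea with a plain environment. It is not transitmattr's implementation, and toy_query() and describe() are made-up names for this sketch.

```r
# Toy chainable query (illustration only, not the package's internals).
toy_query <- function(line) {
  self <- new.env()
  self$line      <- line
  self$stop_name <- NULL
  # Setters store a value and return the object so calls can be chained.
  self$stop <- function(name) { self$stop_name <- name; invisible(self) }
  self$date <- function(d)    { self$day <- as.Date(d); invisible(self) }
  # Terminal method: validates the accumulated state, then produces a result.
  self$describe <- function() {
    if (is.null(self$stop_name)) base::stop("requires a stop", call. = FALSE)
    paste(self$line, self$stop_name, format(self$day))
  }
  self
}

result <- toy_query("Red")$
  stop("Harvard")$
  date("2024-01-15")$
  describe()
result
```

When a chain is split across lines in R, the `$` goes at the end of each line so the parser knows the expression continues. Calling the terminal method without a required setter is what produces errors like the "requires a stop" message in the toy above.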

Once you’re comfortable with it, try exploring other methods:

  • $line_delays() — delay minutes caused by alerts
  • $trip_metrics("daily") — on-time performance per trip
  • $aggregate_headways() — how often trains come, over time
  • $service_hours("weekly") — scheduled vs. delivered service hours

Troubleshooting

| Problem | What to try |
| --- | --- |
| “requires a stop” error | Call $stop("Name") before the terminal method |
| “requires a date range” error | Call $date_range(start, end) before the terminal method |
| All values are NULL | The API may have no data for that date; try a nearby weekday |
| The API call fails | Run tm_healthcheck() to verify the API is up |