This vignette walks through a realistic analysis of Red Line performance in January 2024. By the end you’ll have three charts:
- Daily ridership
- Average travel time from Park Street → Davis
- Where slow zones were on a specific day
Setup
New to these packages?
dplyr is for cleaning and reshaping data; ggplot2 is for charts. Install them with install.packages(c("dplyr", "ggplot2")) if needed.
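If you are unsure whether the two helper packages are already installed, a quick check like the following may help. It is a minimal sketch using only base R; the variable names are illustrative, and it only reports what is missing rather than installing anything.

```r
# Packages this vignette relies on besides transitmattr
needed <- c("dplyr", "ggplot2")

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
not_installed <- needed[!installed]

if (length(not_installed) > 0) {
  message("Still to install: ", paste(not_installed, collapse = ", "))
} else {
  message("All helper packages are available.")
}
```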
1. Daily ridership
Pull the data
Build a Red Line query, set a date range, and call $ridership(). The result is already a data.frame, so no bind_rows() is needed.
ridership_df <- tm_rt_query("Red")$date_range("2024-01-01", "2024-01-31")$ridership()
head(ridership_df)
Plot it
ggplot(ridership_df, aes(x = as.Date(date), y = count)) +
geom_col(fill = "#DA291C") +
scale_y_continuous(labels = scales::comma) +
labs(
title = "Red Line Daily Ridership",
subtitle = "January 2024",
x = NULL,
y = "Riders"
) +
theme_minimal()
What to notice: Weekends typically show lower ridership. If a weekday is unusually low, there may have been a service disruption; check tm_alerts() for that date.
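One way to act on that observation is to flag weekdays that fall well below the typical weekday level. The sketch below uses made-up numbers in a data frame with the same date and count columns as ridership_df. The 75% cutoff is an arbitrary assumption for illustration, not something the package defines.

```r
# Synthetic stand-in for ridership_df (invented counts, same column names)
toy <- data.frame(
  date  = as.character(seq(as.Date("2024-01-01"), as.Date("2024-01-14"), by = "day")),
  count = c(90000, 150000, 152000, 149000, 151000, 80000, 60000,
            148000, 95000, 150000, 153000, 147000, 82000, 61000)
)
toy$date <- as.Date(toy$date)

# "%u" is the ISO weekday number: 1 = Monday ... 7 = Sunday
toy$is_weekend <- format(toy$date, "%u") %in% c("6", "7")

# Compare each weekday against the median weekday count
weekday_median <- median(toy$count[!toy$is_weekend])
toy$low_weekday <- !toy$is_weekend & toy$count < 0.75 * weekday_median

# Dates worth checking against alerts
toy[toy$low_weekday, c("date", "count")]
```

With real data you would replace toy with ridership_df and then look up the flagged dates.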
2. Travel time from Park Street to Davis
This uses the aggregate endpoint, which gives you long-run daily averages rather than individual trips.
Pull the data
Specify from/to by station name. The library resolves stop IDs internally, so there is no need to look up "70076" or "70064".
travel_df <- tm_rt_query("Red")$from_stop("Park Street")$to_stop("Davis")$date_range("2024-01-01", "2024-01-31")$aggregate_travel_times()
head(travel_df)
Clean and plot
travel_df <- travel_df |>
mutate(
date = as.Date(service_date),
travel_min = mean / 60
)
ggplot(travel_df, aes(x = date, y = travel_min)) +
geom_line(color = "#DA291C") +
geom_smooth(method = "loess", se = FALSE, linetype = "dashed",
color = "grey40") +
labs(
title = "Average Travel Time: Park Street → Davis",
subtitle = "January 2024 · Red Line",
x = NULL,
y = "Minutes"
) +
theme_minimal()
The dashed line is a smoothed (LOESS) trend. If travel times creep up over the month, it can signal slow zones accumulating, which leads us to section 3.
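To put a number on "creeping up", one option is a straight-line fit of travel time against date. This is a minimal sketch on synthetic data, assuming (as the division by 60 above suggests) that the mean column is in seconds. The values are invented and the linear fit is a simple stand-in, not the LOESS curve drawn in the chart.

```r
# Synthetic stand-in for travel_df (invented values, same column names)
toy <- data.frame(
  service_date = as.character(seq(as.Date("2024-01-01"), by = "day", length.out = 10)),
  mean = 780 + 6 * (0:9)   # seconds; drifts up by 6 s per day
)

# Same cleaning steps as above, written in base R
toy$date <- as.Date(toy$service_date)
toy$travel_min <- toy$mean / 60

# Slope of a linear fit = average change in minutes per day
fit <- lm(travel_min ~ as.numeric(date), data = toy)
slope_min_per_day <- unname(coef(fit)[2])
slope_min_per_day
```

A clearly positive slope over the month is a hint worth following up in the slow zone data; a slope near zero suggests day-to-day noise.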
3. Slow zones on a specific day
Speed restrictions (“slow zones”) are places where trains must go slower than normal, usually because of track issues. Unlike most endpoints, this one returns a list (not a data.frame) because its structure is more complex.
slow_raw <- tm_rt_query("Red")$speed_restrictions("2024-01-15")
if (!isTRUE(slow_raw$available)) {
message("No slow zone data available for this date.")
} else {
slow_df <- bind_rows(slow_raw$zones)
slow_df
}
ggplot(slow_df, aes(x = reorder(description, speedMph), y = speedMph, fill = speedMph)) +
geom_col() +
coord_flip() +
scale_fill_gradient(low = "#DA291C", high = "#f9c0c0") +
labs(
title = "Red Line Slow Zones — January 15, 2024",
subtitle = "Lower mph = more restricted",
x = NULL,
y = "Speed limit (mph)"
) +
theme_minimal() +
theme(legend.position = "none")
Note: If slow_raw$available is FALSE, there were no active slow zones that day. In that case slow_df is never created, so skip the plot. Try a different date or check during a period with known track work.
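Because the result is a list rather than a data.frame, it can help to wrap the conversion in a small function that returns NULL when there is nothing to plot. The sketch below is an assumption-laden illustration: zones_to_df is a hypothetical helper, and fake_raw is a made-up list that only mimics the fields used above (available, zones, description, speedMph). The real response may contain more fields.

```r
# Hypothetical helper: list of zones -> data.frame, or NULL if unavailable
zones_to_df <- function(raw) {
  if (!isTRUE(raw$available) || length(raw$zones) == 0) {
    return(NULL)
  }
  rows <- lapply(raw$zones, function(z) as.data.frame(z, stringsAsFactors = FALSE))
  do.call(rbind, rows)
}

# Made-up response with the same shape as the fields used in this section
fake_raw <- list(
  available = TRUE,
  zones = list(
    list(description = "Segment A", speedMph = 10),
    list(description = "Segment B", speedMph = 25)
  )
)

slow_df <- zones_to_df(fake_raw)
if (is.null(slow_df)) {
  message("Nothing to plot for this date.")
} else {
  # Most restricted zones first
  print(slow_df[order(slow_df$speedMph), ])
}
```

Checking for NULL before calling ggplot() avoids an "object not found" error on dates without data.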
4. Putting it all together
Here’s the pattern you’ll use over and over with transitmattr:
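tm_rt_query("Red")$       # pick a line
  stop("Harvard")$        # set a station
  date("2024-01-15")$     # set a date
  headways()              # fetch; returns a data.frame
Note that each $ sits at the end of a line: R treats a line that starts with $ as a syntax error. Once you’re comfortable with the pattern, try exploring other methods: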
- $line_delays(): delay minutes caused by alerts
- $trip_metrics("daily"): on-time performance per trip
- $aggregate_headways(): how often trains come, over time
- $service_hours("weekly"): scheduled vs. delivered service hours
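If the chaining style is unfamiliar, the toy below shows the general idea: each setter returns a new object that carries the settings so far, and a terminal method checks them before doing any work. This is a hypothetical illustration only. toy_query and its describe() method are invented here and say nothing about how transitmattr is actually implemented.

```r
# Hypothetical query builder: each setter returns an updated copy
toy_query <- function(line, stop_name = NULL, day = NULL) {
  list(
    stop = function(name) toy_query(line, name, day),
    date = function(d) toy_query(line, stop_name, as.Date(d)),
    # Terminal method: validates the settings, then "fetches"
    describe = function() {
      if (is.null(stop_name)) base::stop("requires a stop")
      paste(line, stop_name, format(day))
    }
  )
}

q <- toy_query("Red")$stop("Harvard")$date("2024-01-15")
q$describe()

# Skipping the stop triggers the validation error
res <- try(toy_query("Red")$describe(), silent = TRUE)
inherits(res, "try-error")
```

The same shape explains the error messages in the troubleshooting table: a terminal method complains when a setter it depends on was never called.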
Troubleshooting
| Problem | What to try |
|---|---|
| “requires a stop” error | Call $stop("Name") before the terminal method |
| “requires a date range” error | Call $date_range(start, end) before the terminal method |
| All values are NULL | The API may have no data for that date; try a nearby weekday |
| The API call fails | Run tm_healthcheck() to verify the API is up |
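When running several queries in a script, you may prefer that one failure does not stop everything. A minimal sketch with base R's tryCatch() follows. safe_fetch is a hypothetical wrapper, not part of the package, and the two example calls use stand-in functions instead of real API requests.

```r
# Hypothetical wrapper: run a fetch function, return NULL on error
safe_fetch <- function(fetch) {
  tryCatch(
    fetch(),
    error = function(e) {
      message("Fetch failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-ins for real queries: one succeeds, one raises an error
ok  <- safe_fetch(function() data.frame(x = 1))
bad <- safe_fetch(function() stop("requires a stop"))

is.data.frame(ok)
is.null(bad)
```

In real use, the function passed in would wrap a full query chain, and a NULL result would be the cue to work through the table above.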