Modeling Improvements
September 16, 2025
10:30 AM – 12:00 PM at Thomas H. Swain Room
This session focuses on modernizing travel demand modeling and transportation planning through innovative data integration, enhanced reproducibility, and advanced analytical tools. It highlights the challenges posed by changes in demographic data availability and presents solutions using crosswalk methodologies and big data analytics. The session also explores the evolution of travel behavior and its implications for activity-based model recalibration, along with the development of user-friendly utility platforms to democratize access to modeling results.
5 Sub-sessions:

Demographic inputs (e.g., household population, group quarters population, households, and median household income) are essential for travel demand modeling. Traditionally, these models rely on Traffic Analysis Zone (TAZ)-level demographic data, which differs from standard Census Bureau geographic units such as blocks, block groups, and census tracts.
With the Census Transportation Planning Products (CTPP) program discontinuing demographic data at the TAZ level, Census Block Groups will now serve as the smallest geography for demographic data. This shift presents significant challenges for established travel demand models, which depend on TAZ-level inputs. Transitioning to a Census Block Group-based system requires substantial time, effort, and cost to update existing models.
To address this, we propose a crosswalk methodology that reallocates demographic data from Census Block Groups to TAZs. Unlike traditional area-based allocation methods, which assume uniform population distribution, our approach uses geocoded address point data. This technique proportionally distributes demographics based on actual residential housing, group quarters, and business locations, ensuring greater accuracy in representing population distribution within each TAZ.
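The point-weighted reallocation described above can be sketched in a few lines. This is a minimal illustration under assumed inputs, not the authors' implementation: the `points` table (each geocoded address point tagged with its block group and the TAZ it falls in, from an upstream spatial join) and the `bg_demo` values are hypothetical.

```python
import pandas as pd

# Hypothetical address-point table: each geocoded point carries the block
# group it belongs to and the TAZ it falls inside (from a spatial join).
points = pd.DataFrame({
    "block_group": ["BG1", "BG1", "BG1", "BG2"],
    "taz":         ["T1",  "T1",  "T2",  "T2"],
})

# Illustrative block-group demographics to be reallocated to TAZs.
bg_demo = pd.DataFrame({
    "block_group": ["BG1", "BG2"],
    "households":  [300, 120],
})

# Share of each block group's address points that fall in each TAZ.
counts = points.groupby(["block_group", "taz"]).size().reset_index(name="n")
counts["share"] = counts["n"] / counts.groupby("block_group")["n"].transform("sum")

# Allocate demographics proportionally to the address-point shares.
alloc = counts.merge(bg_demo, on="block_group")
alloc["households"] = alloc["share"] * alloc["households"]
taz_demo = alloc.groupby("taz")["households"].sum()
print(taz_demo.round(1).to_dict())  # → {'T1': 200.0, 'T2': 220.0}
```

Unlike area-based allocation, the weights here come from where housing actually sits: two thirds of BG1's address points lie in T1, so T1 receives two thirds of BG1's households.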
Since demographics are a critical input for transportation models, an effective crosswalk mitigates disruptions from the CTPP change while maintaining model accuracy. This paper presents the methodology, demonstrating how Census Block Group data can be seamlessly integrated into TAZ-based travel demand models, ensuring continuity in planning and decision-making.
Reproducibility is fundamental to the scientific process, allowing researchers and practitioners to build upon existing work, validate findings, and accelerate innovation. In transportation research, the inability to easily reproduce prior studies hinders progress, wasting valuable time and resources. Researchers often struggle to replicate results due to missing code, incomplete data, inadequate documentation, or incompatible computational environments. Despite the recognized importance of reproducibility, a comprehensive, quantitative assessment of its state within the transportation research domain is lacking. This deficiency limits the ability to track progress, identify best practices, and implement effective interventions to improve reproducibility.
This research addresses this critical gap by developing and applying a scalable, automated methodology to measure the availability of features associated with reproducibility in transportation research publications. We define a set of key features that indicate the potential for reproducibility, including: code availability, data availability, completeness of setup instructions, presence of data dictionaries, use of interactive notebooks, level of code documentation, and informativeness of variable names.
A significant challenge in assessing these features is the heterogeneity of the software, programming languages, and data types used in transportation research, which spans many languages and specialized software packages with no standard way to encode this information. To overcome this, we leverage large language models (LLMs) to generate tailored scripts for feature extraction from published papers and diverse codebases. These LLM-generated scripts are carefully reviewed and validated by the research team to ensure accuracy.
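To make the feature-extraction idea concrete, here is a minimal hand-written sketch of one such check. The study's actual scripts are LLM-generated and human-reviewed; the patterns and feature names below are illustrative assumptions, not the paper's rubric.

```python
import re

# Illustrative patterns for two reproducibility features; real extraction
# scripts would be far richer and tailored per paper/codebase.
FEATURE_PATTERNS = {
    "code_available": re.compile(
        r"(github\.com|gitlab\.com|code is available)", re.IGNORECASE),
    "data_available": re.compile(
        r"(data (is|are) available|zenodo\.org|figshare)", re.IGNORECASE),
}

def extract_features(text: str) -> dict:
    """Flag which reproducibility features a paper's text mentions."""
    return {name: bool(pat.search(text)) for name, pat in FEATURE_PATTERNS.items()}

sample = "All code is available at github.com/example/repo; data are available online."
print(extract_features(sample))  # → {'code_available': True, 'data_available': True}
```

Running such checks over a corpus yields per-paper feature flags that can then be aggregated by journal, subfield, and year.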
Using this approach, we analyze a corpus of over 10,000 papers published between 2019 and 2024 in leading transportation journals, specifically Transportation Research Parts A-F and Interdisciplinary Perspectives. We will present a detailed analysis of code and data availability, the prevalence of all defined reproducibility features, temporal trends in reproducibility practices, and variations across different journals and research subfields within transportation.
This research directly addresses the MOMO 2025 "Connect" theme by bridging the gap between academic research and planning practice to improve the reproducibility of transportation research. The findings will empower journal editors, funding agencies, researchers, and practitioners to promote better reproducibility practices. The automated methodology developed in this work offers a scalable solution for ongoing monitoring of reproducibility, potentially enabling the integration of automated reproducibility metrics into journal submission systems. For instance, leveraging Elsevier's API, we demonstrate a practical pathway to extract these metrics directly from publications in the Transportation Research journal series. This initiative will foster a culture of transparency and collaboration within the transportation research community, ultimately leading to more robust, reliable, and impactful research that can better inform transportation planning and policy decisions. The open-source code and derived data will be made publicly available on GitHub, further connecting researchers and facilitating the adoption of best practices.
The presentation describes the recent update of the regional activity-based travel model (ABM) of the Maricopa Association of Governments (MAG) with the new Household Travel Survey (HTS) and other available sources of information. The MAG ABM was originally developed based on HTS 2008. It was subsequently updated based on HTS 2017, and now with HTS 2024. Having three detailed HTSs for three different years, and using each for ABM calibration, provided a unique basis for analyzing changes in travel behavior. It also led us to certain conclusions on how the entire process of periodic model recalibration can be better structured. The presentation has two focus points:
· Evolution of travel behavior. We compare detailed results from the three surveys across multiple travel dimensions, used as targets for model calibration while controlling for a wide set of socio-economic parameters. These include car ownership, commuting and telecommuting characteristics, daily activity pattern generation, tour structures, mode choice, time-of-day distributions, etc. Special emphasis is given to post-pandemic trends: more frequent telecommuting, transit-averse behavior, greater interest in active modes, traffic peak spreading, changes in trip length distributions, and others.
· Implications for ABM recalibration. The paper discusses how HTS and other sources of information (such as traffic counts, transit on-board survey, big data, and additional data sources) were integrated in the ABM update. It substantiates which sub-models require a complete re-estimation from scratch and which ones can be “boosted” by a partial recalibration while still preserving the main historical patterns (essentially combining data from different HTSs for model calibration). After a revision of each sub-model with the HTS data, the entire model system is validated using traffic counts and big data. At this point a decision is made on whether an additional adjustment of model parameters may be needed. These additional adjustments are less straightforward compared to sub-model calibration. The presentation provides practical templates for typical steps at this stage. They include systematic analysis of the discrepancies between traffic counts and/or big data, identification of the sub-models and specific parameters as the best candidates for adjustment, iterative update of parameters, etc.
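The iterative parameter adjustment described above can be sketched with a toy example. This is a minimal illustration under stated assumptions, not MAG's actual procedure: `simulate` stands in for a full sub-model run, and the update rule is the standard logit alternative-specific-constant adjustment toward an observed target share.

```python
import math

def calibrate_constant(target_share, simulate, c0=0.0, max_iter=20, tol=1e-4):
    """Iteratively shift an alternative-specific constant until the
    simulated share matches an observed target (classic logit update)."""
    c = c0
    for _ in range(max_iter):
        share = simulate(c)
        if abs(share - target_share) < tol:
            break
        # Shift the constant by the log of the target/simulated share ratio.
        c += math.log(target_share / share)
    return c

# Toy binary-logit "model" standing in for a full ABM sub-model run.
def simulate(c, v_other=0.5):
    return math.exp(c) / (math.exp(c) + math.exp(v_other))

c = calibrate_constant(0.30, simulate)
print(round(simulate(c), 3))  # converges close to the 0.30 target
```

In practice each outer iteration is far more expensive (a full model run against traffic counts and big data), which is why identifying the best candidate parameters before iterating matters.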
We believe this effort is typical of keeping any travel forecasting model up to date and will stimulate a useful discussion at the conference. Rather than a major update of the travel model every 5-10 years, we envision a systematic process of “boosting” on a significantly shorter time frame, utilizing the various available sources of information, including HTS, traffic counts, big data, etc.
Background
Several third-party big data analytic vendors, such as StreetLight, Replica, INRIX, and AirSage, provide detailed insights into corridor traffic characteristics, including origin-destination flow, network performance, trip characteristics, and turning movement counts. MPOs, DOTs, consultants, and general users of these data sources have created various applications to summarize and visualize this data. This presentation describes one such dashboard created in MS Excel with a Python backend for a corridor improvement project on Route 17 in New York. The dashboard is used by the agency to identify potential locations for congestion relief measures and to calibrate the traffic microsimulation model.
Description of Application
The NYSDOT Route 17 Mobility & Access Improvements Project in Orange and Sullivan Counties, New York, aims to address operational and safety deficiencies resulting from non-standard and non-conforming geometric design elements. The project seeks to achieve interstate standards and improve congestion-related travel times during peak periods. To support the project's needs and gain insights into seasonal variation and growth, big data analytics were employed to supplement field traffic data collected in May and June 2023. Three sets of travel metrics—travel times, traffic flow, and average trip speed through the corridor—from January 2022 to December 2024 were compiled from big data sources for the entire project corridor. Additionally, hourly travel time data for each Tuesday, Wednesday, Thursday, Friday, and Sunday throughout 2023 was compiled to perform cluster analysis and calibrate the microsimulation model in accordance with the FHWA 2019 update to the Traffic Analysis Toolbox Volume III guidelines.
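The cluster analysis above groups days with similar hourly travel-time profiles so that representative days can be chosen for microsimulation calibration. The following is a minimal k-means sketch on made-up data; the project's actual analysis, data, and cluster count are not shown here, and the guidance for the approach comes from the FHWA Traffic Analysis Toolbox Volume III update cited in the text.

```python
import numpy as np

def kmeans(profiles, k=2, iters=50):
    """Minimal k-means for clustering daily travel-time profiles
    (rows = days, columns = hourly mean travel times)."""
    # Deterministic initialization: centers spread across the input rows.
    centers = profiles[np.linspace(0, len(profiles) - 1, k).astype(int)]
    for _ in range(iters):
        # Assign each day to its nearest cluster center.
        d = np.linalg.norm(profiles[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean profile of its assigned days.
        centers = np.array([profiles[labels == j].mean(axis=0) for j in range(k)])
    return labels

# Toy data: four low-demand days and four days with a sharp PM peak.
low  = np.tile([10.0, 11.0, 10.0], (4, 1))
peak = np.tile([10.0, 25.0, 12.0], (4, 1))
days = np.vstack([low, peak])
labels = kmeans(days, k=2)
print(labels.tolist())  # → [0, 0, 0, 0, 1, 1, 1, 1]
```

Each resulting cluster can then be represented by one typical day in the microsimulation, rather than calibrating to a single average day that matches no real condition.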
A Python-based software tool processed these analyses and collated the results into a user-friendly spreadsheet dashboard. This tool allows users to specify origin, destination, day of the week, season, and year to obtain detailed information on hourly travel time, average speed, and total origin-destination (OD) traffic. It also generates charts of seasonal travel time and OD traffic flow, along with 15-minute and hourly average speed heatmaps showing corridor conditions at different times of the day by season. These results helped analyze traffic operations and identify existing deficiencies in accommodating seasonal and diurnal traffic variations to inform potential improvements. Efforts are underway to make the Python software open-source for broader use in similar projects.
Statement on Why Application is Noteworthy
This work presents a simple framework for summarizing and reporting traffic characteristics from big data sources. The software developed to create the dashboard uses simple YAML and CSV files as parameters to read big data outputs, making it easy to apply to different projects. The dashboard itself is accessible to a wide audience with varying technical expertise due to its basis in simple Excel functions. Although the dashboard was developed using StreetLight data, it can be easily adapted to read data from other vendors.
Project Status
The application is complete, and the primary project for which this application was developed is currently ongoing.
Abstract Background
The Boston Region Metropolitan Planning Organization (MPO) released its regional travel demand model, TDM23, to the general public in early 2024; it supports various research and project applications in the MPO. TDM23 has proven to be a robust travel demand model. However, generating ad hoc reports via Jupyter notebooks has required meticulous coding and fine-tuning, often demanding significant time to set up and maintain the data analytics and visualization environments rather than to focus on analytical insights. To address these inefficiencies, a dedicated utility platform was developed to automate output processing. This platform allows users to bypass cumbersome environment setup and concentrate on analyzing reports and map visualizations, whether for internal use or for sharing with external stakeholders.
Description of Abstract
This presentation introduces a utility platform designed to democratize travel demand modeling, expanding access for a broader range of users and improving the efficiency of data delivery to the end user. The platform offers an innovative, open environment through a user-friendly portal with various utilities tailored to model application and research needs. It bridges the gap between model developers, users, and stakeholders with a centralized access point, the utilities portal. Developers provide standardized code scripts, and the platform handles the technical complexities. Users can effortlessly navigate the portal, choose the tools they need, and perform analyses, with the platform delivering shareable results, empowering both novice and expert users to conduct meaningful analyses.
For instance, the Mode Shift tool enables the project analytical team to evaluate changes in transit mode share between zone groups and generates an interactive HTML report specific to the project study area. To evaluate simulated scenarios for constrained parking capacity at the Boston Park and Ride lots, a Park and Ride tool was developed as an interactive map application that rapidly assesses parking demand from transit-auto passengers by visually comparing assigned transit trips at each lot.
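The mode-shift calculation behind such a tool can be sketched as follows. This is a hypothetical illustration, not TDM23 code: the zone groups, trip tables, and the `transit_share` helper are all invented for the example.

```python
import pandas as pd

# Hypothetical trip tables for a base and a build scenario:
# trips by origin zone group, destination zone group, and mode.
base = pd.DataFrame({
    "o_group": ["A", "A", "B", "B"],
    "d_group": ["B", "B", "A", "A"],
    "mode":    ["transit", "auto", "transit", "auto"],
    "trips":   [200, 800, 100, 900],
})
build = base.copy()
build["trips"] = [300, 700, 150, 850]

def transit_share(df):
    """Transit share of trips for each origin-destination zone-group pair."""
    tot = df.groupby(["o_group", "d_group"])["trips"].sum()
    tr = df[df["mode"] == "transit"].groupby(["o_group", "d_group"])["trips"].sum()
    return (tr / tot).rename("transit_share")

# Change in transit share, build minus base, by zone-group pair.
shift = (transit_share(build) - transit_share(base)).rename("share_shift")
print(shift.to_dict())
```

A utility-platform wrapper would read the two scenario outputs, run this comparison for a user-selected study area, and render the result as the interactive HTML report described above.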
The platform's architecture stems from the four key phases of modeling: design, implementation, maintenance, and application. Real-world case studies will be presented to demonstrate the added value the platform brings to each of these stages, benefiting all stakeholders involved.
Statement on Why Abstract is Noteworthy
This platform is noteworthy for its ability to empower average modelers and users to easily explore scenario-based model results. By advancing the practice of Modeling and Simulation-as-a-Service, it fosters greater collaboration and knowledge sharing. The platform promotes open access to modeling tools and analysis, aligning with the principles of open-source software. While the TDM23 model itself may have licensing restrictions, the utilities developed on the platform could be shared more broadly.
Project status
The utility platform and its features are now stable; new visualization and analytic apps can be added to the existing infrastructure. The official release is targeted for March 2025, allowing time to smooth out remaining wrinkles in the platform and to add further features and utility apps. This project directly supports the ongoing public planning processes within the Boston Region MPO.