Civil and Environmental Engineering
Safeguarding data privacy on the road

A new model based on federated learning supports data collaboration between traffic authorities and mobility firms without compromising privacy.
Every movement on the road, from sudden braking to avoid a cut-in vehicle to rerouting in response to congestion, generates torrents of data. When aggregated and analysed at scale, these data reveal hidden patterns: how minor disruptions ripple into delays far away, how drivers change routes in response to unexpected slowdowns, and how near-miss incidents might signal future crash risks.
Assistant Professor Yang Kaidi led a team to develop a model based on federated learning that supports data collaboration between traffic authorities and mobility companies.
With the rapid development of AI, traffic authorities are increasingly tapping into these data to improve how they estimate and manage congestion. However, the most accurate view of what’s happening on the road often requires fusing data from multiple sources, be it public traffic sensors, private fleet routes or vehicle telemetry. Each source offers a different piece of the picture. The problem is that these pieces are rarely shared.
“Mobility companies worry about revealing their service areas or fleet behaviours. Government agencies have concerns too, and rightly so, like the risk of exposing infrastructure layouts or traffic-control strategies,” explains Assistant Professor Yang Kaidi from the Department of Civil and Environmental Engineering, College of Design and Engineering, National University of Singapore.
Asst Prof Yang led a team to develop a new framework that might just break this data deadlock. Detailed in their paper published in Transportation Research Part C: Emerging Technologies, the framework allows different transportation stakeholders to collaborate and make sense of their combined traffic information, all without divulging any sensitive data to one another.
The team’s technique works like a kind of secret handshake between systems. Instead of pooling raw data in one place, each party keeps its information to itself but participates in training a shared digital model. “Think of it as each party privately working on its part of a puzzle and only exchanging a few hints,” adds Asst Prof Yang. “The model then learns from these encrypted signals to estimate real-time traffic conditions, such as vehicle flow and road density, across a road network.”
This privacy-protecting technique is a departure from how such collaborations are usually done. Most current systems assume all the data can be handed over to a single party, an assumption that breaks down quickly in the real world. The researchers’ framework sidesteps this by borrowing concepts from machine learning and physics-based traffic modelling, and weaving them into a distributed system that plays well with others.
“Our work addresses a very practical issue. Different organisations hold different types of data, and they’re often unwilling, or unable, to share it directly. Our approach helps shatter these siloes,” says Asst Prof Yang.
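The "puzzle pieces and hints" idea can be illustrated with a toy sketch. This is not the team's published method: it is a minimal linear model written for illustration, it assumes a coordinator that holds the target values, and it leaves out the encryption the article mentions. The class name `Party`, the function `train`, and all the data are hypothetical. Each party holds one private feature for the same 50 road segments and shares only its partial prediction, never the feature values themselves.

```python
import random


class Party:
    """One data holder. Its raw features never leave this object."""

    def __init__(self, private_features):
        self._features = private_features
        self.weight = 0.0

    def partial_prediction(self):
        # The only values shared outward: this party's contribution
        # to the joint estimate for each road segment.
        return [self.weight * x for x in self._features]

    def local_update(self, residuals, lr):
        # Gradient of half the mean squared error w.r.t. this party's weight,
        # computed locally from the shared residuals and the private features.
        grad = sum(r * x for r, x in zip(residuals, self._features)) / len(residuals)
        self.weight -= lr * grad


def train(parties, targets, rounds=2000, lr=0.5):
    """Hypothetical coordinator loop: combine hints, send back residuals."""
    bias = 0.0
    for _ in range(rounds):
        partials = [p.partial_prediction() for p in parties]
        estimates = [bias + sum(parts) for parts in zip(*partials)]
        residuals = [e - t for e, t in zip(estimates, targets)]
        for p in parties:
            p.local_update(residuals, lr)
        bias -= lr * sum(residuals) / len(residuals)
    return bias


# Made-up data: one feature per party, and targets the coordinator holds.
rng = random.Random(0)
sensor_counts = [rng.random() for _ in range(50)]   # private to the authority
fleet_speeds = [rng.random() for _ in range(50)]    # private to the fleet operator
targets = [1.0 + 2.0 * s + 3.0 * f for s, f in zip(sensor_counts, fleet_speeds)]

sensor_party = Party(sensor_counts)
fleet_party = Party(fleet_speeds)
bias = train([sensor_party, fleet_party], targets)
print(round(sensor_party.weight, 3), round(fleet_party.weight, 3), round(bias, 3))
```

Because the made-up targets are an exact linear function of the two private features, the learned weights and bias should approach 2, 3 and 1. In a real deployment the exchanged values would also need protecting, since even partial predictions and residuals can leak information.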
Building trust while breaking barriers
How well does the team’s model fare in the real world, for example in one of Europe’s most congested capitals? To find out, the researchers tested their model using traffic data from a busy corridor in Athens, Greece. The study area combined two types of data: one from city-installed road sensors, and another from connected cars in private fleets. In real-world settings, this is akin to the public and private sectors each holding their own sliver of the full puzzle. The researchers demonstrated that even without either side sharing its raw data, the combined effort could outperform conventional approaches that require full data disclosure.
In fact, the more secure the exchange felt, the more willing each party became to share richer information, which led to even better results. “Protecting privacy doesn’t have to come at the cost of performance — it can actually improve it,” adds Asst Prof Yang.
The team also built a second version of their model designed for situations where ground-truth data (the kind you get from costly drone surveys or exhaustive traffic measurements) isn’t available. By blending in mathematical models of how traffic behaves, they were able to generate useful estimates even when fine-grained data were missing. This further expands the model’s potential in cities where data collection is limited or prohibitively expensive.
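To give a sense of what a mathematical model of traffic behaviour can contribute, here is a minimal sketch. It uses the classic Greenshields relation, in which speed falls linearly with density, as a stand-in. The article does not say which traffic model the team uses, so the choice of model, the function names and the parameter values (free-flow speed of 60 km/h, jam density of 150 vehicles per km) are all assumptions made for illustration.

```python
def greenshields_density(speed_kmh, free_flow_kmh=60.0, jam_density=150.0):
    """Infer density (veh/km) from observed speed via v = v_f * (1 - k / k_j)."""
    ratio = min(max(speed_kmh / free_flow_kmh, 0.0), 1.0)  # clamp to [0, 1]
    return jam_density * (1.0 - ratio)


def physics_residual(density, flow, free_flow_kmh=60.0, jam_density=150.0):
    """How far an estimated (density, flow) pair is from the assumed model.

    Flow is density times speed, so under Greenshields
    q = k * v_f * (1 - k / k_j). A residual of zero means the estimate
    is consistent with the model; a learner could penalise large values
    when no measured ground truth is available.
    """
    model_flow = density * free_flow_kmh * (1.0 - density / jam_density)
    return flow - model_flow


# Illustrative numbers only: a fleet reports an average speed of 30 km/h.
density = greenshields_density(30.0)   # vehicles per km
flow = density * 30.0                  # vehicles per hour
print(density, flow, physics_residual(density, flow))
```

With these assumed parameters, a speed of half the free-flow value gives a density of half the jam density, and the resulting density-flow pair has zero residual against the model.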
As Asst Prof Yang continues to build on this line of research, he envisions city governments working more closely with mobility firms to simulate road conditions, anticipate traffic bottlenecks, or even fine-tune traffic signal timings — all without crossing data-privacy lines.
“This research is one of the few pioneering efforts to address the cross-company privacy concerns among various transportation entities interested in collaborating and sharing heterogeneous datasets,” Asst Prof Yang adds. “We look forward to extending our privacy-preserving framework to applications like traffic forecasting and city-scale traffic control.”
The team is now developing algorithms that allow stakeholders to generate and share synthetic transportation data — realistic but artificial datasets that preserve privacy while retaining analytical value. To make this accessible to even non-experts, they are building a user-friendly platform powered by large language models that can translate prompts into tailored synthetic datasets.
The researchers are also examining how the public perceives location data privacy in transport systems. Through surveys, the team is studying how individuals weigh the trade-offs between protecting their private information and reaping the benefits of smarter mobility systems, and whether privacy-enhancing techniques can tilt the scales.
View Our Publications ▏Back to Forging New Frontiers - August 2025 Issue
If you would like to connect with us, email us at cdenews@nus.edu.sg