Data Privacy: For All the Cars
In “All the Parties,” a track on his new album titled “For All the Dogs,” Drake raps, “I bought the Rolls just to take it apart,” making reference to a custom Rolls-Royce Cullinan he owns. Now, while Drake is clearly flaunting his lavish lifestyle, the idea of “taking apart” a car sparked a curiosity in me to dissect the evolving landscape of transportation.
One growing trend in the mobility space is on-demand ride-hailing through services such as Uber, Lyft, and even Northwestern’s Safe Ride. A recent market report found that the “Global Ride Hailing Services market size was valued at USD 54.5 billion in 2022 and is expected to expand at a CAGR of 21.5% during the forecast period, reaching USD 175 billion by 2028.”
The convenience of requesting rides from anywhere, however, comes with a hidden cost: data collection.
What’s the issue?
Companies, researchers, and other groups collect, store, and analyze anonymized data that includes “location stamps” – the geographical coordinates and time stamps of users. This information is sourced from mobile phone records, credit card transactions, public transportation cards, Twitter profiles, and mobile applications. Combining these datasets offers the potential for valuable insights into human travel patterns, which can be used to optimize transportation systems and urban planning.
On the other hand, location stamps are specific to individuals and can be exploited for malicious purposes. According to an MIT News article by Rob Matheson, research shows that someone can identify an individual and extract sensitive information about them from just a few randomly selected points in a mobility dataset, and that this becomes easier when datasets are combined.
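To make the re-identification risk concrete, here is a minimal sketch that measures how often a handful of known location stamps matches exactly one person. The traces, cell labels, and function names are hypothetical and made up for illustration; they are not taken from the research the article describes.

```python
from itertools import combinations

# Hypothetical toy traces: user -> set of (cell_id, hour) location stamps.
TRACES = {
    "u1": {("A", 8), ("B", 9), ("C", 18)},
    "u2": {("A", 8), ("D", 9), ("C", 18)},
    "u3": {("A", 8), ("B", 9), ("E", 18)},
}

def matching_users(known_points, traces):
    """Return the users whose trace contains every known point."""
    known = set(known_points)
    return sorted(user for user, stamps in traces.items() if known <= stamps)

def fraction_unique(traces, n_points):
    """Share of n-point subsets (taken from each user's own trace)
    that match that user and nobody else."""
    total = unique = 0
    for user, stamps in traces.items():
        for subset in combinations(sorted(stamps), n_points):
            total += 1
            if matching_users(subset, traces) == [user]:
                unique += 1
    return unique / total if total else 0.0

for n in (1, 2, 3):
    print(n, round(fraction_unique(TRACES, n), 2))
```

In this toy data, the share of uniquely identifying subsets rises as the attacker learns more points, which is the pattern the article warns about.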
What are the solutions?
Masking location data so that users cannot be identified after a data leak, misuse, or breach enhances user privacy. The information lost in masking, however, can make the data less useful and location-based systems less efficient.
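One simple way to picture this trade-off is to coarsen coordinates onto a grid and measure how far the reported point moves from the true one. The sketch below assumes a hypothetical grid size of 0.01 degrees and an arbitrary example coordinate near Pisa. It illustrates masking in general, not the specific method used in any study.

```python
import math

def snap_to_grid(lat, lon, cell_deg=0.01):
    """Replace a coordinate with the centre of its grid cell (cell size is an assumption)."""
    snapped_lat = (math.floor(lat / cell_deg) + 0.5) * cell_deg
    snapped_lon = (math.floor(lon / cell_deg) + 0.5) * cell_deg
    return snapped_lat, snapped_lon

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

# Hypothetical pickup point; the displacement is the "information loss"
# that a dispatcher would have to absorb as extra walking, waiting, or driving.
true_lat, true_lon = 43.7228, 10.4017
masked_lat, masked_lon = snap_to_grid(true_lat, true_lon)
displacement_km = haversine_km(true_lat, true_lon, masked_lat, masked_lon)
print(f"masked point is {displacement_km:.2f} km from the true point")
```

A larger `cell_deg` hides the rider among more possible locations but increases the displacement, which is the privacy-versus-usefulness tension in miniature.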
What is the appropriate balance between ensuring data privacy and optimizing service performance when using a ride-hailing platform?
A paper that came out of MIT’s JTL Urban Mobility Lab in 2018 explored that delicate balance.
The study focused on transportation efficiency, measured by Vehicle Miles Traveled (VMT), and service quality (including waiting and riding times) in the context of daily home-to-work commuting by citizens in Pisa, Italy. The researchers chose this context because work commutes contribute significantly to traffic congestion and pollution and can reveal recurring route patterns and schedules.
The researchers covered three privacy-protecting techniques: obfuscation, k-anonymity, and cloaking.
What do these terms mean?
Study Findings
These findings demonstrate that better VMT outcomes can be achieved when users are willing to trade some convenience for privacy, mainly by accepting longer travel times rather than longer waiting times. For example, if users are willing to tolerate a detour time of at least 5 minutes, the increase in VMT due to privacy preservation is minimal, at less than 10%. This suggests that, with a modest compromise on convenience, privacy can be protected with only a minor effect on VMT.
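For readers who want to see how such a percentage is computed, the relative increase in VMT is the extra miles driven under privacy protection divided by the baseline miles. The figures in this sketch are hypothetical placeholders, not results from the study.

```python
def vmt_increase_pct(baseline_vmt, private_vmt):
    """Percentage increase in Vehicle Miles Traveled relative to the baseline."""
    if baseline_vmt <= 0:
        raise ValueError("baseline_vmt must be positive")
    return 100.0 * (private_vmt - baseline_vmt) / baseline_vmt

# Hypothetical example: 1,000 miles without privacy protection,
# 1,080 miles once pickup locations are masked.
increase = vmt_increase_pct(1000.0, 1080.0)
print(f"VMT increase: {increase:.1f}%")
```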
Among the privacy methods assessed, k-anonymity consistently outperforms obfuscation, while cloaking becomes the most effective approach when the spatial scope of k-anonymity widens.
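As a rough illustration of what a widening spatial scope means, the sketch below uses a common textbook reading of k-anonymity: every reported location cell must be shared by at least k users. It then enlarges the cells until that holds. The integer grid, the doubling strategy, and all names are assumptions for illustration, and the paper's own definitions may differ.

```python
from collections import Counter

def is_k_anonymous(cells, k):
    """True if every reported cell is shared by at least k records."""
    return all(count >= k for count in Counter(cells).values())

def coarsen(point, factor):
    """Map an integer grid point to a larger cell that is `factor` units wide."""
    x, y = point
    return (x // factor, y // factor)

def min_factor_for_k(points, k, max_factor=64):
    """Smallest power-of-two cell width at which the points are k-anonymous,
    or None if no width up to max_factor works."""
    factor = 1
    while factor <= max_factor:
        if is_k_anonymous([coarsen(p, factor) for p in points], k):
            return factor
        factor *= 2
    return None

# Hypothetical home locations on an integer grid.
homes = [(0, 0), (1, 0), (2, 3), (3, 3), (5, 5), (4, 5)]
print(min_factor_for_k(homes, k=2))
print(min_factor_for_k(homes, k=3))
```

A stricter k forces wider cells, so each reported location says less about where a rider actually is, and the routing system has less precise information to work with.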
The researchers suggest two directions for future study: hiding only the employee’s origin for commuting services, and exploring temporal privacy by concealing departure times. They also point to more advanced location privacy and anonymization methods, such as differential privacy.
Read More
- Privacy Principles for Mobility Data
- Steering Mobility Data to a Better Privacy Regime
- “The World of Geolocation Data” Infographic
- What Are the Top Data Anonymization Techniques?