research report

Sanitization of Transportation Data: Policy Implications and Gaps

Publication Date

November 1, 2021

Abstract

Mobility data provides information that helps improve city planning, identify traffic patterns, detect traffic jams, and route vehicles around them. This data often contains proprietary and personal information that companies and individuals do not wish others to know, for competitive and personal reasons. This sets up a paradox: the data needs to be analyzed, but it cannot be analyzed without revealing information that must be kept secret. A solution is to sanitize the data, i.e., to remove or suppress the sensitive information. The goal of sanitization is to protect sensitive information while enabling analyses of the sanitized data that produce the same results as analyses of the unsanitized data. However, protecting information requires that sanitized data cannot be linked to data from other sources in a manner that leads to desanitization, that is, recovery of the information the sanitization was meant to hide. This project reviews typical strategies used to sanitize datasets, the research showing how some of these strategies fail, and the questions that must be addressed to better understand the risks of desanitization.
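The linkage risk described in the abstract can be illustrated with a minimal sketch. It is not taken from the report: the trip records, the outside "directory" of home and work zones, and the function names `sanitize`, `trips_per_destination`, and `link` are all hypothetical, chosen only to show how suppressing a direct identifier can preserve an analysis while still leaving records open to linking.

```python
# Hypothetical trip records; every field name and value here is invented.
TRIPS = [
    {"rider": "alice", "origin": "Z1", "dest": "Z7", "hour": 8},
    {"rider": "bob", "origin": "Z2", "dest": "Z7", "hour": 8},
    {"rider": "carol", "origin": "Z2", "dest": "Z5", "hour": 9},
]

# Hypothetical outside source, e.g. a directory of home and work zones.
AUXILIARY = [
    {"name": "alice", "home": "Z1", "work": "Z7"},
    {"name": "bob", "home": "Z2", "work": "Z7"},
    {"name": "carol", "home": "Z2", "work": "Z5"},
    {"name": "dave", "home": "Z2", "work": "Z5"},
]


def sanitize(trips):
    """Suppress the direct identifier, keeping the fields analysts need."""
    return [{k: v for k, v in trip.items() if k != "rider"} for trip in trips]


def trips_per_destination(trips):
    """A simple analysis: count trips arriving in each zone."""
    counts = {}
    for trip in trips:
        counts[trip["dest"]] = counts.get(trip["dest"], 0) + 1
    return counts


def link(sanitized, auxiliary):
    """For each sanitized trip, list the outside names consistent with it."""
    return [
        [
            person["name"]
            for person in auxiliary
            if person["home"] == trip["origin"] and person["work"] == trip["dest"]
        ]
        for trip in sanitized
    ]


if __name__ == "__main__":
    clean = sanitize(TRIPS)
    # The analysis gives the same answer on raw and sanitized data.
    print(trips_per_destination(TRIPS) == trips_per_destination(clean))
    # Candidate identities per sanitized trip after linking.
    print(link(clean, AUXILIARY))
```

In this toy data the destination counts are identical before and after sanitization, yet the first two sanitized trips each match exactly one person in the outside source, so removing the name alone does not protect them. The third trip matches two people and remains ambiguous, which is the kind of property a stronger sanitization strategy would aim to guarantee for every record.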