Privacy of Groups in Dense Street Imagery

Jan 2024

Spatially and temporally dense street imagery (DSI) datasets have grown unbounded. In 2024, individual companies possessed around 3 trillion unique images of public streets. DSI data streams are only set to grow as companies like Lyft and Waymo use DSI to train autonomous vehicle algorithms and analyze collisions. Academic researchers leverage DSI to explore novel approaches to urban analysis. To address privacy vulnerabilities, DSI providers have made good-faith efforts to protect individual privacy by blurring sensitive information such as faces and license plates. In this work, however, we find that, given increased data density and advances in artificial intelligence, these measures fail to protect privacy at the level of group membership. We perform a penetration test to demonstrate the ease with which group membership inferences can be made from depictions of obfuscated pedestrians in 25,232,608 dashcam images taken in New York City. By synthesizing empirical findings and existing theoretical frameworks, we develop a typology of groups identifiable within DSI and then analyze the privacy implications of information flows pertaining to each group through the lens of contextual integrity. Finally, we discuss actionable recommendations for researchers working with data from DSI providers.

The Robotability Score

Jan 2024

The Robotability Score (R) is a novel metric that quantifies how suitable urban environments are for autonomous robot navigation. Through expert interviews and surveys, we've developed a standardized framework for evaluating urban landscapes to reduce uncertainty in robot deployment while respecting established mobility patterns. Streets with high Robotability are both more navigable for robots and less disruptive to pedestrians. We've constructed a proof-of-concept Robotability Score for New York City using a wealth of open datasets from NYC OpenData, and inferred pedestrian distributions from a dataset of 8 million dashcam images taken around the city in late 2023.

Disparities in police deployments with dashcam data

Jan 2022

Large-scale policing data is vital for detecting inequity in police behavior and policing algorithms. However, one important type of policing data remains largely unavailable within the United States: aggregated police deployment data capturing which neighborhoods have the heaviest police presences. Here we show that disparities in police deployment levels can be quantified by detecting police vehicles in dashcam images of public street scenes. Using a dataset of 24,803,854 dashcam images from rideshare drivers in New York City, we find that police vehicles can be detected with high accuracy (average precision 0.82, AUC 0.99) and identify 233,596 images which contain police vehicles. There is substantial inequality across neighborhoods in police vehicle deployment levels: the neighborhood with the highest deployment level has almost 20 times the level of the neighborhood with the lowest. Two strikingly different types of areas experience high police vehicle deployments: 1) dense, higher-income, commercial areas and 2) lower-income neighborhoods with higher proportions of Black and Hispanic residents. We discuss the implications of these disparities for policing equity and for algorithms trained on policing data.
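
As a rough illustration of how a neighborhood-level disparity like the one described above might be computed, the sketch below takes per-image detection results and derives the share of images containing a police vehicle in each neighborhood, then the ratio between the highest and lowest shares. The function names, the record layout, and the toy data are all hypothetical, and the calculation is a simplification; it is not the estimation procedure used in the study.

```python
from collections import defaultdict


def deployment_rates(records):
    """Return the share of images containing a police vehicle, per neighborhood.

    `records` is an iterable of (neighborhood, has_police_vehicle) pairs,
    one pair per image. This layout is an assumption made for illustration.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for neighborhood, has_police_vehicle in records:
        totals[neighborhood] += 1
        if has_police_vehicle:
            hits[neighborhood] += 1
    return {name: hits[name] / totals[name] for name in totals}


def disparity_ratio(rates):
    """Return the highest neighborhood rate divided by the lowest."""
    lowest = min(rates.values())
    if lowest == 0:
        raise ValueError("lowest rate is zero; the ratio is undefined")
    return max(rates.values()) / lowest


# Toy data (invented): neighborhood "A" has 3 of 4 images with a police
# vehicle, neighborhood "B" has 1 of 4.
records = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
rates = deployment_rates(records)
ratio = disparity_ratio(rates)
```

A raw share like this ignores detector errors and uneven image coverage across neighborhoods, so it should be read only as a starting point for the kind of comparison the abstract reports.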