Abstract
Urban digital twins (UDTs) integrate real-time data from heterogeneous sources to simulate and optimize city operations, but centralizing sensitive citizen and infrastructure data raises significant privacy concerns. Federated learning (FL) is a decentralized machine learning paradigm in which nodes train a shared model collaboratively while raw data remains on the local node, which limits data exposure. This article proposes an FL-based framework tailored to privacy-preserving UDTs that incorporates differential privacy and secure aggregation to mitigate inference attacks. We evaluate the framework on a simulated smart city dataset comprising traffic, energy, and noise pollution measurements from 50 distributed edge nodes. The FL approach achieves model accuracy within 2.3% of centralized training while reducing data exposure risk, measured by membership inference attack success rate, by over 90%. The framework also maintains robust performance under non-independent and identically distributed (non-IID) data and under the communication constraints typical of urban deployments. These findings indicate that FL can serve as a foundational privacy-preserving technology for UDTs, enabling scalable, collaborative intelligence without compromising data sovereignty. We discuss implications for smart city governance, regulatory compliance, and future integration with blockchain and 6G networks.
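To make the aggregation step summarized above concrete, the following is a minimal sketch of one federated averaging round with update clipping and Gaussian noise, a common way of combining FL with differential privacy. It is an illustration under stated assumptions, not the framework evaluated in this article: the function names, the `clip_norm` and `noise_multiplier` parameters, and the synthetic client updates are all hypothetical, and secure aggregation is omitted for brevity.

```python
import math
import random
from typing import List

Vector = List[float]


def clip_update(update: Vector, clip_norm: float) -> Vector:
    """Scale a client update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_fedavg_round(
    global_weights: Vector,
    client_updates: List[Vector],
    clip_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> Vector:
    """Apply one round: clip each update, average, add Gaussian noise.

    Assumption: noise is added centrally by the aggregator, with standard
    deviation noise_multiplier * clip_norm on the sum of clipped updates,
    which becomes noise_multiplier * clip_norm / n on the average.
    """
    if not client_updates:
        raise ValueError("at least one client update is required")
    n = len(client_updates)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    # Coordinate-wise mean of the clipped updates.
    mean = [sum(u[i] for u in clipped) / n for i in range(len(global_weights))]
    sigma = noise_multiplier * clip_norm / n
    noisy_mean = [m + rng.gauss(0.0, sigma) for m in mean]
    # The new global model is the old one plus the noisy average update.
    return [w + d for w, d in zip(global_weights, noisy_mean)]


if __name__ == "__main__":
    # Hypothetical setup: 50 edge nodes, each sending a 3-dimensional update.
    rng = random.Random(0)
    global_weights = [0.0, 0.0, 0.0]
    updates = [[rng.uniform(-1.0, 1.0) for _ in global_weights] for _ in range(50)]
    new_weights = dp_fedavg_round(
        global_weights, updates, clip_norm=1.0, noise_multiplier=1.1, rng=rng
    )
    print(new_weights)
```

Clipping bounds the influence any single node can have on the averaged update, which is what allows the added noise to be calibrated to `clip_norm`. In a deployment that also uses secure aggregation, the aggregator would see only the sum of the clipped updates rather than each individual update.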