As machine learning and deep learning spread across diverse aspects of our society, concerns about data privacy are growing stronger, particularly in scenarios where sensitive information could be exposed through various privacy attacks. In this dissertation, we focus on a representative Privacy-Preserving Deep Learning (PPDL) technique, Differential Privacy (DP), a strong notion of privacy that offers robust, quantifiable guarantees by injecting carefully calibrated random noise. However, DP, like other PPDL techniques, has largely relied on "one-size-fits-all" mechanisms that protect data uniformly. This uniformity creates an unavoidable trade-off problem: applying the same level of protection to all data often results in overprotection of non-sensitive information or, conversely, underprotection of specific, highly sensitive data points. This failure arises from ignoring the inherent contextual variability of sensitive data.
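To make the notion of calibrated noise concrete, the following minimal sketch shows the standard Laplace mechanism, where the noise scale is set by the query sensitivity and the privacy budget epsilon; the function name and example values are illustrative only and are not part of the methods proposed in this dissertation.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon):
    """Release a noisy value satisfying epsilon-DP via the Laplace mechanism."""
    scale = sensitivity / epsilon  # noise calibrated to sensitivity and privacy budget
    return value + np.random.laplace(loc=0.0, scale=scale)

# Example: privatize a count query with sensitivity 1 under epsilon = 0.5
noisy_count = laplace_mechanism(value=42, sensitivity=1.0, epsilon=0.5)
```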
This dissertation advances a data-centric perspective built on three principles of context-aware protection: (1) not all parts of a data sample are equally sensitive, so protection should focus on regions of interest (ROIs); (2) privacy preferences vary across data owners, requiring personalized mechanisms; and (3) data contains hierarchical levels of sensitivity, necessitating multi-level protection. Guided by these principles, we introduce three novel privacy-preserving mechanisms, each presented in a separate chapter:
DP Patch: ROI-based Approach of Privacy-Preserving Image Processing with Robust Classification. Existing differentially private image anonymization and generation methods treat the entire image as private, introducing perturbations across all pixels or their feature-space representations, which leads to poor image utility. We argue that, to minimize the privacy-utility trade-off, perturbations should be introduced only to the sensitive areas, which we define as regions of interest (ROIs). The proposed framework introduces a multi-stage privacy preservation methodology that performs a dual function: differentially private image denoising and ROI-based localization of sensitive content within an image. The identified areas are then protected by injecting DP noise in the form of patches. This process yields a privacy-preserving image of higher visual quality than a fully DP-perturbed image. Furthermore, a novel custom model is introduced to enrich the feature representation by utilizing both the newly generated privacy-preserving images and the original differentially private images, mitigating feature loss by excluding the noisy patch regions. The effectiveness of the proposed method is validated by assessing the quality of the generated privacy-preserving images and comparing the performance of the custom model against established models. Moreover, the method's robustness is evaluated against model inversion attacks.
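As a simplified illustration of the ROI-based patching step, the sketch below adds Laplace noise only inside given bounding boxes and leaves the rest of the image untouched; the ROI format, noise parameters, and function name are assumptions made for illustration rather than the exact pipeline of this chapter.

```python
import numpy as np

def apply_dp_patches(image, rois, epsilon, sensitivity=1.0):
    """Perturb only the given ROIs with Laplace noise, leaving other pixels intact.

    image: float array in [0, 1] of shape (H, W, C)
    rois:  list of (y0, y1, x0, x1) bounding boxes marking sensitive regions
    """
    protected = image.copy()
    scale = sensitivity / epsilon  # Laplace scale calibrated to the privacy budget
    for (y0, y1, x0, x1) in rois:
        patch = protected[y0:y1, x0:x1]
        noise = np.random.laplace(0.0, scale, size=patch.shape)
        protected[y0:y1, x0:x1] = np.clip(patch + noise, 0.0, 1.0)
    return protected
```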
A User-Centric Privacy Transformation Framework for Federated Learning. Privacy preservation is challenging, especially in multi-client environments such as Federated Learning (FL), where diverse clients have varying privacy needs and preferences. To address this, we propose a user-centric privacy-preserving framework that allows dynamic and customizable privacy adaptation. Unlike traditional approaches, our method enables each FL client to define a user-centric profile specifying sensitive information beyond standardized privacy constraints. This flexibility ensures that privacy measures are aligned with individual standards while maintaining data usability. To further enhance protection, we introduce adversary-aware transformations that protect sensitive attributes from both human and machine adversaries. We formulate this as an optimization problem that seeks an optimal privacy budget for defending against both adversaries. The proposed method is empirically evaluated to assess its impact on global model performance and its resistance to privacy attacks. Experimental results demonstrate that our approach effectively mitigates privacy risks while preserving model accuracy, ensuring an optimal trade-off between data confidentiality, compliance, and learning efficacy, which is crucial for real-world applications.
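One way to read the budget-selection step is as a search over candidate privacy budgets that balances worst-case adversary risk against utility loss; the sketch below is only a schematic of that idea, and the scoring functions, weighting, and names are placeholders rather than the dissertation's exact formulation.

```python
import numpy as np

def select_privacy_budget(candidate_epsilons, human_risk, machine_risk, utility, alpha=0.5):
    """Pick an epsilon trading off adversary success against utility loss.

    candidate_epsilons: epsilon values allowed by a client's user-centric profile
    human_risk, machine_risk, utility: callables mapping epsilon -> score in [0, 1]
    alpha: weight between privacy risk and utility loss
    """
    best_eps, best_cost = None, np.inf
    for eps in candidate_epsilons:
        risk = max(human_risk(eps), machine_risk(eps))      # worst-case adversary
        cost = alpha * risk + (1 - alpha) * (1 - utility(eps))
        if cost < best_cost:
            best_eps, best_cost = eps, cost
    return best_eps
```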
Multi-level Data Sensitivity-Aware Efficient Deep Learning Data Protection Method. The need for privacy preservation when handling different sensitivity levels within datasets has become critical. However, current privacy-preserving methods often treat data uniformly, overlooking hierarchical sensitivity structures. As a result, they fail to account for cross-correlations between features with different sensitivity levels, which can lead to unintended sensitive data exposure through indirect inference. To address this challenge, we propose a sensitivity-aware deep learning method for multi-level data protection that (1) disentangles correlations between sensitive and less sensitive features in the data pre-processing stage, (2) introduces a sensitivity-aware knowledge distillation technique that supports secure and utility-preserving knowledge transfer, and (3) enables customizable privacy controls based on clearance levels. The proposed approach is also adaptable to federated learning environments, ensuring scalability across decentralized settings. To the best of our knowledge, this is the first work that explores training deep learning models on sensitive and proprietary data requiring multi-level protection. Experimental results demonstrate that the proposed method effectively balances data utility and privacy by disentangling cross-sensitivity correlations with minimal performance loss. In federated settings, it maintains strong performance with lower computational overhead, highlighting its scalability and favorable privacy-utility trade-off.
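To give a rough sense of what a sensitivity-aware distillation objective could look like, the sketch below down-weights the distillation term for samples flagged as highly sensitive; the masking scheme, hyperparameters, and function name are illustrative assumptions and do not reproduce the chapter's exact method.

```python
import torch
import torch.nn.functional as F

def sensitivity_aware_distillation_loss(student_logits, teacher_logits, labels,
                                        sensitivity_mask, temperature=2.0, beta=0.5):
    """Combine a hard-label loss with a distillation term whose weight is reduced
    for samples flagged as highly sensitive (mask values close to 0)."""
    hard_loss = F.cross_entropy(student_logits, labels)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    soft_student = F.log_softmax(student_logits / temperature, dim=1)
    kd_per_sample = F.kl_div(soft_student, soft_teacher, reduction='none').sum(dim=1)
    kd_loss = (sensitivity_mask * kd_per_sample).mean() * (temperature ** 2)
    return beta * hard_loss + (1 - beta) * kd_loss
```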
This dissertation introduces three adaptive, data-centric mechanisms that advance Privacy-Preserving Deep Learning by overcoming the limitations of "one-size-fits-all" methods, ensuring strong utility while providing granular, context-aware protection across the spatial, client, and feature dimensions of sensitive data.