28 February 2022

Poisoning attacks in Federated Learning

Federated learning is a double-edged sword: it is designed to protect data privacy, yet it also opens a door for adversaries to exploit the system. One of the most popular attack vectors is the poisoning attack.

What is a poisoning attack?

A poisoning attack aims to degrade the quality of machine learning models. Poisoning attacks can be classified into two categories: data poisoning and model poisoning attacks.

Data poisoning attacks contaminate the training data to indirectly degrade the performance of machine learning models [1]. They can be broadly classified into two categories: (1) label flipping attacks, in which an attacker "flips" the labels of training data [2], and (2) backdoor attacks, in which an attacker injects new or manipulated training data, causing misclassification at inference time [3]. An attacker may perform untargeted (global) or targeted data poisoning attacks. Targeted attacks are harder to detect because they manipulate only a specific class and leave the data for all other classes intact.
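To make the targeted label-flipping idea concrete, here is a minimal sketch (the function name and parameters are illustrative, not from any particular library): every occurrence of one source class is relabeled as an attacker-chosen target class, while all other labels stay untouched.

```python
import numpy as np

def flip_labels(labels, source_class, target_class, flip_fraction=1.0, seed=0):
    """Targeted label-flipping sketch: relabel a fraction of `source_class`
    examples as `target_class`, leaving every other class intact."""
    rng = np.random.default_rng(seed)
    poisoned = labels.copy()
    idx = np.where(labels == source_class)[0]
    n_flip = int(len(idx) * flip_fraction)
    chosen = rng.choice(idx, size=n_flip, replace=False)
    poisoned[chosen] = target_class
    return poisoned

# Example: an attacker flips every "7" label to "1" in a toy label vector.
labels = np.array([7, 1, 7, 3, 7, 0])
poisoned = flip_labels(labels, source_class=7, target_class=1)
# poisoned == [1, 1, 1, 3, 1, 0]; labels for classes 0, 1, 3 are untouched
```

Because only one class is altered, aggregate accuracy metrics at the server can remain largely unchanged, which is exactly what makes targeted attacks hard to spot.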

On the other hand, a model poisoning attack aims to directly manipulate local models to compromise the accuracy of global models. Model poisoning attacks can be performed as untargeted or targeted attacks similar to data poisoning attacks. An untargeted attack aims to degrade the overall performance and achieve a denial of service. A targeted attack is a more sophisticated way to corrupt model updates for subtasks while maintaining high accuracy on global objectives [1].
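One well-known flavor of targeted model poisoning is model replacement, where a single attacker "boosts" its submitted model so that, after the server averages all client models, the global model lands near an attacker-chosen target. The sketch below assumes FedAvg-style averaging of model parameters and that honest clients' updates roughly cancel; all names are illustrative.

```python
import numpy as np

def boosted_malicious_model(global_weights, target_weights, num_clients):
    """Model-replacement sketch: scale the attacker's deviation from the
    global model by the number of clients so the averaged global model
    moves (approximately) all the way to `target_weights`."""
    return num_clients * (target_weights - global_weights) + global_weights

# Toy round: one attacker among four clients; honest clients return the
# global model unchanged (i.e., their updates cancel out).
G = np.zeros(3)                       # current global model
X = np.array([1.0, -2.0, 0.5])        # attacker's target model
n = 4
malicious = boosted_malicious_model(G, X, n)
submitted = [malicious] + [G.copy() for _ in range(n - 1)]
new_global = np.mean(submitted, axis=0)
# new_global ≈ X: the attacker replaced the global model in one round
```

Note that such a heavily scaled update has a much larger norm than honest updates, which is one signal the defenses below try to exploit.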

A federated learning system is inherently vulnerable to poisoning attacks because the central server cannot inspect the local data and models of individual participants. Targeted attacks make the problem worse: it is extremely hard for the server to identify a compromised model when that model maintains high accuracy on the global objectives.

How can we defend?

Approaches to defend against poisoning attacks can be classified into two categories: (1) Robust Aggregation and (2) Anomaly Detection.

A typical aggregation method in a federated learning system is averaging local models to obtain a global model. In FedSGD [4], each client computes gradients in each round of training, the gradients are sent to a central server, and the server averages them. For better efficiency, FedAvg [5] has each client compute gradients and update its model over multiple batches; the resulting model parameters are sent to the server, which averages them. Clearly, these average-based approaches are susceptible to poisoning attacks. Research has therefore focused on aggregating model parameters while minimizing the influence of poisoned updates, for example median aggregation [6], trimmed-mean aggregation [6], Krum aggregation [7], or adaptive averaging algorithms [1].
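The contrast between plain averaging and robust aggregation is easy to see in code. The sketch below (assuming each client's update is flattened into a vector) compares FedAvg's mean against coordinate-wise median and trimmed-mean aggregation [6] when one client submits a wildly poisoned update:

```python
import numpy as np

def fedavg(updates):
    """Plain FedAvg-style aggregation: coordinate-wise mean."""
    return np.mean(updates, axis=0)

def coordinate_median(updates):
    """Coordinate-wise median aggregation [6]."""
    return np.median(updates, axis=0)

def trimmed_mean(updates, trim_k):
    """Trimmed-mean aggregation [6]: per coordinate, drop the trim_k
    largest and trim_k smallest values, then average the rest."""
    sorted_updates = np.sort(updates, axis=0)
    return np.mean(sorted_updates[trim_k:len(updates) - trim_k], axis=0)

# Four honest clients near 1.0, one attacker submitting 100.0.
updates = np.array([[1.0], [1.1], [0.9], [1.0], [100.0]])
print(fedavg(updates))              # pulled to ~20.8 by the attacker
print(coordinate_median(updates))   # stays at 1.0
print(trimmed_mean(updates, 1))     # stays near 1.0
```

A single outlier drags the mean far from the honest consensus, while the median and trimmed mean remain close to it; this robustness to a bounded fraction of malicious clients is the core idea behind these aggregators.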

A more proactive way to defend against poisoning attacks is to filter malicious updates through anomaly detection, since the model updates of malicious clients are often distinguishable from those of honest clients. One family of defenses is clustering-based: the central server inspects incoming model updates, clusters them, and excludes suspicious clusters from aggregation [8]. Behavior-based defenses instead measure differences in model updates between clients and filter malicious updates from aggregation [8].
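As a simple illustration of distance-based filtering (a generic sketch in the spirit of these defenses, not a specific published algorithm), the server can measure how far each client's update lies from the coordinate-wise median of all updates, drop outliers by a robust z-score, and average only the remaining updates:

```python
import numpy as np

def filter_and_aggregate(updates, z_threshold=2.0):
    """Illustrative anomaly-detection defense: flag updates that are far
    from the coordinate-wise median of all updates (robust z-score via
    the median absolute deviation), then average only the kept updates."""
    updates = np.asarray(updates)
    center = np.median(updates, axis=0)
    dists = np.linalg.norm(updates - center, axis=1)
    mad = np.median(np.abs(dists - np.median(dists))) + 1e-12
    scores = (dists - np.median(dists)) / mad
    keep = scores < z_threshold
    return np.mean(updates[keep], axis=0), keep

# Four honest updates near [1, 1] and one poisoned update at [50, -50].
updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.0], [50.0, -50.0]]
aggregated, keep = filter_and_aggregate(updates)
# keep == [True, True, True, True, False]; aggregated ≈ [1.0, 1.0]
```

Real clustering-based defenses are more elaborate (e.g., clustering in a learned embedding of the updates), but the principle is the same: remove updates that deviate from the honest majority before they reach the aggregator.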


Federated learning has emerged only recently, and research on attacks against it is still at an early stage. To take full advantage of its promising potential, considerable research effort is still needed to provide robustness against poisoning attacks.
