NIST releases cybersecurity playbook for generative AI
The US National Institute of Standards and Technology (NIST) has released a report detailing the types of cyberattacks that could target artificial intelligence systems and possible defenses against them.
The agency considers the report essential because current defenses against attacks on AI systems are lacking, at a time when artificial intelligence increasingly pervades every aspect of life and business.
The report, titled “Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations,” begins by developing a taxonomy and terminology of adversarial machine learning, which in turn should help secure AI systems by giving developers a uniform basis from which to build defenses.
The report covers two main types of AI: predictive and generative. These systems are trained on large amounts of data, which attackers can attempt to corrupt. This is a realistic threat, since these data sets are far too large for humans to monitor and filter.
NIST wants the report to help developers understand the types of attacks they can expect and approaches to mitigate them, while acknowledging that there is no foolproof way to defeat attackers.
Four main types of attacks against AI systems
- Evasion attacks: Occur after an AI system is deployed, when an attacker modifies an input to change the system's response. Examples include adding markings to road signs so that an autonomous vehicle misreads them.
- Poisoning attacks: Occur during the training phase through the introduction of corrupted data, for example slipping numerous instances of inappropriate language into conversation records so that a chatbot learns to treat them as common parlance (a minimal sketch of this idea appears after this list).
- Privacy attacks: Occur during deployment and attempt to extract sensitive information about the AI or the data it was trained on in order to misuse it. An attacker could ask a chatbot a series of legitimate questions and use the answers to reverse engineer the model and find its weak spots.
- Abuse attacks: Involve inserting false information into a source from which the AI learns. Unlike poisoning attacks, abuse attacks feed the AI incorrect information from a legitimate but compromised source in order to repurpose the system's intended use.
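To make the poisoning scenario concrete, the following sketch flips the labels of a small number of training samples in a toy classification task and retrains the model, in the spirit of the "few dozen corrupted samples" attack described in the report. The dataset, model, and numbers are illustrative assumptions, not part of the NIST report.

```python
# Hypothetical sketch of a label-flipping poisoning attack on a toy classifier.
# Assumes scikit-learn is available; the dataset and model choices are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Build a small synthetic dataset and hold out a clean test set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

def accuracy_after_poisoning(n_poisoned):
    """Flip the labels of n_poisoned training samples, retrain, and report test accuracy."""
    y_poisoned = y_train.copy()
    idx = rng.choice(len(y_poisoned), size=n_poisoned, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]          # corrupt a handful of labels
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    return model.score(X_test, y_test)

for n in (0, 30, 150, 600):
    print(f"{n:>4} poisoned samples -> test accuracy {accuracy_after_poisoning(n):.3f}")
```

In practice an attacker would choose which samples to corrupt carefully; the random flips here simply illustrate the mechanism of tampering with training data.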
Each of these attack types can be further classified according to criteria such as the attacker's goals, capabilities, and knowledge.
“Most of these attacks are fairly easy to mount and require minimum knowledge of the AI system and limited adversarial capabilities,” said Alina Oprea, a co-author of the report and professor at Northeastern University. “Poisoning attacks, for example, can be mounted by controlling a few dozen training samples, which would be a very small percentage of the entire training set.”
Recommended defenses include augmenting the training data with correctly labelled adversarial examples, monitoring standard performance metrics of ML models to detect significant degradation in classifier accuracy, and applying data-sanitization techniques, among other measures. The sketch below illustrates the first of these.
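As an illustration, the following sketch generates adversarial versions of the training points with a simple fast-gradient-sign-style step against a linear model, keeps their correct labels, and retrains on the augmented data. The model, perturbation size, and dataset are illustrative assumptions and not drawn from the NIST report.

```python
# Hedged sketch of one listed defense: augmenting training data with correctly
# labelled adversarial examples (generated here with an FGSM-style step against
# a logistic-regression model). Dataset, epsilon, and model choice are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

def fgsm(model, X, y, eps=0.5):
    """Fast-gradient-sign-style perturbation for logistic regression:
    the gradient of the log-loss w.r.t. the input is (p - y) * w."""
    w, b = model.coef_[0], model.intercept_[0]
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))        # predicted probability of class 1
    grad = (p - y)[:, None] * w[None, :]          # d(loss)/d(input)
    return X + eps * np.sign(grad)

# 1. Train an initial model on clean data.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 2. Generate adversarial versions of the training points, keeping the correct labels.
X_adv = fgsm(model, X_train, y_train)

# 3. Retrain on clean + adversarial data (adversarial training).
X_aug = np.vstack([X_train, X_adv])
y_aug = np.concatenate([y_train, y_train])
robust_model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)

# Compare both models on adversarially perturbed test inputs.
X_test_adv = fgsm(model, X_test, y_test)
print("baseline on perturbed test set:", model.score(X_test_adv, y_test))
print("robust   on perturbed test set:", robust_model.score(X_test_adv, y_test))
```

The key detail matching the report's recommendation is that the adversarial examples keep their correct labels, so the retrained model learns to resist the perturbation rather than absorb it.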