Artificial intelligence data poisoning, particularly the backdoor attack, is anticipated to be among the most challenging threats to counter in the realm of cybersecurity in the coming years. This insight was shared by Kim Hyoung-shick, a cybersecurity specialist and professor at Sungkyunkwan University, in an interview with The Readable on July 25. Kim’s recent paper, titled “Poisoned ChatGPT Finds Work for Idle Hands: Exploring Developers’ Coding Practices with Insecure Suggestions from Poisoned AI Models,” was presented at the IEEE Symposium on Security and Privacy 2024 in May.
AI data poisoning
AI data poisoning is a type of cyberattack that targets AI models, manipulating them to misclassify objects or recommend malicious items. According to Kim, even large-scale models like ChatGPT and Microsoft Copilot are vulnerable to such attacks. This is because most models rely on open-source data that can be uploaded by anyone on the internet. Multiple research papers have demonstrated the feasibility of this type of attack.
AI data poisoning can manifest in several ways, including data injection, data manipulation, and backdoor attacks. Data injection involves inserting invalid data into the AI model or dataset, while data manipulation is achieved by modifying or deleting valid data. Backdoor attacks combine elements of both data injection and manipulation, but they also involve a “trigger” that prompts a specific response. In an experiment conducted by Kim’s laboratory, for example, a model could be poisoned to fail to recognize individuals wearing red hats, with the red hat serving as the trigger to mislead the model.
Filtering or removing samples affected by backdoor attacks is challenging because the model behaves normally until the trigger is activated. Kim noted that tracking all datasets the model has been trained on is very difficult, and even then, detecting whether a dataset contains a backdoor is not straightforward. To recognize and remove backdoor samples, models would need to be trained on such samples, complicating the detection and mitigation process.
The impacts of AI data poisoning
AI data poisoning could have severe consequences, such as deceiving self-driving cars, concealing individuals in CCTV footage, or recommending malicious code. For instance, if code completion or code generation tools are compromised and trained on data containing vulnerable code, these models might produce unsafe code. Kim’s paper included an experiment where cybersecurity experts were tested to see if they would approve of faulty code generated by poisoned models. The results revealed that 70–100% of the experts who used these compromised code-generating models submitted the suggested vulnerable code.
Suggested countermeasures
Despite the known potential impacts of AI data poisoning, there have been no reported instances of it being used with malicious intent thus far. According to the interviewee, this may be due to the absence of a developed revenue model for such attacks. Consequently, the cybersecurity industry needs to continue working on solutions to address this emerging threat.
The most crucial countermeasure is for software developers to thoroughly check code for vulnerabilities, particularly when it is generated by AI. The study found that individuals who referred to the internet for coding produced more secure code compared to those who used generative AI. As a solution, the professional suggested that AI models should generate pseudocode, with developers responsible for writing the actual code.
Another solution is to alert engineers about AI models that utilize open-source data. Developers should be aware of the potential for these models to be targeted by attacks, which could result in faulty code. Additionally, AI models could be designed to highlight parts of the code that require extra attention for security.
Related article: AI security draws unprecedented attention, pulling top brains together in three-day workshop
Hongcheon, Gangwon―Released in 1966, the movie The Good, the Bad and the Ugly depicts an uneasy alliance and struggle among three men on a quest for hidden treasure. Echoing the film’s themes, a group of experts in artificial intelligence security recently held a workshop titled “The Good, The Bad, and The Ugly of AI Security.” This theme reflects the current uneasy relationship between humans and AI, particularly heightened by security concerns.
Kwon Taekyoung, a professor of information security and AI at Yonsei University, has been leading the AI Security Research Group since the beginning of this year. The group, established in 2019, operates under the Korea Institute of Information Security & Cryptology (KIISC), South Korea’s leading academic organization in cybersecurity. It organizes annual events, including the AI Security Workshop.
“As a scholar, I resonate with Andrew Ng’s assertion that ‘AI is the new electricity,’” Kwon said during his opening speech at the 2024 AI Security Workshop, referencing Ng’s statement from early 2017. Kwon emphasized that, like electricity, AI will become integral to every aspect of our lives. However, he also noted that improper use of AI poses significant risks. “Our research group is committed to enhancing AI security so that people can use AI safely and conveniently, just as the pioneers of electricity security worked to ensure safe use of electrical technology,” he added. READ MORE