Forming Adversarial Example Attacks Against Deep Neural Networks With Reinforcement Learning

Technical Report | Accession Number: AD1212876

Abstract:

Deep neural networks (DNNs) are producing groundbreaking results in virtually all academic and commercial domains and will serve as the workhorse of future human-machine teams that will modernize the Department of Defense (DOD). As such, leaders will need to trust and rely on these networks, which makes their security a paramount concern. Considerable research has demonstrated that DNNs remain vulnerable to adversarial examples. While many defense schemes have been proposed to counter the equally many attack vectors, none have succeeded in securing a DNN against this vulnerability. Novel attacks expose blind spots unique to a network's defense, indicating the need for a robust and adaptable attack that can expose these vulnerabilities early in the development phase. We propose a novel reinforcement learning-based attack, the Adversarial Reinforcement Learning Agent (ARLA), designed to learn the vulnerabilities of a DNN and generate adversarial examples to exploit them. ARLA significantly degraded the accuracy of five CIFAR-10 DNNs, four of which used a state-of-the-art defense. We compared our method to other state-of-the-art attacks and found evidence that ARLA is an adaptive attack, making it a useful tool for testing the reliability of DNNs before they are deployed within the DOD.
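To make the idea of a reinforcement learning-based adversarial example attack concrete, the following is a minimal, hypothetical sketch (not the authors' ARLA implementation): a bandit-style agent chooses which pixel of a CIFAR-10-sized image to perturb and is rewarded when the victim classifier's confidence in the true class drops. The `victim_predict` function below is an assumed stand-in for a real DNN's softmax output.

```python
import numpy as np

rng = np.random.default_rng(0)

def victim_predict(image):
    # Placeholder for the victim DNN: returns a 10-way softmax vector.
    logits = image.reshape(-1)[:10] * 5.0
    e = np.exp(logits - logits.max())
    return e / e.sum()

def attack(image, true_label, episodes=200, eps=8 / 255):
    # Action-value estimates over pixel locations of a flattened 32x32x3 image.
    q = np.zeros(image.size)
    counts = np.zeros(image.size)
    adv = image.copy().reshape(-1)
    base_conf = victim_predict(image)[true_label]

    for _ in range(episodes):
        # Epsilon-greedy selection of which pixel to perturb.
        if rng.random() < 0.1:
            a = int(rng.integers(image.size))
        else:
            a = int(np.argmax(q))
        # Apply a bounded perturbation to the chosen pixel.
        candidate = adv.copy()
        candidate[a] = np.clip(candidate[a] + rng.choice([-eps, eps]), 0.0, 1.0)
        conf = victim_predict(candidate.reshape(image.shape))[true_label]
        reward = base_conf - conf  # reward = drop in true-class confidence
        counts[a] += 1
        q[a] += (reward - q[a]) / counts[a]
        if reward > 0:
            # Keep the perturbation only if it actually hurt the classifier.
            adv = candidate
            base_conf = conf
    return adv.reshape(image.shape)

x = rng.random((32, 32, 3))
x_adv = attack(x, true_label=3)
print("true-class confidence:", victim_predict(x)[3], "->", victim_predict(x_adv)[3])
```

This is only an illustration of the attack loop's structure (state, perturbation action, confidence-drop reward); the report's actual agent design, reward function, and perturbation budget are described in the full document.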

Security Markings


Distribution Code:
A - Approved For Public Release
Distribution Statement: Public Release
Copyright: Not Copyrighted

RECORD

Collection: TRECMS
Subject Terms