Adversarial Attacks on Deep Learning
A demonstration of adversarial attacks on CNN-based image classification models
This project uses automatic differentiation to explain model classifications and to create adversarial examples. I first explored how gradients can be used to show which parts of the input the model relied on for its classification, then implemented Grad-CAM from scratch.
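The gradient-based explanation can be illustrated with a minimal sketch. It is not the project's code: the tiny `toy_model`, the random input, and the function name `saliency_map` are assumptions made so the example runs without downloading weights. The idea is to backpropagate the score of one class to the input pixels and treat the gradient magnitude as a rough importance map.

```python
import torch
import torch.nn as nn


def saliency_map(model, image, target_class):
    """Return |d score / d pixel| for one image of shape (C, H, W)."""
    model.eval()
    # Track gradients with respect to the input itself, not only the weights.
    image = image.clone().detach().requires_grad_(True)
    logits = model(image.unsqueeze(0))        # add a batch dimension
    logits[0, target_class].backward()        # gradient of the chosen class score
    # Collapse the colour channels by taking the largest absolute gradient.
    return image.grad.abs().max(dim=0).values


# Hypothetical stand-in for the real classifier (the project uses ResNet18).
torch.manual_seed(0)
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
image = torch.rand(3, 32, 32)                 # random "image" with pixels in [0, 1)
sal = saliency_map(toy_model, image, target_class=3)
print(sal.shape)                              # one importance value per pixel
```

Pixels with larger values in `sal` are those where a small change would move the class score the most.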
I then explored two basic adversarial methods that perturb the input images so that ResNet18 predicts a different class.
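The description does not name the two adversarial methods. As an illustration only, the sketch below shows the fast gradient sign method (FGSM), one common basic attack, and it may or may not be one of the methods used in the project. The `toy_model`, the step size `epsilon`, and the assumption that pixels lie in [0, 1] are all hypothetical choices made to keep the example self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm_attack(model, images, labels, epsilon=0.03):
    """One-step FGSM: move each pixel by epsilon in the direction that raises the loss."""
    model.eval()
    x = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), labels)
    # Gradient of the loss with respect to the input pixels.
    (grad,) = torch.autograd.grad(loss, x)
    x_adv = x + epsilon * grad.sign()
    # Assumption: valid pixel values are in [0, 1].
    return x_adv.clamp(0.0, 1.0).detach()


# Hypothetical stand-in for the real classifier (the project uses ResNet18).
torch.manual_seed(0)
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
images = torch.rand(1, 3, 32, 32)             # batch of one random image
labels = torch.tensor([3])                    # its assumed true class
epsilon = 0.03
adv_images = fgsm_attack(toy_model, images, labels, epsilon)
print((adv_images - images).abs().max())      # perturbation is bounded by epsilon
```

Because every pixel changes by at most `epsilon`, the perturbed image can look nearly identical to the original while still shifting the model's prediction.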
Key Features:
- Used gradients to highlight important image regions.
- Implemented Grad-CAM from scratch.
- Implemented adversarial methods to create misleading inputs.
- Analyzed model behavior and robustness.
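For the Grad-CAM item above, a minimal sketch of the technique follows. It is an assumption-laden illustration rather than the project's implementation: `ToyCNN`, the choice of `model.features` as the target layer, and the function name `grad_cam` are hypothetical. Grad-CAM weights each feature map of a convolutional layer by the spatially averaged gradient of the class score, sums the weighted maps, and applies a ReLU.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyCNN(nn.Module):
    """Hypothetical small classifier standing in for ResNet18."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        return self.fc(self.pool(self.features(x)).flatten(1))


def grad_cam(model, target_layer, image, target_class):
    """Return a heatmap in [0, 1] with the same height and width as the image."""
    model.eval()
    activations = []
    # A forward hook captures the feature maps produced by the target layer.
    handle = target_layer.register_forward_hook(
        lambda module, inputs, output: activations.append(output)
    )
    try:
        logits = model(image.unsqueeze(0))
    finally:
        handle.remove()
    acts = activations[0]                                   # (1, C, h, w)
    # Gradient of the class score with respect to the feature maps.
    (grads,) = torch.autograd.grad(logits[0, target_class], acts)
    weights = grads.mean(dim=(2, 3), keepdim=True)          # one weight per channel
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True)) # (1, 1, h, w)
    # Upsample the coarse map to the input resolution.
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    cam = cam[0, 0].detach()
    return cam / (cam.max() + 1e-8)                         # normalise to [0, 1]


torch.manual_seed(0)
model = ToyCNN()
image = torch.rand(3, 32, 32)
cam = grad_cam(model, model.features, image, target_class=3)
print(cam.shape)
```

With torchvision's ResNet18, the last convolutional stage (`layer4`) would be a natural choice for `target_layer`, though the layer actually used in the project is not stated.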
Tools and Technologies:
- Python
- PyTorch
Source Code
The complete source code for this project is available on GitHub.