adversarial-interpretability/adversarial-interpretability

Adversarial Examples for Neural Network Interpretability

Large-scale results of attack methods against four well-known feature-attribution methods

Examples of targeted attacks that produce semantically meaningful changes in feature importance

Attack examples on Deep Taylor Decomposition

