Tip
This project complements the Protect AI Model project, which can be found in the repo. To understand the whole idea, make sure to read the documentation of both.
Welcome to the first edition of the official documentation for the ATTACK AI MODEL SIMULATION tool, attackai. I am Dr. Deniz Dahman, the creator of the BireyselValue algorithm and the author of this package. The following section gives a brief introduction to the principal idea behind the attackai tool, together with a reference to the academic publication on this type of cyberattack and a potential defence. Before going ahead, I would like to let you know that I have done this work as an independent scientist, without any funding or similar support. I am dedicated to continuing this work and seeking further improvement of the proposed method at all costs. If you wish to contribute to this work in any way, please find further details in the contributing section.
If you wish to support the creator and author of this project in any way possible, thank you. You can:
- view options to subscribe on Dahman's Phi Services Website
- subscribe to this channel Dahman's Phi Services
- you can support on patreon
If you prefer any other way of contribution, please feel free to contact me directly on contact.
Thank you
The current AI revolution has made AI almost the main element of every solution we use daily. Many industries rely heavily on AI models to generate responses. In fact, it has become a trend that a product that uses AI as its backend has a substantially higher potential to penetrate the marketplace than one that does not.
This trend has pushed many industries to consider implementing AI models in the tiers of their business processes. The rush is understandable: these industries believe that, to secure a place in today’s competitive market, a business must catch up with the most recent advances in technology. However, one must ask: what is this AI problem-solving paradigm anyway?
In my published project, the Big Bang of Data Science, I provide a comprehensive answer to this question from both an abstract and a concrete perspective. Let me summarize both perspectives in a few lines.
Basically, an AI model is a mathematical tool, so to speak. It relies mainly on one important stage, the training stage. As a metaphor, imagine a human brain that learns over time from the surroundings and circumstances in which it lives: cultures, beliefs, people, and so on. Once this brain is shaped and formed, it starts to make decisions and offer answers. We, from the outside, then judge those decisions and answers, and the brain reacts to those judgements.
AI models mimic this paradigm. The human brain is the mathematical equation of the model; the surroundings and circumstances are the training samples that we feed to the equation so that it learns; and the judgements by the surroundings are the calculation of the misclassified cases, which is done by obtaining the derivatives. The aim, then, is a model that gives accurate answers with a minimum level of mistakes.
Once the technical workflow of AI is understood, it should be clear that the training samples from which the AI model learns are the most important element of the entire flow. This element can be thought of as the adjudicator of whether the model succeeds or fails. For that reason, it is a target for adversaries who aim to make the model fail. A successful attack of this kind is known as a data poisoning attack.
Data poisoning is a type of cyberattack in which an adversary intentionally compromises a training dataset used by an AI or machine learning (ML) model to influence or manipulate the operation of that model.
This type of attack can be carried out in several ways:
- Intentionally injecting false or misleading information within the training dataset,
- Modifying the existing dataset,
- Deleting a portion of the dataset.
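The three routes listed above can be sketched on a toy dataset. This is a minimal illustration only, not part of attackai: the function names `inject`, `modify`, and `delete` are hypothetical, and the dataset is a plain list of `(features, label)` pairs with binary labels.

```python
import random

def inject(dataset, fake_samples):
    """Return a copy of the dataset with attacker-crafted samples appended."""
    return dataset + list(fake_samples)

def modify(dataset, fraction, rng):
    """Return a copy in which a fraction of the binary labels is flipped."""
    poisoned = list(dataset)
    k = int(len(poisoned) * fraction)
    for i in rng.sample(range(len(poisoned)), k):
        features, label = poisoned[i]
        poisoned[i] = (features, 1 - label)
    return poisoned

def delete(dataset, fraction, rng):
    """Return a copy with a fraction of the samples removed."""
    k = int(len(dataset) * fraction)
    drop = set(rng.sample(range(len(dataset)), k))
    return [s for i, s in enumerate(dataset) if i not in drop]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [((float(i),), i % 2) for i in range(10)]

injected = inject(clean, [((99.0,), 0)])
modified = modify(clean, 0.3, rng)
deleted = delete(clean, 0.2, rng)

flipped = sum(a != b for a, b in zip(clean, modified))
print(len(injected), flipped, len(deleted))  # 11 3 8
```

Each function returns a new list and leaves the clean data untouched, which makes it easy to compare a model trained on the clean set with one trained on a poisoned copy.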
Unfortunately, such a cyberattack can go undetected for a long time, owing first to the nature of the AI framework itself, and further to the lack of fundamental understanding of the AI black box and to the use of ready-to-use AI models by industry practitioners who lack a comprehensive understanding of the mathematics behind the framework.
However, there are some signs that might lead to the observation that the AI model is compromised. Some of those signs are:
- Model degradation: Has the performance of the model inexplicably worsened over time?
- Unintended outputs: Does the model behave unexpectedly and produce unintended results that cannot be explained by the training team?
- Increase in false positives/negatives: Has the accuracy of the model inexplicably changed over time? Has the user community noticed a sudden spike in problematic or incorrect decisions?
- Biased results: Does the model return results that skew toward a certain direction or demographic (indicating the possibility of bias introduction)?
- Breaches or other security events: Has the organization experienced an attack or security event that could indicate they are an active target and/or that could have created a pathway for adversaries to access and manipulate training data?
- Unusual employee activity: Does an employee show an unusual interest in understanding the intricacies of the training data and/or the security measures employed to protect it?
The following gif illustrates this kind of data poisoning attack on an AI model. It shows how the alphas, or weights, are influenced by the new training samples that the model uses to update itself.
For this reason, a company's AI division must consider such matters once it decides to employ the AI problem-solving paradigm.
To understand the consequences of such an attack, it helps to have a tool that can simulate the attack itself. This is where I introduce the attackai tool. The tool illustrates two types of attack:
- corrupt data sample attack: in this type of attack, the attacker manages to corrupt the data samples of the AI model during the continuous learning stage. The expected workflow of an AI model is that, after the first model is built, it continues learning from new samples, and as a result the alphas, also known as the weights, of the model are updated accordingly. Thus, if a new batch of those samples is corrupted in any way, the updated AI model will adjust its weights based on poisoned samples. This is where symptoms of degraded AI results can be observed.
- crazy the model: this second type of attack is far more dangerous than the first. Let me give an example: if I teach the model that a certain image of a dog has the label dog and another image of a cat has the label cat, then the model constructs its internal weights based on this information. If someone manages to swap those labels, classifying a dog as a cat and a cat as a dog, then this is where the model could go crazy.
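To make the effect of swapped labels concrete, here is a minimal sketch using a one-feature logistic model trained by plain gradient descent. It is not the neural network used in the simulation, and the names `train` and `accuracy` are hypothetical. The model is first trained on clean samples, then "updated" on a batch whose labels have been swapped.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(weights, samples, epochs=200, lr=0.5):
    """Continue gradient descent from `weights` and return the updated (w, b)."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            error = sigmoid(w * x + b) - y  # derivative of the log loss w.r.t. the logit
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / len(samples)
        b -= lr * grad_b / len(samples)
    return w, b

def accuracy(weights, samples):
    """Fraction of samples whose predicted class matches the label."""
    w, b = weights
    hits = sum((sigmoid(w * x + b) >= 0.5) == (y == 1) for x, y in samples)
    return hits / len(samples)

# Clean data: negative x belongs to class 0, positive x to class 1.
clean = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (1.0, 1), (1.5, 1), (2.0, 1)]
# Poisoned update batch: same inputs, labels swapped.
swapped = [(x, 1 - y) for x, y in clean]

model = train((0.0, 0.0), clean)
poisoned = train(model, swapped, epochs=400)

print(accuracy(model, clean))     # 1.0
print(accuracy(poisoned, clean))  # 0.0
```

The weight `w` is positive after clean training and negative after the poisoned update, so the updated model answers every clean case with the opposite label.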
Important
This tool illustrates the prior two types of attack for an educational purpose ONLY. The author provides NO WARRANTY OF ANY KIND, INCLUDING THE WARRANTY OF DESIGN, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
Tip
The simulation using attackai is done on a binary-class image dataset, referenced in the section below. The gif illustrations show the assumed storyline.
The NIH chest radiographs that support the findings of this project are publicly available at https://nihcc.app.box.com/v/ChestXray-NIHCC and https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. The Indiana University Hospital Network database is available at https://openi.nlm.nih.gov/. The WCMC pediatric data that support the findings of this study are available in the identifier 10.17632/rscbjbr9sj.3
The storyline of the simulation is built on the chest X-ray dataset outlined above. The story goes as follows:
- A fictitious healthcare center operates as a chest X-ray diagnosis clinic.
- In its workflow, called the old paradigm, the case person has a chest X-ray scan; the scanned image is then delivered to a domain expert, the pulmonologist, who examines it; and finally, the results are presented to the case person.
- The clinic decides to move to a new paradigm by integrating a new diagnosis layer that lies between the case input and the domain expert's diagnosis. The proposed block is an AI agent that diagnoses the case and predicts its label as Normal or Pneumonia; the prediction is then passed to the domain expert to confirm.
In the first phase, the clinic creates the AI model as follows:
- A main folder contains the subfolders train and test; each of these contains one subfolder per class, for the normal and pneumonia cases.
- The images are transformed into tensors, which are then fed into a neural network with x hidden layers; the system runs until it produces the main predictive model, x-ray-ai.h5.
- This model contains the valid ratios, or weights, that do the math to make the predictions.
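The folder layout described in the first bullet can be created with a few lines of Python. This is a minimal sketch: the root name `xray_project` and the exact lowercase folder names are assumptions for illustration, and the helper `create_layout` is hypothetical.

```python
from pathlib import Path
import tempfile

SPLITS = ("train", "test")
CLASSES = ("normal", "pneumonia")  # assumed spelling of the class folders

def create_layout(root):
    """Create root/<split>/<class> folders and return the created paths."""
    root = Path(root)
    created = []
    for split in SPLITS:
        for cls in CLASSES:
            folder = root / split / cls
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created

# Demonstrated in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    folders = create_layout(Path(tmp) / "xray_project")
    layout = sorted(f.relative_to(tmp).as_posix() for f in folders)
    print(layout)
```

The same helper can be pointed at a weekly update folder so that new batches follow the structure the original model was trained on.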
The AI team then decides to create a pipeline to update the model, as follows:
- On a weekly basis, the new cases are collected.
- The same folder setup as described above is created for them.
- The original model is then updated using the new batches of X-ray images.
- Technically speaking, the new input updates the model by changing the values of its weights.
If the adversary has access to the weekly sources from which the model is updated, then over time the model will degrade and may fail to make the right predictions. Assuming the adversary has that access, they can use attackai to carry out either type of attack outlined above, type one or type two.
Tip
Make sure to create the project setup as outlined above.
To install the package, all you have to do is:
pip install attackai
You should then be able to use the package. You may want to confirm the installation:
pip show attackai
The result should look like this:
Name: attackai
Version: 1.0.0
Summary: Simulation of poisoning attack on AI model
Home-page: https://github.com/dahmansphi/attackai
Author: Dr. Deniz Dahman's
Author-email: [email protected]
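If you prefer to confirm the installation from inside Python rather than with pip show, the standard library can report the installed version. This is a minimal sketch; the helper name `installed_version` is hypothetical, while `importlib.metadata` is part of Python 3.8 and later.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Prints the version if attackai is installed in this environment, otherwise None.
print(installed_version("attackai"))
```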
Important
To use the first edition of attackai, it is mandatory that the update folder contains the normal and Pneumonia subfolders, as illustrated in the gif above.
Once the installation is done and you have met all the conditions, you may want to check the built-in functions of attackai and understand each of them.
Essentially, you create an instance of attackai like so:
from attackai import AttackAI
inst = AttackAI()
This inst instance now gives you access to the built-in functions that you need. This is a screenshot:
Once you have an attackai instance, here is the right sequence for employing each attack:
In this attack, the aim of the adversary is to corrupt the training samples. To illustrate that, you can first use the explore_attack_t1() function. The function takes two main args: the path to the update folder and the size of the attack. The latter specifies by how much you wish to corrupt the image. In addition, there is an optional arg, the stamp option, which places a stamp on the image you attack. The following graph illustrates that:
Once you are satisfied with the result, you can execute the attack using the execute_attack_t1() function. The results of the attack are illustrated in the following graph:
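To give a feel for what corrupting a sample with a given size and an optional stamp could mean, here is a toy sketch on a small grayscale image stored as a list of rows. It is not the implementation used by attackai: the function `corrupt_image`, the reading of `size` as a fraction of pixels, and the square stamp in the top-left corner are all assumptions made for illustration.

```python
import random

def corrupt_image(pixels, size, stamp=False, seed=0):
    """Return a corrupted copy of a grayscale image (rows of 0-255 values).

    `size` is read here as the fraction of pixels to invert; `stamp`
    additionally paints a white square in the top-left corner.
    """
    rng = random.Random(seed)
    height, width = len(pixels), len(pixels[0])
    corrupted = [row[:] for row in pixels]  # leave the original untouched
    coords = [(r, c) for r in range(height) for c in range(width)]
    for r, c in rng.sample(coords, int(size * len(coords))):
        corrupted[r][c] = 255 - corrupted[r][c]  # inversion always changes the value
    if stamp:
        side = max(1, min(height, width) // 4)
        for r in range(side):
            for c in range(side):
                corrupted[r][c] = 255
    return corrupted

image = [[128] * 8 for _ in range(8)]  # flat gray 8x8 test image
attacked = corrupt_image(image, size=0.25)
changed = sum(a != b for old, new in zip(image, attacked) for a, b in zip(old, new))
print(changed)  # 16 of the 64 pixels differ

stamped = corrupt_image(image, size=0.0, stamp=True)
print(stamped[0][:3])  # [255, 255, 128]
```

Previewing the corrupted copy before overwriting anything mirrors the explore-then-execute sequence described above.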
In this attack, the aim of the adversary is to swap the contents of the two classes. To illustrate that, you can first use the explore_attack_t2() function. The function takes ONE main arg: the path to the update folder. The following graph illustrates that:
Once you are satisfied with the result, you can execute the attack using the execute_attack_t2() function.
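The idea of swapping the contents of the two class folders can be sketched with the standard library alone. This is not how attackai implements the attack; the helper `swap_class_folders`, the folder names, and the file names are hypothetical, and the demonstration runs entirely inside a temporary directory.

```python
from pathlib import Path
import tempfile

def swap_class_folders(update_dir, class_a="normal", class_b="pneumonia"):
    """Exchange the contents of two class folders by renaming them."""
    update_dir = Path(update_dir)
    a, b = update_dir / class_a, update_dir / class_b
    holding = update_dir / "_swap_holding"  # temporary name used during the swap
    a.rename(holding)
    b.rename(a)
    holding.rename(b)

with tempfile.TemporaryDirectory() as tmp:
    update = Path(tmp) / "update"
    (update / "normal").mkdir(parents=True)
    (update / "pneumonia").mkdir()
    (update / "normal" / "healthy_001.png").touch()
    (update / "pneumonia" / "sick_001.png").touch()

    swap_class_folders(update)

    after = {d.name: sorted(f.name for f in d.iterdir())
             for d in sorted(update.iterdir())}
    print(after)  # {'normal': ['sick_001.png'], 'pneumonia': ['healthy_001.png']}
```

After the swap, every image sits under the wrong label, which is exactly the situation in which the updated model learns inverted classes.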
Once the attack is done, you need to observe its effect. For that, check the documentation of the protectai tool. There, I first illustrate the results of both attacks, and then propose a method of defense against such attacks.
Please follow the publications on the website to find the published academic paper.