Tip
This project complements the Attack AI Model project, which can be found in the repo. To understand the entire idea, make sure to read both sets of documentation.
Welcome to the first edition of the PROTECT AI MODEL SIMULATION, the protectai tool official documentation. I am Dr. Deniz Dahman, the creator of the BireyselValue algorithm and the author of this package. In the following section you will find a brief introduction to the principal idea of the protectai tool, along with a reference to the academic publication on this type of cyberattack and a potential defence. Before going ahead, I would like you to know that I have done this work as an independent scientist, without funding or similar support. I am dedicated to continuing this work and seeking further improvement of the proposed method. To this end, if you wish to contribute in any way to this work, please find further details in the contributing section.
If you wish to support the creator and author of this project, you can contribute in any of the following ways:
- View options to subscribe on Dahman's Phi Services Website
- Subscribe to this channel: Dahman's Phi Services
- Support on Patreon
If you prefer any other form of contribution, please feel free to contact me directly via the contact page.
Thank you
The current AI revolution has made AI frameworks a core element of almost every solution we use daily. Many industries rely heavily on these AI models to generate responses. In fact, it has become a trend that once a product uses AI in its backend, its potential to penetrate the marketplace is substantially higher than that of a product which doesn't.
This trend has pushed many industries to consider implementing AI models in the tiers of their business processes. The rush is understandable given what those industries believe: to secure a place in today's competitive market, businesses must keep up with the most recent advances in technology. However, one must ask: what is this AI problem-solving paradigm anyway?
In my published project, the Big Bang of Data Science, I provide a comprehensive answer to this question from both abstract and concrete perspectives. Let me summarize both perspectives in a few lines.
Basically, an AI model is a mathematical tool. It relies mainly on one important stage: the training stage. As a metaphor, imagine it as a human brain that learns over time from the surroundings and circumstances in which it lives: cultures, beliefs, people, and so on. Once this brain is shaped and formed, it starts to make decisions and offer answers. We, from the outside, judge those decisions and answers, and the brain reacts to our judgements.
AI models mimic this paradigm. The human brain is the model's mathematical equation; the surroundings and circumstances are the training samples we feed to that equation so it can learn; the judgements by the surroundings are the calculations over the misclassified cases, known as obtaining the derivatives. We then aim for a model that gives accurate answers with a minimal level of mistakes.
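The judge-and-react loop described above can be sketched as a minimal, generic gradient-descent example. This is a toy illustration of the general idea, not protectai's actual training code:

```python
import numpy as np

# Toy logistic-regression "brain": the weights are nudged by the
# derivatives of its mistakes, mirroring the judgement/reaction loop.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                            # training samples (the "surroundings")
y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)   # true answers

w = np.zeros(3)                                          # untrained brain
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))                   # current answers
    grad = X.T @ (p - y) / len(y)                        # derivatives of the mistakes
    w -= 0.5 * grad                                      # react to the judgement

accuracy = float(np.mean((p > 0.5) == y))
print(f"training accuracy: {accuracy:.2f}")
```

With every pass, the misclassified cases pull the weights toward fewer mistakes, which is exactly the role the derivatives play in the metaphor.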
Once the technical workflow of AI is understood, it should be clear that the training samples from which the AI model learns are the most important element of the entire flow. This element can be thought of as the adjudicator of whether the model succeeds or fails. As such, it is a target for adversaries who aim to make the model fail. When such an attack succeeds, it is known as a data poisoning attack.
Data poisoning is a type of cyberattack in which an adversary intentionally compromises a training dataset used by an AI or machine learning (ML) model in order to influence or manipulate the operation of that model.
Such an attack can be carried out in several ways:
- Intentionally injecting false or misleading information within the training dataset,
- Modifying the existing dataset,
- Deleting a portion of the dataset.
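The first of these vectors, injecting misleading information, can be illustrated on a toy labelled dataset. The `poison_labels` helper below is hypothetical, written only to show the mechanics of label flipping:

```python
import random

def poison_labels(dataset, flip_fraction=0.2, seed=42):
    """Flip a fraction of binary labels (0 <-> 1) to simulate a
    label-flipping poisoning attack on a training set."""
    rng = random.Random(seed)
    poisoned = list(dataset)
    n_flip = int(len(poisoned) * flip_fraction)
    for i in rng.sample(range(len(poisoned)), n_flip):
        features, label = poisoned[i]
        poisoned[i] = (features, 1 - label)   # intentionally misleading label
    return poisoned

clean = [([0.1 * i], i % 2) for i in range(10)]
bad = poison_labels(clean)
changed = sum(1 for a, b in zip(clean, bad) if a[1] != b[1])
print(f"{changed} of {len(clean)} labels flipped")
```

Even a small `flip_fraction` is enough to skew the weights a model learns, which is why the attack is so effective when it goes undetected.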
Unfortunately, such a cyberattack can go undetected for a long time, due in the first place to the framework of AI itself, and further to the lack of fundamental understanding of the AI black box and the use of ready-made AI models by industry practitioners without a comprehensive understanding of the mathematics behind the framework.
However, there are some signs that might lead to the observation that the AI model is compromised. Some of those signs are:
- Model degradation: Has the performance of the model inexplicably worsened over time?
- Unintended outputs: Does the model behave unexpectedly and produce unintended results that cannot be explained by the training team?
- Increase in false positives/negatives: Has the accuracy of the model inexplicably changed over time? Has the user community noticed a sudden spike in problematic or incorrect decisions?
- Biased results: Does the model return results that skew toward a certain direction or demographic (indicating the possibility of bias introduction)?
- Breaches or other security events: Has the organization experienced an attack or security event that could indicate they are an active target and/or that could have created a pathway for adversaries to access and manipulate training data?
- Unusual employee activity: Does an employee show an unusual interest in understanding the intricacies of the training data and/or the security measures employed to protect it?
This gif illustrates a data poisoning attack on an AI model. It shows how the alphas, or weights, are influenced by the new training samples the model uses to update itself.
To this end, such matters must be considered by a company's AI division once it decides to employ the AI problem-solving paradigm.
I have presented two types of attack that an AI model can face: the first is "corrupt data training sample" and the second is "crazy the model". Both are explained in detail in the attackai repo project, where I illustrate each attack. Now I shall:
- First, illustrate a healthy AI model. I will carry on the same simulation using the storyline scenario, showing the original creation of the x-ray-ai.h5 model. Check the details.
- Second, show the effect of both attacks that I illustrated using the attackai tool. You may want to check the illustration of attack type one and attack type two. As a result, we will observe the performance after both attacks.
- Finally, present the proposed defence strategy against both attacks.
Here I introduce the tool protectai, which accomplishes the three objectives outlined above.
Important
This tool illustrates the prior three objectives for an educational purpose ONLY. The author provides NO WARRANTY OF ANY KIND, INCLUDING THE WARRANTY OF DESIGN, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
Tip
The simulation using protectai is done on a binary-class image dataset, referenced in the section below. The gif illustrations show the storyline as assumed.
The NIH chest radiographs that support the findings of this project are publicly available at https://nihcc.app.box.com/v/ChestXray-NIHCC and https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. The Indiana University Hospital Network database is available at https://openi.nlm.nih.gov/. The WCMC pediatric data that support the findings of this study are available via the identifier 10.17632/rscbjbr9sj.3.
The storyline of the simulation is built on the chest X-ray dataset outlined above. The story goes as follows:
- A fictitious healthcare center operates as a chest X-ray diagnosis facility.
- In its workflow, called the old paradigm, the patient has a chest X-ray scan; the scanned image is delivered to a domain expert, the pulmonologist, who examines it; finally, the results are presented to the patient.
- The clinic decides to move to a new paradigm by integrating a new diagnosis layer between the patient's input and the domain expert's diagnosis. The proposed block implements an AI agent that diagnoses the case and predicts its label as Normal or Pneumonia; the prediction is then passed to the domain expert for confirmation.
In the first phase, the clinic creates the AI model as follows:
- A main folder contains subfolders (train and test); each contains class subfolders for normal and pneumonia cases.
- The images are transformed into tensors, which are then fed into a neural network with x hidden layers; the system runs until it produces the main predictive model, x-ray-ai.h5.
- This model contains the valid ratios (weights) that do the math to make the predictions.
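The folder layout described above can be created with a short script. The folder names here are taken from the description; the exact names protectai expects may differ:

```python
from pathlib import Path
import tempfile

def make_dataset_layout(root: Path) -> list:
    """Create train/test subfolders, each with normal/ and pneumonia/
    class folders, matching the layout the clinic uses."""
    created = []
    for split in ("train", "test"):
        for cls in ("normal", "pneumonia"):
            folder = root / split / cls
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created

root = Path(tempfile.mkdtemp())
folders = make_dataset_layout(root)
for f in folders:
    print(f.relative_to(root))
```

After filling these folders with images, the training step can consume them split by split and class by class.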
The AI team then decides to create a pipeline to update the model as follows:
- On a weekly basis, the new cases are collected.
- The same folder setup as above is created.
- The original model is then used to make the current update using the new batches of X-ray images.
- Technically speaking, the new input updates the model in a way that changes the model's weight values.
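The weekly staging step above can be sketched as follows. The `stage_weekly_batch` helper and the folder names are my own illustration, not part of protectai:

```python
from pathlib import Path
import shutil
import tempfile

def stage_weekly_batch(new_cases, staging_root):
    """Copy the week's new images into a train-style class layout
    (folder names assumed) so the update step can consume them."""
    for cls, files in new_cases.items():
        target = staging_root / "update" / cls
        target.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy(f, target / f.name)
    return staging_root / "update"

# demo with a placeholder file standing in for an x-ray image
root = Path(tempfile.mkdtemp())
incoming = root / "incoming"
incoming.mkdir()
scan = incoming / "case_001.png"
scan.write_bytes(b"fake image bytes")

update_dir = stage_weekly_batch({"normal": [scan]}, root)
print(sorted(p.name for p in (update_dir / "normal").iterdir()))
```

It is exactly this staging folder that an adversary would target, which motivates the security layer introduced next.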
The AI team has decided to add a security layer before the AI model update. The new security layer shall inspect the weekly batches before feeding them to the AI model, performing inspection against both types of attack.
Tip
Make sure to create the project setup as outlined above.
To install the package, all you have to do is:
pip install protectai
You should then be able to use the package. You may want to confirm the installation:
pip show protectai
The result should look like:
Name: protectai
Version: 1.0.3
Summary: Simulation of protecting AI model from poisoning attack
Home-page: https://github.com/dahmansphi/protectai
Author: Dr. Deniz Dahman's
Author-email: [email protected]
Important
To use this first edition of protectai, it is mandatory that the update folder have the normal and pneumonia subfolders as illustrated in the gif.
Once your installation is done and you have met all the conditions, you may want to check the built-in functions of protectai and understand each one.
Essentially, you create an instance of ProtectAI like so:
from protectai.protectai import ProtectAI
inst = ProtectAI()
Now this inst instance offers you access to the built-in functions you need. Here is a screenshot:
Once you have protectai instance, here are the details of the right sequence to employ the defence:
First, we call the set_path_models() function to prepare the training and testing datasets with the spec for the main model.
inst.set_path_models(paths)
The function takes one argument: the path to the training and testing folders, or any folders intended for creating a predictive image model. Here are some pictures illustrating the result after calling the function.
This function, as the name implies, creates the skeleton AI predictive model. This is a neural network model that accepts binary classes.
inst.make_model()
In the standard workflow for creating a neural network, we use training samples to train the model, and eventually the model has its set of ratios/weights. However, we should test that trained model. The train_test_model() function does that. It requires two arguments: the first is the skeleton neural network model created by the previous function, and the second is a list of paths to the training and testing folders.
inst.train_test_model(model, paths)
The result of the training and testing can be seen in the figure below:
Once the results from training and testing are satisfactory, you can use the produce_model() function to output the original model. This function takes two arguments: the first is the path to the skeleton model output above; the second is a list containing the path to the training samples and the path where the new model should be saved. Once this function executes, you have the original model, as shown in the picture below. Notably, you have options for the type of the saved model, e.g. h5, Keras, or any structure in which you wish to save the ratios and other information.
inst.produce_model(model=model, paths_folder=paths_produce)
Since the model is ready, it is time to see it in action using the predict() function. This function takes two arguments: the first is the path to the folder of images whose labels are to be predicted, and the second is the path to the original model.
inst.predict(img_path=path_to_imgs, model_path=orignial_model)
If, just for testing purposes, we point the model at a folder of normal images, we see the results in the figure below with an accuracy of 100%; the same for the pneumonia images, where we also see almost 100%.
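Per-class accuracy of this kind can be computed from predicted labels with a small generic helper. This is an illustration, not part of protectai:

```python
def class_accuracy(predicted, expected_label):
    """Fraction of predictions matching the single expected class label,
    e.g. for a folder known to contain only one class of images."""
    hits = sum(1 for p in predicted if p == expected_label)
    return hits / len(predicted)

# e.g. model run over a folder known to contain only "normal" images
preds_on_normal = ["normal", "normal", "normal", "pneumonia"]
print(f"accuracy on normal folder: {class_accuracy(preds_on_normal, 'normal'):.0%}")
```

Tracking this number per class, rather than overall, is what makes the asymmetric damage of a poisoning attack visible later on.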
I have discussed the workflow of updating an existing model in detail in the pipeline workflow section. To put that workflow into action, the functions inst.update_trained_model_with_test() and inst.update_trained_model_produce() are used, in that order. As the names suggest, the first function performs the update with a test, and the second performs the update and saves the new updated model. Both functions take two arguments: the first is the existing model to update, and the second is the path to the new weekly batch of X-ray images.
inst.update_trained_model_with_test(model=paths_to_existing_model, paths=paths_new_imgs)
inst.update_trained_model_produce(model=paths_to_existing_model, paths=paths_new_imgs)
Assume either type of attack has happened, as discussed in the types of attack. We can see the effect of that attack on the model's accuracy using:
inst.update_trained_model_produce(model=paths_to_existing_model, paths=paths_new_imgs)
and
inst.predict(img_path=path_to_imgs, model_path=orignial_model)
Observe that the prediction tasks, on which we hit 100% for both classes, now remain at the same level of accuracy for normal images but drop to almost 0% for the pneumonia class. That means if a person actually has pneumonia, the model says they do not, as presented in the figures below.
result of updated model accuracy
result of prediction on normal classes
result of prediction on pneumonia classes
Caution
You can see how misleading the effect of such an attack is. This sample illustration shows the effect after one round; however, such an attack could go undetected for some time, poisoning the model slowly until it loses the core of its ratios.
As discussed earlier in the simulative storyline example, suppose the AI team decides to add a security layer before the actual update happens. This is where I introduce the norm culture solution. The mathematical intuition behind the method can be found in the published academic paper; in this demonstration I will show how such a layer can help capture images that have been compromised by attack type one or type two.
In the following figure you can see the location of the norm culture box. It is the stage where you check whether the folder of new training images is compromised in any way. To employ the norm culture box we use the inst.protect_me() function, which in principle calculates three main elements:
- two numpy arrays, one per class, called alpha
- two norms, one per class, called norms
- two flattened arrays, one per class, called flatten
The following figure illustrates these elements:
The following call illustrates the implementation, with two arguments: the path to the original clean training dataset, and the path where the three elements are saved:
inst.protect_me(paths=paths, save_path_norms=save_folder)
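Under one possible reading of those three elements (my assumption; the precise definitions are in the published paper), a per-class computation might look like:

```python
import numpy as np

def norm_culture(images_by_class):
    """For each class: flatten the images (flatten), take their mean
    vector (alpha), and record the L2 norm of each flattened image (norms)."""
    culture = {}
    for cls, imgs in images_by_class.items():
        flat = np.stack([img.ravel() for img in imgs])   # flatten element
        alpha = flat.mean(axis=0)                        # alpha element
        norms = np.linalg.norm(flat, axis=1)             # norms element
        culture[cls] = {"alpha": alpha, "norms": norms, "flatten": flat}
    return culture

rng = np.random.default_rng(1)
data = {
    "normal": [rng.random((4, 4)) for _ in range(5)],
    "pneumonia": [rng.random((4, 4)) for _ in range(5)],
}
culture = norm_culture(data)
print(culture["normal"]["alpha"].shape, culture["normal"]["norms"].shape)
```

Whatever the exact definitions, the point is that these per-class statistics are computed once from verified clean data and stored, so later batches can be compared against them.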
Here are the two functions to use, in sequence: inst.inspect_attack_one() and inst.inspect_attack_two().
This function, as the name implies, inspects the new batch of images for compromise by the first type of attack. It takes two arguments: the first is the path to inspect, and the second is the path to the model norms discussed above. It is invoked as follows:
inst.inspect_attack_one(paths_to_inpect=paths_to_inspect, paths_to_norms=paths_to_norm)
Here is a figure showing the result of inspecting the folder that was attacked previously.
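A minimal screen of this kind could compare each incoming image's norm against the stored reference norms. This is my own simplified sketch (a z-score threshold), not the published norm-culture test:

```python
import numpy as np

def inspect_batch(new_images, reference_norms, tolerance=3.0):
    """Flag images whose L2 norm falls far outside the distribution of
    norms recorded from the verified clean training data."""
    mu, sigma = reference_norms.mean(), reference_norms.std()
    flagged = []
    for i, img in enumerate(new_images):
        z = abs(np.linalg.norm(img.ravel()) - mu) / sigma
        if z > tolerance:
            flagged.append(i)
    return flagged

rng = np.random.default_rng(2)
reference = np.array([np.linalg.norm(rng.random(16)) for _ in range(50)])
batch = [rng.random((4, 4)) for _ in range(4)] + [rng.random((4, 4)) * 10]  # last one tampered
print("suspicious indices:", inspect_batch(batch, reference))
```

Any flagged index would be held back from the weekly update and sent for manual review instead of being fed to the model.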
This function, as the name implies, inspects the new batch of images for compromise by the second type of attack. It takes two important arguments: the first is the path to the folder that needs to be inspected, and the second is the path to the original predictive model that has been verified as clean. It is invoked as follows:
inst.inspect_attack_two(paths_to_inpect=paths_to_inspect, model_verified=updated_trained_model_path)
Here is a figure showing the result of inspecting the folder that was attacked previously.
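A generic stand-in for this kind of model-based inspection could flag images that the verified model cannot classify confidently. The `predict_fn` callable and the confidence threshold here are assumptions for illustration, not protectai's actual logic:

```python
def inspect_with_model(predict_fn, images, expected_classes=("normal", "pneumonia"),
                       min_confidence=0.6):
    """Flag images the verified clean model cannot assign to a known
    class with reasonable confidence."""
    flagged = []
    for i, img in enumerate(images):
        label, confidence = predict_fn(img)
        if label not in expected_classes or confidence < min_confidence:
            flagged.append(i)
    return flagged

# toy predictor: tampered payloads come back with low confidence
def toy_predict(img):
    return ("normal", 0.95) if img != "tampered" else ("normal", 0.30)

batch = ["clean", "clean", "tampered", "clean"]
print("suspicious indices:", inspect_with_model(toy_predict, batch))
```

As with the first inspection, flagged images would be quarantined before the update pipeline is allowed to run.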
I hope this simulation demonstrates the idea of such an attack, its effect, and a proposal for defence.
Please follow the publications on the website to find the published academic paper.