
FedQAS: Privacy-aware Machine Reading Comprehension with Federated Learning


FedQAS project

Machine reading comprehension (MRC) of text data is an important task in Natural Language Understanding. It is a complex NLP problem with a lot of ongoing research since the release of the Stanford Question Answering Dataset (SQuAD) and CoQA. It is an effort to teach computers to "understand" a text and then answer questions about it using deep learning. However, large-scale training on private text data and knowledge sharing have so far been missing for this NLP task. In this project, we implemented FedQAS, a privacy-preserving machine reading system that leverages large-scale private data. The implementation combines Transformer models and federated learning using the FEDn framework. The proposed approach can be useful for industries seeking similar solutions, especially where the data are private and cannot be shared. This implementation is inspired by keras.io.

Configure and start a client using a CPU device

The easiest way to start clients for quick testing is to use a shell script. The following script configures and starts a client on a blank Ubuntu 20.04 LTS VM:

#!/bin/bash

# Install Docker and docker-compose
sudo apt-get update
sudo snap install docker

# clone the FEDn-client-FedQAS-tf project
git clone https://github.com/aitmlouk/FEDn-client-FedQAS-tf.git
cd FEDn-client-FedQAS-tf

# If no initial model is available, generate a new one with 'python client/init_model.py'

# Make sure you have edited extra-hosts.yaml to provide hostname mappings for combiners
# Make sure you have edited client.yaml to provide hostname mappings for reducer
sudo docker-compose -f docker-compose.yaml -f extra-hosts.yaml up --build

Configuring the tests

A couple of settings can be configured to vary the conditions for the training. These are exposed in the file 'client/settings.yaml':

# Parameters for the model and local training
max_seq_length: 384
learning_rate: 5e-5
batch_size: 8
epochs: 1 # For demonstration, 3 epochs are recommended
verbose: True
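
As a rough illustration of how these values feed into local training, the snippet below loads them with PyYAML; the exact loading logic in the client code may differ.

# Sketch: read the training settings from client/settings.yaml.
import yaml

with open("client/settings.yaml", "r") as fh:
    settings = yaml.safe_load(fh)

# PyYAML parses '5e-5' (no decimal point) as a string, so cast explicitly.
learning_rate = float(settings["learning_rate"])
max_seq_length = settings["max_seq_length"]  # 384
batch_size = settings["batch_size"]          # 8
epochs = settings["epochs"]                  # 1
verbose = settings["verbose"]                # True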

Creating a compute package

To train a model in FEDn, you provide the client code (in 'client') as a tarball. For convenience, we ship a pre-made package (nlp_imdb.tar.gz). Whenever you update the client code (such as altering any of the settings in the file mentioned above), you need to re-package the code (as a .tar.gz archive) and copy the updated package to 'packages':

tar -czvf package.tar.gz client
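
Equivalently, the archive can be produced with Python's standard library; a minimal sketch, assuming it is run from the repository root:

# Sketch: build the compute package with Python's tarfile module
# instead of the tar CLI (run from the repository root).
import tarfile

with tarfile.open("package.tar.gz", "w:gz") as archive:
    # Add the whole client/ directory, preserving the top-level folder name.
    archive.add("client", arcname="client")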

Creating a seed model

The model architecture is specified in the file 'client/init_model.py'. This script creates an untrained neural network and serializes it to a file, which is uploaded as the initial model for federated training. For convenience, we ship a pre-generated initial model in the 'initial_model/' directory. If you wish to alter the base model, edit 'client/models/squad_model.py' and regenerate the seed file:

# client/
python init_model.py 
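
For orientation, the snippet below sketches the general pattern such a seed script follows; the toy architecture is only a placeholder and is not the actual FedQAS Transformer model defined in 'client/models/squad_model.py'.

# Sketch of a seed-model script: build an untrained Keras model and
# serialize its weights as the initial model for federated training.
# The tiny architecture below is a placeholder, not the real FedQAS model.
from tensorflow import keras

def create_seed_model(max_seq_length=384):
    inputs = keras.Input(shape=(max_seq_length,), dtype="int32")
    x = keras.layers.Embedding(input_dim=30522, output_dim=64)(inputs)
    x = keras.layers.GlobalAveragePooling1D()(x)
    outputs = keras.layers.Dense(2)(x)
    return keras.Model(inputs, outputs)

if __name__ == "__main__":
    model = create_seed_model()
    # Assumed output path for the serialized, untrained weights.
    model.save_weights("initial_model/initial_model.h5")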

Start prediction - global model serving

The trained global model can be used for testing, prediction, and annotation. To start the UI, make sure that the base services (fedn/config) are started, then run the Flask app (python prediction/app.py):

# prediction/
python app.py
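
For orientation, the snippet below sketches the shape of such a Flask serving app; the /predict route and the placeholder answering function are assumptions, not the actual prediction/app.py API.

# Hypothetical sketch of a Flask app serving the global model; the
# /predict route and answer_question() are illustrative placeholders.
from flask import Flask, request, jsonify

app = Flask(__name__)

def answer_question(context, question):
    # Placeholder: the real app would tokenize the inputs and run the
    # trained global Transformer model to extract the answer span.
    return context[:50]

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    answer = answer_question(payload["context"], payload["question"])
    return jsonify({"answer": answer})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)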

Cite this work

@misc{aitmlouk2022fedqas,
      title={FedQAS: Privacy-aware machine reading comprehension with federated learning}, 
      author={Addi Ait-Mlouk and Sadi Alawadi and Salman Toor and Andreas Hellander},
      year={2022},
      eprint={2202.04742},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

Apache-2.0 (see LICENSE file for full information).
