diff --git a/CHANGELOG.md b/CHANGELOG.md index 84bfed38..7e3ca6cf 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -27,6 +27,43 @@ docker compose run --rm -it pgbackups /backup.sh (or `docker-compose` if your version of Docker does not support compose v2). +## [2.2.0] 2024-05-08 + +### Changed +- **Breaking change**: When exporting annotations as JSON, the "features" that the annotator entered are no longer nested under `label` ([#347](https://github.com/GateNLP/gate-teamware/issues/347)). Where previously the export would have been + ```json + { + "features": { + "label": { + "field1": "value1" + } + } + } + ``` + + it is now + ```json + { + "features": { + "field1": "value1" + } + } + ``` +- Include details of failed annotations in export formats ([#399](https://github.com/GateNLP/gate-teamware/pull/399)) + - When exporting annotation data from projects (both via the web UI and using the command line tool), + each document includes details of which users _rejected_, _timed out_ or _aborted_ annotation of + that document, as well as the annotation data from the users who completed the document successfully. + This can be useful for the project manager to identify documents that are particularly difficult + to annotate, perhaps suggesting that the annotation guidelines need to be extended or clarified. 
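For scripts that still hold JSON exports made before 2.2.0, the breaking change to the `features` shape described above could be bridged with a small conversion step. The following is a minimal Python sketch based only on the two JSON shapes shown in the changelog entry; `flatten_features` is a hypothetical helper, not part of Teamware, and it assumes that a dict-valued `label` key inside `features` always marks the old nested layout.

```python
def flatten_features(annotation):
    """Return a copy of `annotation` with features un-nested from `label`.

    Assumption: a dict stored under features["label"] is the pre-2.2.0
    nesting. Anything else is treated as already being in the new shape.
    """
    features = annotation.get("features", {})
    label = features.get("label")
    if not isinstance(label, dict):
        return dict(annotation)  # nothing to un-nest
    # Keep any sibling keys and lift the nested fields up one level.
    others = {k: v for k, v in features.items() if k != "label"}
    return {**annotation, "features": {**others, **label}}


old_style = {"features": {"label": {"field1": "value1"}}}
print(flatten_features(old_style))
```

If a project genuinely has an annotation field called `label` whose value is itself an object, this heuristic would misfire, so check the project configuration before applying it in bulk.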
+ +### Fixed +- Upgraded a number of third-party dependencies to close various vulnerabilities ([#397](https://github.com/GateNLP/gate-teamware/pull/397)) +- Fixed several issues relating to the export of annotated data ([#377](https://github.com/GateNLP/gate-teamware/pull/377)) + - "Anonymous" export was not properly anonymous ([#345](https://github.com/GateNLP/gate-teamware/issues/345)) + - Teamware now does a better job of preserving the GATE BDOC JSON structure when exporting documents that were originally uploaded in that format ([#346](https://github.com/GateNLP/gate-teamware/issues/346), [#348](https://github.com/GateNLP/gate-teamware/issues/348)) +- Added an explicit setting for "no email security", as an alternative to the implicit setting when the relevant environment variable is omitted. This is because the implicit setting was lost on upgrades, whereas an explicit "none" will be preserved ([#402](https://github.com/GateNLP/gate-teamware/pull/402)) + + ## [2.1.1] 2023-10-02 ### Added diff --git a/CITATION.cff b/CITATION.cff index 4d5aa1c2..262b417f 100644 --- a/CITATION.cff +++ b/CITATION.cff @@ -1,6 +1,6 @@ abstract: A web application for collaborative document annotation. GATE teamware provides a flexible web app platform for managing classification of documents by human annotators. -authors: +authors: - affiliation: The University of Sheffield email: t.karmakharm@sheffield.ac.uk family-names: Karmakharm @@ -33,13 +33,7 @@ keywords: - document annotation license: AGPL-3.0 message: If you use this software, please cite it using the metadata from this file. 
-repository-code: https://github.com/GateNLP/gate-teamware -title: GATE Teamware -type: software -url: https://gatenlp.github.io/gate-teamware/ -version: 2.1.1 preferred-citation: - type: conference-paper authors: - affiliation: The University of Sheffield email: d.wilby@sheffield.ac.uk @@ -66,14 +60,22 @@ preferred-citation: family-names: Bontcheva given-names: Kalina orcid: https://orcid.org/0000-0001-6152-9600 + collection-title: 'Proceedings of the 17th Conference of the European Chapter of + the Association for Computational Linguistics: System Demonstrations' doi: 10.18653/v1/2023.eacl-demo.17 - title: "GATE Teamware 2: An open-source tool for collaborative document classification annotation" - collection-title: "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations" + end: 151 location: name: Dubrovnik, Croatia - year: 2023 month: 5 - start: 145 - end: 151 publisher: name: Association for Computational Linguistics + start: 145 + title: 'GATE Teamware 2: An open-source tool for collaborative document classification + annotation' + type: conference-paper + year: 2023 +repository-code: https://github.com/GateNLP/gate-teamware +title: GATE Teamware +type: software +url: https://gatenlp.github.io/gate-teamware/ +version: 2.2.0 diff --git a/VERSION b/VERSION index 7c327287..e3a4f193 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -2.1.1 \ No newline at end of file +2.2.0 \ No newline at end of file diff --git a/docs/docs/.vuepress/versions.json b/docs/docs/.vuepress/versions.json index 89bd623d..0623e26d 100644 --- a/docs/docs/.vuepress/versions.json +++ b/docs/docs/.vuepress/versions.json @@ -18,6 +18,14 @@ "text": "2.1.0", "value": "/gate-teamware/2.1.0/" }, + { + "text": "2.1.1", + "value": "/gate-teamware/2.1.1/" + }, + { + "text": "2.2.0", + "value": "/gate-teamware/2.2.0/" + }, { "text": "development", "value": "/gate-teamware/development/" diff --git 
a/docs/docs/developerguide/README.md b/docs/docs/developerguide/README.md index ef38fc97..02e4494f 100644 --- a/docs/docs/developerguide/README.md +++ b/docs/docs/developerguide/README.md @@ -203,9 +203,13 @@ to the list of environment variables: ```bash DJANGO_EMAIL_BACKEND='django.core.mail.backends.smtp.EmailBackend' DJANGO_EMAIL_HOST='myserver.com' -DJANGO_EMAIL_PORT=22 +DJANGO_EMAIL_PORT=25 DJANGO_EMAIL_HOST_USER='username' DJANGO_EMAIL_HOST_PASSWORD='password' +DJANGO_EMAIL_SECURITY=tls +# tls = STARTTLS, typically on port 25 or 587 +# ssl = TLS-on-connect, typically on port 465 +# none (or omitted) = no encryption ``` #### E-mail using Google API diff --git a/docs/package.json b/docs/package.json index a79d2b60..2e31e446 100644 --- a/docs/package.json +++ b/docs/package.json @@ -1,6 +1,6 @@ { "name": "gate-teamware-docs", - "version": "2.1.1", + "version": "2.2.0", "description": "Documentation for GATE Teamware.", "main": "index.js", "scripts": { diff --git a/docs/versioned/0.3.0/.vuepress/versions.json b/docs/versioned/0.3.0/.vuepress/versions.json index 51082447..135e3576 100644 --- a/docs/versioned/0.3.0/.vuepress/versions.json +++ b/docs/versioned/0.3.0/.vuepress/versions.json @@ -18,6 +18,14 @@ "text": "2.1.0", "value": "/gate-teamware/2.1.0/" }, + { + "text": "2.1.1", + "value": "/gate-teamware/2.1.1/" + }, + { + "text": "2.2.0", + "value": "/gate-teamware/2.2.0/" + }, { "text": "development", "value": "/gate-teamware/development/" diff --git a/docs/versioned/0.4.0/.vuepress/versions.json b/docs/versioned/0.4.0/.vuepress/versions.json index 506c50de..8854f71a 100644 --- a/docs/versioned/0.4.0/.vuepress/versions.json +++ b/docs/versioned/0.4.0/.vuepress/versions.json @@ -18,6 +18,14 @@ "text": "2.1.0", "value": "/gate-teamware/2.1.0/" }, + { + "text": "2.1.1", + "value": "/gate-teamware/2.1.1/" + }, + { + "text": "2.2.0", + "value": "/gate-teamware/2.2.0/" + }, { "text": "development", "value": "/gate-teamware/development/" diff --git 
a/docs/versioned/2.0.0/.vuepress/versions.json b/docs/versioned/2.0.0/.vuepress/versions.json index 5030337e..652ccdb8 100644 --- a/docs/versioned/2.0.0/.vuepress/versions.json +++ b/docs/versioned/2.0.0/.vuepress/versions.json @@ -18,6 +18,14 @@ "text": "2.1.0", "value": "/gate-teamware/2.1.0/" }, + { + "text": "2.1.1", + "value": "/gate-teamware/2.1.1/" + }, + { + "text": "2.2.0", + "value": "/gate-teamware/2.2.0/" + }, { "text": "development", "value": "/gate-teamware/development/" diff --git a/docs/versioned/2.1.0/.vuepress/versions.json b/docs/versioned/2.1.0/.vuepress/versions.json index 5fb15a42..52e6cd06 100644 --- a/docs/versioned/2.1.0/.vuepress/versions.json +++ b/docs/versioned/2.1.0/.vuepress/versions.json @@ -18,6 +18,14 @@ "text": "2.1.0", "value": "/gate-teamware/2.1.0/" }, + { + "text": "2.1.1", + "value": "/gate-teamware/2.1.1/" + }, + { + "text": "2.2.0", + "value": "/gate-teamware/2.2.0/" + }, { "text": "development", "value": "/gate-teamware/development/" diff --git a/docs/versioned/2.1.1/.vuepress/components/AnnotationRendererPreview.vue b/docs/versioned/2.1.1/.vuepress/components/AnnotationRendererPreview.vue new file mode 100644 index 00000000..98ecc1af --- /dev/null +++ b/docs/versioned/2.1.1/.vuepress/components/AnnotationRendererPreview.vue @@ -0,0 +1,107 @@ + + + + + diff --git a/docs/versioned/2.1.1/.vuepress/components/DisplayVersion.vue b/docs/versioned/2.1.1/.vuepress/components/DisplayVersion.vue new file mode 100644 index 00000000..03ec07ed --- /dev/null +++ b/docs/versioned/2.1.1/.vuepress/components/DisplayVersion.vue @@ -0,0 +1,21 @@ + + + + + diff --git a/docs/versioned/2.1.1/.vuepress/config.js b/docs/versioned/2.1.1/.vuepress/config.js new file mode 100644 index 00000000..b78cc868 --- /dev/null +++ b/docs/versioned/2.1.1/.vuepress/config.js @@ -0,0 +1,42 @@ +const versionData = require("./versions.json") +const path = require("path"); +module.exports = context => ({ + title: 'GATE Teamware Documentation', + description: 
'Documentation for GATE Teamware', + base: versionData.base, + themeConfig: { + nav: [ + {text: 'Home', link: '/'}, + {text: 'Annotators', link: '/annotatorguide/'}, + {text: 'Managers & Admins', link: '/manageradminguide/'}, + {text: 'Developer', link: '/developerguide/'} + ], + sidebar: { + '/manageradminguide/': [ + "", + "project_management", + "project_config", + "documents_annotations_management", + "annotators_management" + ], + '/developerguide/': [ + '', + 'frontend', + 'testing', + 'releases', + 'documentation', + "api_docs", + + ], + }, + }, + configureWebpack: { + resolve: { + alias: { + '@': path.resolve(__dirname, versionData.frontendSource) + } + } + }, + + +}) diff --git a/docs/versioned/2.1.1/.vuepress/enhanceApp.js b/docs/versioned/2.1.1/.vuepress/enhanceApp.js new file mode 100644 index 00000000..e7aadadf --- /dev/null +++ b/docs/versioned/2.1.1/.vuepress/enhanceApp.js @@ -0,0 +1,17 @@ +import Vue from 'vue' +import {BootstrapVue, BootstrapVueIcons, IconsPlugin} from 'bootstrap-vue' + +import 'bootstrap/dist/css/bootstrap.css' +import 'bootstrap-vue/dist/bootstrap-vue.css' + +Vue.use(BootstrapVue) +Vue.use(BootstrapVueIcons) + +export default ({ + Vue, // the version of Vue being used in the VuePress app + options, // the options for the root Vue instance + router, // the router instance for the app + siteData // site metadata +}) => { + +} diff --git a/docs/versioned/2.1.1/.vuepress/theme/components/Navbar.vue b/docs/versioned/2.1.1/.vuepress/theme/components/Navbar.vue new file mode 100644 index 00000000..c3b966db --- /dev/null +++ b/docs/versioned/2.1.1/.vuepress/theme/components/Navbar.vue @@ -0,0 +1,143 @@ + + + + + diff --git a/docs/versioned/2.1.1/.vuepress/theme/components/VersionSelector.vue b/docs/versioned/2.1.1/.vuepress/theme/components/VersionSelector.vue new file mode 100644 index 00000000..4cfb5eb9 --- /dev/null +++ b/docs/versioned/2.1.1/.vuepress/theme/components/VersionSelector.vue @@ -0,0 +1,33 @@ + + + + + diff --git 
a/docs/versioned/2.1.1/.vuepress/theme/index.js b/docs/versioned/2.1.1/.vuepress/theme/index.js new file mode 100644 index 00000000..b91b8a57 --- /dev/null +++ b/docs/versioned/2.1.1/.vuepress/theme/index.js @@ -0,0 +1,3 @@ +module.exports = { + extend: '@vuepress/theme-default' +} diff --git a/docs/versioned/2.1.1/.vuepress/versions.json b/docs/versioned/2.1.1/.vuepress/versions.json new file mode 100644 index 00000000..bd5cf599 --- /dev/null +++ b/docs/versioned/2.1.1/.vuepress/versions.json @@ -0,0 +1,35 @@ +{ + "current": "2.1.1", + "base": "/gate-teamware/2.1.1/", + "versions": [ + { + "text": "0.3.0", + "value": "/gate-teamware/0.3.0/" + }, + { + "text": "0.4.0", + "value": "/gate-teamware/0.4.0/" + }, + { + "text": "2.0.0", + "value": "/gate-teamware/2.0.0/" + }, + { + "text": "2.1.0", + "value": "/gate-teamware/2.1.0/" + }, + { + "text": "2.1.1", + "value": "/gate-teamware/2.1.1/" + }, + { + "text": "2.2.0", + "value": "/gate-teamware/2.2.0/" + }, + { + "text": "development", + "value": "/gate-teamware/development/" + } + ], + "frontendSource": "../../../../frontend/src" +} \ No newline at end of file diff --git a/docs/versioned/2.1.1/README.md b/docs/versioned/2.1.1/README.md new file mode 100644 index 00000000..3586c291 --- /dev/null +++ b/docs/versioned/2.1.1/README.md @@ -0,0 +1,138 @@ +# GATE Teamware + +![GATE Teamware logo](./img/gate-teamware-logo.svg "GATE Teamware logo") + +A web application for collaborative document annotation. + +This is a documentation for Teamware version: + +## Key Features +* Free and open source software. +* Configure annotation options using a highly flexible JSON config. +* Set limits on proportions of a task that annotators can annotate. +* Import existing annotations as CSV or JSON. +* Export annotations as CSV or JSON. +* Annotation instructions and document rendering supports markdown and HTML. + +## Getting started +A quickstart guide for annotators is [available here](annotatorguide). 
+ +To use an existing instance of GATE Teamware as a project manager or admin, find instructions in the [Managers and Admins guide](manageradminguide). + +Documentation on deploying your own instance can be found in the [Developer Guide](developerguide). + +## Installation Guide + +### Quick Start + +The simplest way to deploy your own copy of GATE Teamware is to use Docker Compose on Linux or Mac. Installation on Windows is possible but not officially supported - you need to be able to run `bash` shell scripts for the quick-start installer. + +1. Install Docker - [Docker Engine](https://docs.docker.com/engine/) for Linux servers or [Docker Desktop](https://docs.docker.com/desktop/) for Mac. +2. Install [Docker Compose](https://github.com/docker/compose), if your Docker does not already include it (Compose is included by default with Docker Desktop) +3. Download the [installation script](https://gate.ac.uk/get-teamware.sh) into an empty directory, run it and follow the instructions. + +``` +mkdir gate-teamware +cd gate-teamware +curl -LO https://gate.ac.uk/get-teamware.sh +bash ./get-teamware.sh +``` + +This will make the Teamware application available as `http://localhost:8076`, with the option to expose it as a public `https://` URL if your server is directly internet-accessible - for production use we recommend deploying Teamware with a suitable internet-facing reverse proxy, or use Kubernetes as described below. + +### Deployment using Kubernetes + +A Helm chart to deploy Teamware on Kubernetes is published to the GATE team public charts repository. The chart requires [Helm](https://helm.sh) version 3.7 or later, and is compatible with Kubernetes version 1.23 or later. Earlier Kubernetes versions back to 1.19 _may_ work provided autoscaling is not enabled, but these have not been tested. 
+ +The following quick start instructions assume you have a compatible Kubernetes cluster and a working installation of `kubectl` and `helm` (3.7 or later) with permission to create all the necessary resource types in your target namespace. + +First generate a random "secret key" for the Django application. This must be at least 50 random characters, a quick way to do this is + +``` +# 42 random bytes base64 encoded becomes 56 random characters +kubectl create secret generic -n {namespace} django-secret \ + --from-literal="secret-key=$( openssl rand -base64 42 )" +``` + +Add the GATE charts repository to your Helm configuration: + +``` +helm repo add gate https://repo.gate.ac.uk/repository/charts +helm repo update +``` + +Create a `values.yaml` file with the key settings required for teamware. The following is a minimal set of values for a typical installation: + +```yaml +# Public-facing web hostname of the teamware application, the public +# URL will be https://{hostName} +hostName: teamware.example.com + +email: + # "From" address on emails sent by Teamware + adminAddress: admin@teamware.example.com + # Send email via an SMTP server - alternatively "gmail" to use GMail API + backend: "smtp" + smtp: + host: mail.example.com + # You will also need to set user and passwordSecret if your + # mail server requires authentication + +privacyPolicy: + # Contact details of the host and administrator of the teamware + # instance, if no admin defined, defaults to the host values. + host: + # Name of the host + name: "Service Host" + # Host's physical address + address: "123 Example Street, City. Country." + # A method of contacting the host, field supports HTML for e.g. linking to a form + contact: "Email" + admin: + name: "Dr. Service Admin" + address: "Department of Example Studies, University of Example, City. Country." 
+ contact: "Email" + +backend: + # Name of the random secret you created above + djangoSecret: django-secret + +# Initial "super user" created on the first install. These are just +# the *initial* settings, you can (and should!) change the password +# once Teamware is up and running +superuser: + email: me@example.com + username: admin + password: changeme +``` + +Some of these may be omitted or others may be required depending on the setup of your specific cluster - see the [chart README](https://github.com/GateNLP/charts/blob/main/gate-teamware/README.md) and the chart's own values file (which you can retrieve with `helm show values gate/gate-teamware`) for full details. In particular these values assume: + +- your cluster has an ingress controller, with a default ingress class configured, and that controller has a default TLS certificate that is compatible with your chosen hostname (e.g. a `*.example.com` wildcard) +- your cluster has a default storageClass configured to provision PVCs, and at least 8 GB of available PV capacity +- you can send email via an SMTP server with no authentication +- the default GATE Teamware terms and privacy documents are suitable for your deployment and compliant with the laws of your location. If this is not the case you can supply your own custom policy documents in a ConfigMap +- you do not need to back up your PostgreSQL database - the chart does include the option to store backups in Amazon S3 or another compatible object store, see the full README for details + +Once you have created your values file, you can install the chart or upgrade an existing installation using + +``` +helm upgrade --install gate-teamware gate/gate-teamware \ + --namespace {namespace} --values teamware-values.yaml +``` + + +## Bug reports and feature requests +Please make bug reports and feature requests as Issues on the [GATE Teamware GitHub repo](https://github.com/GATENLP/gate-teamware). 
+ +# Using Teamware +Teamware is developed by the [GATE](https://gate.ac.uk) team, an academic research group at The University of Sheffield. As a result, future funding relies on evidence of the impact that the software provides. If you use Teamware, please let us know using the contact form at [gate.ac.uk](https://gate.ac.uk/g8/contact). Please include details on grants, publications, commercial products etc. Any information that can help us to secure future funding for our work is greatly appreciated. + +## Citation +For published work that has used Teamware, please cite this repository. One way is to include a citation such as: + +> Karmakharm, T., Wilby, D., Roberts, I., & Bontcheva, K. (2022). GATE Teamware (Version 0.1.4) [Computer software]. https://github.com/GateNLP/gate-teamware + +Please use the `Cite this repository` button at the top of the [project's GitHub repository](https://github.com/GATENLP/gate-teamware) to get an up-to-date citation. + +The Teamware version can be found on the 'About' page of your Teamware instance. diff --git a/docs/versioned/2.1.1/annotatorguide/README.md b/docs/versioned/2.1.1/annotatorguide/README.md new file mode 100644 index 00000000..4a3ddae2 --- /dev/null +++ b/docs/versioned/2.1.1/annotatorguide/README.md @@ -0,0 +1,27 @@ +# Annotators Quickstart + +Annotating a project: + +* After signing up to the site, give your username to the owner of the annotation project you've been + recruited to. This will allow them to add you as an annotator to a project. +* After you've been recruited to a project, click on the `Annotate` link on the navigation bar at the + top of the page to start annotating. +* You will be shown the details about the project you're annotating along with a set of form(s) to capture + your annotation. Ensure you've read the annotator guidelines fully before starting the annotation process. +* You can then start annotating documents one at a time. 
Click on `Submit` to confirm the completion of + annotation, `Clear` to start again or `Reject` to skip the particular document. Be aware some projects + do not allow you to skip documents. +* Once you've finished annotating a certain number of documents in a project (specified by the project + manager) your task will be deemed complete, and you will be able to be recruited into another annotation + project. + +## Deleting your account + +At any time you can choose to stop participating and delete your account. You can do this by: + +* Click on your username in the top right corner and then `Account`. +* Click on `Delete my account`. +* When deleting your account, by default your personal information will be removed but your annotations will remain on the system. To completely remove all of your annotations, click on the checkbox next to `Also remove any annotations, projects and documents that I own:`. +* Click the `Unlock` button. +* Then click `Delete` to remove your account. + diff --git a/docs/versioned/2.1.1/developerguide/README.md b/docs/versioned/2.1.1/developerguide/README.md new file mode 100644 index 00000000..ef38fc97 --- /dev/null +++ b/docs/versioned/2.1.1/developerguide/README.md @@ -0,0 +1,282 @@ +# Developer guide + +## Architecture +``` +├── .github/workflows/ # github actions workflow files +├── teamware/ # Django project +│   └── settings/ +├── backend/ # Django app +├── cypress/ # integration test configurations +├── docs/ # documentation +├── examples/ # example data files +├── frontend/ # all frontend, in VueJS framework +├── nginx/ # Nginx configurations +| +# Top level directory contains scripts for management and deployment, +# main project package.json, python requirements, docker configs +├── build-images.sh +├── deploy.sh +├── create-django-db.sh +├── docker-compose.yml +├── Dockerfile +├── generate-docker-env.sh +├── manage.py +├── migrate-integration.sh +├── package.json +├── package-lock.json +├── pytest.ini +├── README.md +├── 
requirements-dev.txt +├── requirements.txt +└── run-server.sh + +``` + +## Installation for development + +The service depends on a combination of python and javascript libraries. We recommend developing inside a `conda` environment as it is able to install +python libraries and nodejs which is used to install javascript libraries. + +* Install anaconda/miniconda +* Create a blank virtual conda env + ```bash + $ conda create -n teamware python=3.9 + ``` +* Activate conda environment + ```bash + $ source activate teamware + # or + $ conda activate teamware + ``` +* Install python dependencies in conda environment using pip + ```bash + (teamware)$ pip install -r requirements.txt -r requirements-dev.txt + ``` +* Install nodejs, postgresql and openssl in the conda environment + ```bash + (teamware)$ conda install -y -c conda-forge postgresql=14.* + (teamware)$ conda install -y -c conda-forge nodejs=18.* + ``` +* Install nodejs dependencies + ```bash + (teamware)$ npm install + ``` + +Set up a new postgreSQL database and user for development: +``` +# Create a new directory for the db data and initialise +mkdir -p pgsql/data +initdb -D pgsql/data + +# Launch postgres in the background +postgres -p 5432 -D pgsql/data & + +# Create a DB user, you'll be prompted to input password, "password" is the default in teamware/settings/base.py for development +createuser -p 5432 -P user --createdb + +# Create the teamware_db database, owned by the DB user created above +createdb -p 5432 -O user teamware_db + +# Migrate & create database tables +python manage.py migrate + +# create a new superuser - when prompted enter a username and password for the db superuser +python manage.py createsuperuser +``` + +## Updating packages +To update packages after a merge, run the following commands: + +```bash +# Activate the conda environment +source activate teamware +# Update any packages changed in the python requirements.txt and requirements-dev.txt files +pip install -r requirements.txt -r requirements-dev.txt 
+# Update any packages changed in package.json +npm install +``` + +## Development server +The application uses django's dev server to serve page contents and run the RPC API, it also uses Vue CLI's +development server to serve dynamic assets such as javascript or stylesheets allowing for hot-reloading +during development. + +To run both servers together: + + ```bash + npm run serve + ``` + +To run separately: + +* Django server + ```bash + npm run serve:backend + ``` +* Vue CLI dev server + ```bash + npm run serve:frontend + ``` + +## Deploying a development version using Docker +Deployment is via [docker-compose](https://docs.docker.com/compose/), using [NGINX](https://www.nginx.com/) to serve static content, a separate [postgreSQL](https://hub.docker.com/_/postgres) service containing the database and a database backup service (see `docker-compose.yml` for details). Pre-built images can be run using most versions of Docker but _building_ images requires `docker buildx`, which means either Docker Desktop or version 19.03 or later of Docker Engine. + +1. Run `./generate-docker-env.sh` to create a `.env` file containing randomly generated secrets which are mounted as environment variables into the container. See [below](#env-config) for details. + +2. Then build the images via: + ```bash + ./build-images.sh + ``` + +3. then deploy the stack with + + ```bash + ./deploy.sh production # (or prod) to deploy with production settings + ./deploy.sh staging # (or stag) to deploy with staging settings + ``` + +To bring the stack down, run `docker-compose down`, using the `-v` flag to destroy the database volume (be careful with this). + +### Configuration using environment variables (.env file) + +To allow the app to be easily configured between instances especially inside containers, many of the app's configuration can be done through environment variables. + +Run `./generate-docker-env.sh` to generate a `.env` file with all configurable environment parameters. 
+ +To set values for your own deployment, add values to the variables in `.env`. Most existing values will be kept after running `generate-docker-env.sh`; see comments in `.env` for specific details. Anything that is left blank will be filled with a default value. Passwords and keys are filled with auto-generated random values. + +Existing `.env` files are copied into a new file named `saved-env.` by `generate-docker-env.sh`. + +### Backups + +In a docker-compose based deployment, backups of the database are managed by the service `pgbackups` which uses the [`prodrigestivill/postgres-backup-local:12`](https://hub.docker.com/r/prodrigestivill/postgres-backup-local) image. +By default, backups are taken of the database daily, and the `docker-compose.yml` contains settings for the number of backups kept under the options for the `pgbackups` service. +Backups are stored as a gzipped SQL dump from the database. + +#### Taking a manual backup + +A shell script is provided for manually triggering a backup snapshot. +From the main project directory run + +```sh +$ ./backup_manual.sh +``` + +This uses the `pgbackups` service and all settings and environment variables it is configured with in `docker-compose.yml`, so backups will be taken to the same location as configured for the main backup schedule. + +#### Restoring from a backup +1. Locate the backup file (`*.sql.gz`) on your system that you would like to restore from. +2. Make sure that the stack is down: from the main project directory run `docker-compose down`. +3. Run the backup restore shell script, passing in the path to your backup file as the only argument: + +```sh +$ ./backup_restore.sh path/to/my/backup.sql.gz +``` + +This will first launch the database container, then via Django's `dbshell` command, running in the `backend` service, execute a number of SQL commands before and after running all the SQL from the backup file. + +4. 
Redeploy the stack, via `./deploy.sh staging`, `./deploy.sh production`, or simply `docker compose up -d`, whichever is the case. +5. The database *should* be restored. + +## Configuration + +### Django settings files + +Django settings are located in the `teamware/settings` folder. The app will use the `base.py` settings by default +and this must be overridden depending on use. + +### Database +A SQLite3 database is used during development and during integration testing. + +For staging and production, postgreSQL is used, running from a `postgres-14` docker container. Settings are found in `teamware/settings/base.py` and `deployment.py` as well as being set as environment variables by `./generate-docker-env.sh` and passed to the container as configured in `docker-compose.yml`. + +In Kubernetes deployments the PostgreSQL database is installed using the Bitnami `postgresql` public chart. + + +### Sending E-mail +It's recommended to specify e-mail configurations through environment variables (`.env`), as these settings will include usernames and passwords that should not be tracked by version control. + +#### E-mail using SMTP +SMTP is supported as standard in Django. Add the following configurations with your own details +to the list of environment variables: + +```bash +DJANGO_EMAIL_BACKEND='django.core.mail.backends.smtp.EmailBackend' +DJANGO_EMAIL_HOST='myserver.com' +DJANGO_EMAIL_PORT=22 +DJANGO_EMAIL_HOST_USER='username' +DJANGO_EMAIL_HOST_PASSWORD='password' +``` + +#### E-mail using Google API +The [django-gmailapi-backend](https://github.com/dolfim/django-gmailapi-backend) library +has been added to allow sending of mail through Google's API as sending through SMTP is disabled as standard. + +Unlike with SMTP, Google's API requires OAuth authentication which means a project and a credential have to be +created through Google's cloud console. 
+ +* More information on the Gmail API: [https://developers.google.com/gmail/api/guides/sending](https://developers.google.com/gmail/api/guides/sending) +* OAuth credentials for sending emails: [https://github.com/google/gmail-oauth2-tools/wiki/OAuth2DotPyRunThrough](https://github.com/google/gmail-oauth2-tools/wiki/OAuth2DotPyRunThrough) + +This package includes the script linked in the documentation above, which simplifies the setup of the API credentials. The following outlines the key steps: + +1. Create a project in the Google developer console, [https://console.cloud.google.com/](https://console.cloud.google.com/) +2. Enable the Gmail API +3. Create OAuth 2.0 credentials, you'll likely want to create a `Desktop` +4. Create a valid refresh_token using the helper script included in the package: + ```bash + gmail_oauth2 --generate_oauth2_token \ + --client_id="" \ + --client_secret="" \ + --scope="https://www.googleapis.com/auth/gmail.send" + ``` +5. Add the created credentials and tokens to the environment variable as shown below: + ```bash + DJANGO_EMAIL_BACKEND='gmailapi_backend.mail.GmailBackend' + DJANGO_GMAIL_API_CLIENT_ID='google_assigned_id' + DJANGO_GMAIL_API_CLIENT_SECRET='google_assigned_secret' + DJANGO_GMAIL_API_REFRESH_TOKEN='google_assigned_token' + ``` + + +#### Teamware Privacy Policy and Terms & Conditions + +Teamware includes a default privacy policy and terms & conditions, which are required for running the application. + +The default privacy policy is intended to be compliant with UK GDPR regulations, which may comply with the rights of users of your deployment, however it is your responsibility to ensure that this is the case. + +If the default privacy policy covers your use case, then you will need to include configuration for a few contact details. 
+ +Contact details are required for the **host** and the **administrator**: the **host** is the organisation or individual responsible for managing the deployment of the teamware instance and the **administrator** is the organisation or individual responsible for managing users, projects and data on the instance. In many cases these roles will be filled by the same organisation or individual, so in this case specifying just the **host** details is sufficient. + +For deployment from source, set the following environment variables: + +* `PP_HOST_NAME` +* `PP_HOST_ADDRESS` +* `PP_HOST_CONTACT` +* `PP_ADMIN_NAME` +* `PP_ADMIN_ADDRESS` +* `PP_ADMIN_CONTACT` + +For deployment using docker-compose, set these values in `.env`. + +If the host and administrator are the same, you can just set the `PP_HOST_*` variables above which will be used for both. + +##### Including a custom Privacy Policy and/or Terms & Conditions + +If the default privacy policy or terms & conditions do not cover your use case, you can easily replace these with your own documents. + +If deploying from source, include markdown (`.md`) files in a `custom-policies` directory in the project root with the exact names `custom-policies/privacy-policy.md` and/or `custom-policies/terms-and-conditions.md` which will be rendered at the corresponding pages on the running web app. If you are not familiar with the Markdown language there are a number of free WYSIWYG-style editor tools available including [StackEdit](https://stackedit.io/app) (browser based) and [Zettlr](https://www.zettlr.com) (desktop app). + +If deploying with docker compose, place the `custom-policies` directory at the same location as the `docker-compose.yml` file before running `./deploy.sh` as above. + +An example custom privacy policy file contents might look like: + +```md +# Organisation X Teamware Privacy Policy +... +... +## Definitions of Roles and Terminology +... +... 
+``` diff --git a/docs/versioned/2.1.1/developerguide/api_docs.md b/docs/versioned/2.1.1/developerguide/api_docs.md new file mode 100644 index 00000000..03964b37 --- /dev/null +++ b/docs/versioned/2.1.1/developerguide/api_docs.md @@ -0,0 +1,1086 @@ +--- +sidebarDepth: 3 +--- + +# API Documentation + +## Using the JSONRPC endpoints + +::: tip +A single endpoint is used for all API requests, located at `/rpc` +::: + +The API used in the app complies with the JSON-RPC 2.0 spec. Requests should always be sent with `POST` and +contain a JSON request object in the body. The response will also be in the form of a JSON object. + +For example, to call the method `subtract(a, b)`, send a `POST` request to `/rpc` with the following JSON +in the body: + +```json +{ + "jsonrpc":"2.0", + "method":"subtract", + "params":[ + 42, + 23 + ], + "id":1 +} +``` + +Variables are passed as a list to the `params` field; in this case `a=42` and `b=23`. The `id` field in the top +level of the request object is the message ID. This ID value will be matched in the response; +it does not affect the method that is being called. 
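A minimal sketch of how a client might assemble the request object described above, in Python. The helper name `build_rpc_request` and the locally generated message IDs are assumptions made for illustration; they are not part of Teamware. Only the payload is built here, nothing is sent over the network.

```python
import json
from itertools import count

# Hypothetical helper, not part of Teamware: it only assembles the
# JSON-RPC 2.0 request object. Message IDs are generated locally.
_message_ids = count(1)


def build_rpc_request(method, *params):
    """Return a JSON-RPC 2.0 request object as a plain dictionary."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": list(params),  # positional arguments, in call order
        "id": next(_message_ids),  # echoed back in the response
    }


# Equivalent to the subtract(a, b) example, with a=42 and b=23.
request = build_rpc_request("subtract", 42, 23)

# Serialised form that would go in the body of the POST request to /rpc.
body = json.dumps(request)
print(body)
```

The `body` string is what a client would send in the `POST` request to `/rpc`; it can then compare the `id` in the response with the one it sent to pair the two messages.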
+ +The response will be as follows: + +```json +{ + "jsonrpc":"2.0", + "result":19, + "id":1 +} +``` + +In the case of errors, the response will contain an `error` field with error `code` and error `message`: + +```json +{ + "jsonrpc":"2.0", + "error":{ + "code":-32601, + "message":"Method not found" + }, + "id":"1" +} +``` + +The following are error codes used in the app: + +```python +PARSE_ERROR = -32700 +INVALID_REQUEST = -32600 +METHOD_NOT_FOUND = -32601 +INVALID_PARAMS = -32602 +INTERNAL_ERROR = -32603 +AUTHENTICATION_ERROR = -32000 +UNAUTHORIZED_ERROR = -32001 +``` + +## API Listing + + + +### initialise() + + +::: tip Description +Provide the initial context information to initialise the Teamware app + + context_object: + user: + isAuthenticated: bool + isManager: bool + isAdmin: bool + configs: + docFormatPref: bool + global_configs: + allowUserDelete: bool +::: + + + + + + + + +### is_authenticated() + + +::: tip Description +Checks that the current user has logged in. +::: + + + + + + + + +### login(payload) + + + + +#### Parameters + +* payload + + + + + + + +### logout() + + + + + + + + + +### register(payload) + + + + +#### Parameters + +* payload + + + + + + + +### generate_user_activation(username) + + + + +#### Parameters + +* username + + + + + + + +### activate_account(username,token) + + + + +#### Parameters + +* username + +* token + + + + + + + +### generate_password_reset(username) + + + + +#### Parameters + +* username + + + + + + + +### reset_password(username,token,new_password) + + + + +#### Parameters + +* username + +* token + +* new_password + + + + + + + +### change_password(payload) + + + + +#### Parameters + +* payload + + + + + + + +### change_email(payload) + + + + +#### Parameters + +* payload + + + + + + + +### set_user_receive_mail_notifications(do_receive_notifications) + + + + +#### Parameters + +* do_receive_notifications + + + + + + + +### set_user_document_format_preference(doc_preference) + + + + +#### Parameters + +* 
doc_preference + + + + + + + +### get_user_details() + + + + + + + + + +### get_user_annotated_projects() + + +::: tip Description +Gets a list of projects that the user has annotated +::: + + + + + + + + +### get_user_annotations_in_project(project_id,current_page,page_size) + + +::: tip Description +Gets a list of documents in a project where the user has performed annotations in. + :param project_id: The id of the project to query + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* project_id + +* current_page + +* page_size + + + + + + + +### user_delete_personal_information() + + + + + + + + + +### user_delete_account() + + + + + + + + + +### create_project() + + + + + + + + + +### delete_project(project_id) + + + + +#### Parameters + +* project_id + + + + + + + +### update_project(project_dict) + + + + +#### Parameters + +* project_dict + + + + + + + +### get_project(project_id) + + + + +#### Parameters + +* project_id + + + + + + + +### clone_project(project_id) + + + + +#### Parameters + +* project_id + + + + + + + +### import_project_config(pk,project_dict) + + + + +#### Parameters + +* pk + +* project_dict + + + + + + + +### export_project_config(pk) + + + + +#### Parameters + +* pk + + + + + + + +### get_projects(current_page,page_size,filters) + + +::: tip Description +Gets the list of projects. Query result can be limited by using current_page and page_size and sorted + by using filters. 
+ + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :param filters: Filter option used to search project, currently only string is used to search + for project title + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* current_page + +* page_size + +* filters + + + + + + + +### get_project_documents(project_id,current_page,page_size,filters) + + +::: tip Description +Gets the list of documents and its annotations. Query result can be limited by using current_page and page_size + and sorted by using filters + + :param project_id: The id of the project that the documents belong to, is a required variable + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :param filters: Filter currently only searches for ID of documents + for project title + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* project_id + +* current_page + +* page_size + +* filters + + + + + + + +### get_project_test_documents(project_id,current_page,page_size,filters) + + +::: tip Description +Gets the list of documents and its annotations. 
Query result can be limited by using current_page and page_size + and sorted by using filters + + :param project_id: The id of the project that the documents belong to, is a required variable + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :param filters: Filter currently only searches for ID of documents + for project title + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* project_id + +* current_page + +* page_size + +* filters + + + + + + + +### get_project_training_documents(project_id,current_page,page_size,filters) + + +::: tip Description +Gets the list of documents and its annotations. Query result can be limited by using current_page and page_size + and sorted by using filters + + :param project_id: The id of the project that the documents belong to, is a required variable + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :param filters: Filter currently only searches for ID of documents + for project title + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* project_id + +* current_page + +* page_size + +* filters + + + + + + + +### add_project_document(project_id,document_data) + + + + +#### Parameters + +* project_id + +* document_data + + + + + + + +### add_project_test_document(project_id,document_data) + + + + +#### Parameters + +* project_id + +* document_data + + + + + + + +### add_project_training_document(project_id,document_data) + + + + +#### Parameters + +* project_id + +* document_data + + + + + + + +### add_document_annotation(doc_id,annotation_data) + + + + +#### Parameters + +* doc_id + +* annotation_data + + + + + + + +### get_annotations(project_id) + + +::: tip Description +Serialize project annotations as GATENLP format JSON using 
the python-gatenlp interface. +::: + + + +#### Parameters + +* project_id + + + + + + + +### delete_documents_and_annotations(doc_id_ary,anno_id_ary) + + + + +#### Parameters + +* doc_id_ary + +* anno_id_ary + + + + + + + +### get_possible_annotators(proj_id) + + + + +#### Parameters + +* proj_id + + + + + + + +### get_project_annotators(proj_id) + + + + +#### Parameters + +* proj_id + + + + + + + +### add_project_annotator(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### make_project_annotator_active(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### project_annotator_allow_annotation(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### remove_project_annotator(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### reject_project_annotator(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### get_annotation_timings(proj_id) + + + + +#### Parameters + +* proj_id + + + + + + + +### delete_annotation_change_history(annotation_change_history_id) + + + + +#### Parameters + +* annotation_change_history_id + + + + + + + +### get_annotation_task() + + +::: tip Description +Gets the annotator's current task, returns a dictionary about the annotation task that contains all the information + needed to render the Annotate view. 
+::: + + + + + + + + +### get_annotation_task_with_id(annotation_id) + + +::: tip Description +Get annotation task dictionary for a specific annotation_id, must belong to the annotator (or is a manager or above) +::: + + + +#### Parameters + +* annotation_id + + + + + + + +### complete_annotation_task(annotation_id,annotation_data,elapsed_time) + + +::: tip Description +Complete the annotator's current task +::: + + + +#### Parameters + +* annotation_id + +* annotation_data + +* elapsed_time + + + + + + + +### reject_annotation_task(annotation_id) + + +::: tip Description +Reject the annotator's current task +::: + + + +#### Parameters + +* annotation_id + + + + + + + +### change_annotation(annotation_id,new_data) + + +::: tip Description +Adds annotation data to history +::: + + + +#### Parameters + +* annotation_id + +* new_data + + + + + + + +### get_document(document_id) + + +::: tip Description +Obsolete: to be deleted +::: + + + +#### Parameters + +* document_id + + + + + + + +### get_annotation(annotation_id) + + +::: tip Description +Obsolete: to be deleted +::: + + + +#### Parameters + +* annotation_id + + + + + + + +### annotator_leave_project() + + +::: tip Description +Allow annotator to leave their currently associated project. 
+::: + + + + + + + + +### get_all_users() + + + + + + + + + +### get_user(username) + + + + +#### Parameters + +* username + + + + + + + +### admin_update_user(user_dict) + + + + +#### Parameters + +* user_dict + + + + + + + +### admin_update_user_password(username,password) + + + + +#### Parameters + +* username + +* password + + + + + + + +### admin_delete_user_personal_information(username) + + + + +#### Parameters + +* username + + + + + + + +### admin_delete_user(username) + + + + +#### Parameters + +* username + + + + + + + +### get_privacy_policy_details() + + + + + + + + + +### get_endpoint_listing() + + + + + + + + + + + diff --git a/docs/versioned/2.1.1/developerguide/documentation.md b/docs/versioned/2.1.1/developerguide/documentation.md new file mode 100644 index 00000000..3de09904 --- /dev/null +++ b/docs/versioned/2.1.1/developerguide/documentation.md @@ -0,0 +1,61 @@ +# Managing and versioning documentation + +Documentation versioning is managed by the custom node script located at `docs/manage_versions.js`. Versions of the documentation can be archived and the entire documentation site can be built using the script. + +Various configuration parameters used for management of documentation versioning can be found in `docs/docs.config.js`. + +## Installing dependencies required to serve the documentation site + +The documentation uses vuepress and other libraries which have to be installed separately by running the following command from the root of the project: + +```bash +npm run install:docs +``` + +## Editing the documentation + +The latest version of the documentation is located at `/docs/docs`. The archived (versioned) documentation is located in `/docs/versioned/version_number`. + +Use the following command to live preview the latest version of the documentation: + +``` +npm run serve:docs +``` + +Note that this will not work with other versioned docs as they are managed as a separate site. 
To live preview versioned documentation, use the command (replace `version_num` with the version you'd like to preview): + +``` +vuepress dev docs/versioned/version_num +``` + +## Creating a new documentation version + +To create a version of the documentation, run the command: + +``` +npm run docs:create_version +``` + +This creates a copy of the current set of documentation in `/docs/docs` and places it at `/docs/versioned/version_num`. The version number in `package.json` is used for the documentation version. + +Each set of documentation can be considered as a separate vuepress site. Each one has a `.vuepress/versions.json` file that contains the listing of all versions, allowing them to link to each other. + +Note: Versions can also be created manually by running the command: + +``` +# Replace version_num with the version you'd like to create +node docs/manage_versions.js create version_num +``` + + +## Building documentation site + +To build the documentation site, the previous documentation build command is used: + +``` +npm run build:docs +``` + +## Implementation of the version selector UI + +A partial override of the default Vuepress theme was needed to add a custom component to the navigation bar. The modified version of the `NavBar` component can be found in `/docs/docs/.vuepress/theme/components/NavBar.vue`. The modified NavBar uses the `VersionSelector` component (`/docs/docs/.vuepress/theme/components/VersionSelector.vue`), which reads the `.vuepress/versions.json` file of each set of documentation. diff --git a/docs/versioned/2.1.1/developerguide/frontend.md b/docs/versioned/2.1.1/developerguide/frontend.md new file mode 100644 index 00000000..ae2aeff0 --- /dev/null +++ b/docs/versioned/2.1.1/developerguide/frontend.md @@ -0,0 +1,146 @@ +# Frontend + +The web GUI of Teamware is built with [vue.js](https://vuejs.org) version 2.7.x. + +[Bootstrap](https://getbootstrap.com/) (and [Bootstrap vue](https://bootstrap-vue.org/)) provides the visual styling. 
+ +[Vite.js](https://vitejs.dev/) is used to bundle Vue code and other javascript dependencies for deployment and serves as a frontend dev server (which runs alongside the django dev server) while testing or debugging. + +## Getting started + +### Installation +``` +npm install +``` + +### Compiles and hot-reloads for development +``` +npm run serve +``` + +### Compiles and minifies for production +``` +npm run build +``` + +### Testing + +**Tools used for testing:** +* [vitest](https://vitest.dev) - Used for unit testing (code without UI components) +* [cypress](https://docs.cypress.io) - Used for tests that contain (Vue) UI components +* [Vue test utils](https://vue-test-utils.vuejs.org) - Used for rendering Vue components so that they can be mounted for unit testing. Officially recommended by Vue.js. + +* Tests for the frontend are all located in the `/frontend/tests` folder. + * Unit test files should all be placed in the `/frontend/tests/unit/` folder and have the extension `.spec.js`. + * Component test files should all be placed in the `/frontend/tests/component` folder and have the extension `.cy.js` +* Test fixtures (data used in running the tests) are placed in the `/examples` folder; this folder is shared with the integration tests + +To run all frontend tests (unit and component tests): + +``` +npm run test +``` + +To run unit tests only: + +``` +npm run test:unit +``` + +To run component tests only: + +``` +npm run test:component +``` + +## Notes when coming from the previous version <=2.0.0 + +- The `@` alias can still be used when doing module imports but file extensions should now be used when importing `.vue` files e.g. + - Before: `import DeleteModal from "@/components/DeleteModal"` + - Now: `import DeleteModal from "@/components/DeleteModal.vue"` +- For code that is intended to run on the browser, e.g. in all `.vue` files, imports should use the ES 6 compliant `import` command and not node/commonjs's `require` + - **Exceptions for code that is run directly by node**, e.g. 
scripts used in the build chain, config files and test files used by build tools that run on node (e.g. vuepress or cypress) + + +## Explanation of the frontend + +### Vue and Vite + +Instead of separating html, css and javascript files, Vue has its own `single-file component` format, normally with a `.vue` extension ([reason why this file format is used](https://vuejs.org/guide/scaling-up/sfc.html)). Here is an example `.vue` file: + +```vue + + + + + +``` + +This means that `.vue` files cannot be directly imported into a standard html page. A tool has to be used to convert `.vue` files into standard javascript and/or css files; this is where [Vite.js](https://vitejs.dev/) comes in. + +[Vite.js](https://vitejs.dev/) is a tool that, amongst many other things, provides a dev server allowing hot module replacement (ability to immediately see changes in the UI during development) and bundling of javascript modules and other resources (css, images, etc.) i.e. not having to individually import each javascript file and its dependencies from the main page. A [Vue plugin](https://github.com/vitejs/vite-plugin-vue2) is used to automatically convert `.vue` files into plain javascript as part of the bundling process. + +### App entrypoint (main.js) and routing + +The application's main entrypoint is `/frontend/src/main.js`, which loads dependencies such as Vue and Bootstrap Vue and loads the main component `AnnotationApp.vue` into an html page that contains a `
` tag. + +The `AnnotationApp.vue` component contains the special `<router-view>` tag ([vue router](https://router.vuejs.org/)) which allows us to map url paths to specific vue components. The routing configuration can be found in `/frontend/src/router/index.js`, for example: + +```js +const routes = [ + { + path: '/', + name: 'Home', + component: Home, + meta: {guest: true}, + }, +... +``` + +The route shown above maps the root path e.g. `https://your-deployed-teamware-domain.com/` to the `Home.vue` component. Specifically, when pointing your browser to that path, the `Home.vue` component is inserted inside `<router-view>`. + +### index.html, templates and bundling + +An html page is required to place our application in. Teamware uses Django to serve up the main html page which is located at `/frontend/templates/index.html` (see `MainView` class in `/backend/views.py`). This `index.html` page has to know where to load the generated javascript files from. Where these files are located differs depending on whether you're running the vite development server or using vite's statically built files. + +#### Using vite's development server (Django's `settings.FRONTEND_DEV_SERVER_USE` is `True`) +During development we expect to be running the vite dev server alongside the django server (when running `npm run serve` from the root of the project). In this case `index.html` imports javascript directly from the vite dev server: + +```html + + +``` + +This applies when running the `base`, `test` and `integration` django configurations. + +#### Using vite's statically built assets (Django's `settings.FRONTEND_DEV_SERVER_USE` is `False`) +When deploying the application, vite converts `.vue` files into plain javascript and bundles them into the `/frontend/dist/static` directory. The `/frontend/src/main.js` becomes `/frontend/dist/static/assets/main-bb58d055.js`. 
The scripts are imported as static assets instead of going through the vite server, for example: + +```html + + +``` + +This applies when running the `deployment`, `docker-test` and `docker-integration` django configurations. + +#### index.html generation + +You may have noticed that a hash is added to the generated asset files (e.g. `main-bb58d055.js`) and this hash changes every time Vite builds the code. This means the `index.html` must also be re-generated after every Vite build. + +A simple build script, `/frontend/build_template.js`, runs after every vite build and performs this generation by taking the base template `/frontend/base_index.html`, merging it with Vite's generated manifest `/frontend/dist/manifest.json` and writing the output, with the correct import paths, to `/frontend/templates/index.html`. + diff --git a/docs/versioned/2.1.1/developerguide/releases.md b/docs/versioned/2.1.1/developerguide/releases.md new file mode 100644 index 00000000..8a08d12b --- /dev/null +++ b/docs/versioned/2.1.1/developerguide/releases.md @@ -0,0 +1,18 @@ +# Managing Releases + +*These instructions are primarily intended for the maintainers of Teamware.* + +Note: Releases are always made from the `master` branch of the repository. + +## Steps to making a release + +1. **Update the changelog** - This has to be done manually: go through any pull requests to `dev` since the last release. + - On the GitHub pull requests page, use the search term `is:pr merged:>=yyyy-mm-dd` to find all PRs merged since the date of the last version change. + - Include the changes in the `CHANGELOG.md` file; the changelog section _MUST_ begin with a level-two heading that starts with the relevant version number in square brackets (`## [N.M.P] Optional descriptive suffix`) as the GitHub workflow that creates a release from the eventual tag depends on this pattern to find the right release notes. Each main item within the changelog should have a link to the originating PR e.g. 
\[#123\](https://github.com/GateNLP/gate-teamware/pull/123). +1. **Update and check the version numbers** - from the teamware directory run `python version.py check` to check whether all version numbers are up to date. If not, update the master `VERSION` file and run `python version.py update` to update all other version numbers and commit the result. Alternatively, run `python version.py update ` where `` is the version number to update to, e.g. `python version.py update 2.1.0`. Note that `version.py` requires `pyyaml` for reading `CITATION.cff`, `pyyaml` is included in Teamware's dependencies. +1. **Create a version of the documentation** - Run `npm run docs:create_version`, this will archive the current version of the documentation using the version number in `package.json`. +1. **Create a pull request from `dev` to `master`** including any changes to `CHANGELOG.md`, `VERSION`. +1. **Create a tag** - Once the dev-to-master pull request has been merged, create a tag from the resulting `master` branch named `vN.M.P` (i.e. the new version number prefixed with the letter `v`). This will trigger two GitHub workflows: + - one that builds versioned Docker images for this release and pushes them to `ghcr.io`, updating the `latest` image tag to point to the new release + - one that creates a "release" on GitHub with the necessary artifacts to make the `https://gate.ac.uk/get-teamware.sh` installation mechanism work correctly. The release notes for this release will be generated by extracting the matching section from `CHANGELOG.md`. +1. **Update the Helm chart** - Create a new branch on [https://github.com/GateNLP/charts](https://github.com/GateNLP/charts) to update the `appVersion` of the `gate-teamware` Helm chart to match the version that was just created by the tag workflow. You must also update the chart `version`, bumping the major version number if the new chart is not backwards-compatible with the old. 
Submit a pull request to the `main` branch, which will publish the new chart when it is merged. diff --git a/docs/versioned/2.1.1/developerguide/testing.md b/docs/versioned/2.1.1/developerguide/testing.md new file mode 100644 index 00000000..578501b2 --- /dev/null +++ b/docs/versioned/2.1.1/developerguide/testing.md @@ -0,0 +1,182 @@ +# Testing +All the tests can be run using the following command: + +```bash +npm run test +``` + +## Backend Testing +Pytest is used for testing the backend. + +```bash +npm run test:backend +``` + +### Backend test files + +* Unit test files are located in `/backend/tests` + +## Frontend testing +[Jest](https://jestjs.io/) is used for frontend testing. +The [Vue testing-library](https://testing-library.com/docs/vue-testing-library/intro/) is used for testing +Vue components. + +```bash +npm run test:frontend +``` + +### Frontend test files + +* Frontend test files are located in `/frontend/tests/unit` and should have the extension `.spec.js` + +### Testing JS functions + +```javascript +describe("Description of a group of tests to be run", () =>{ + + beforeAll(() =>{ + //The code here is run once, before all the tests in this group + }) + + it("A single test's description", async () =>{ + + // Assertions are done with the expect() function e.g. + let funcOutput = 30 + 10 + expect(funcOutput).toBe(40) + + + }) +}) + +``` + +### Mocking JS classes + +This is an example of a mock harness for the JRPCClient class. + +A mock file is created inside a ``__mock__`` directory placed next to the file that's being mocked, e.g. +for our JRPCClient class at `/frontend/src/jrpc/index.js`, the mock file is `/frontend/src/jrpc/__mock__/index.js`. 
+ + +Inside the mock file `/frontend/src/jrpc/__mock__/index.js`: +```javascript +// Mocking jrpc/index.js +//Mocking the JRPCClient class +//Replacing the call function with a custom mockCall function +export const mockCall = jest.fn(()=> 30); +const mock = jest.fn().mockImplementation(() => { + return {call: mockCall}; +}); + +export default mock; +``` + + +Inside the test file `*.spec.js`: +```javascript +import JRPCClient from "@/jrpc"; +jest.mock('@/jrpc') + +import store from '@/store' +//Example on how to mock the jrpc call + +describe("Vuex functions testing", () =>{ + + beforeAll(() =>{ + + //Re-implement custom mock call implementation if needed + JRPCClient.mockImplementation(()=>{ + return { + call(){ + return 50 + } + } + }) + + }) + + it("testfunc", async () =>{ + + const noutput = await store.dispatch("testnormal") + expect(noutput).toBe("Hello world") + + const aoutput = await store.dispatch("testasync") + expect(aoutput).toBe("Hello world") + + const rpc = new JRPCClient("/") + const result = await rpc.call("some param") + expect(result).toBe(50) + + }) +}) +``` + +### Testing Vue components + + +```javascript +//Example of how a component could be tested +import { render, fireEvent } from '@testing-library/vue' + + +import HelloWorld from '@/components/HelloWorld.vue' + +//Testing a component e.g. HelloWorld +describe('HelloWorld.vue', () => { + + it('renders props.msg when passed', () => { + const msg = 'new message' + const { getByText } = render(HelloWorld) + + getByText("Installed CLI Plugins") + }) +}) + +``` + + +## Integration testing +[Cypress](https://www.cypress.io/) is used for integration testing. 
+ +The integration settings are located at `teamware/settings/integration.py` + +To run the integration test: +```bash +npm run test:integration +``` + +The test can also be run in **interactive mode** using: + +```bash +npm run serve:cypressintegration +``` + +### Integration test files +Files related to integration testing are located in `/cypress` + +* Test files are located in the `/cypress/integration` directory and should have the extension `.spec.js`. + +### Re-seeding the database + +The command `npm run migrate:integration` resets the database and performs migration, use with `beforeEach` to run it +before every test case in a suite: + +```js +describe('Example test suite', () => { + + beforeEach(() => { + // Resets the database every time before + // the test is run + cy.exec('npm run migrate:integration') + }) + + it('Test case 1', () => { + // Test something + }) + + it('Test case 2', () => { + // Test something + }) +}) +``` + diff --git a/docs/versioned/2.1.1/img/gate-teamware-logo.svg b/docs/versioned/2.1.1/img/gate-teamware-logo.svg new file mode 100644 index 00000000..12385947 --- /dev/null +++ b/docs/versioned/2.1.1/img/gate-teamware-logo.svg @@ -0,0 +1,79 @@ + + + + diff --git a/docs/versioned/2.1.1/manageradminguide/README.md b/docs/versioned/2.1.1/manageradminguide/README.md new file mode 100644 index 00000000..7a70a19f --- /dev/null +++ b/docs/versioned/2.1.1/manageradminguide/README.md @@ -0,0 +1,45 @@ +# GATE Teamware Overview + +## User roles + +There are three types of users in GATE Teamware, [annotators](#annotators), [managers](#managers) +and [admins](#admins). + +### Annotators + +Annotator is the default role when signing up to Teamware. An annotator can be recruited into +annotation projects and annotate documents. + + +### Managers + +Managers can create, view and modify annotation projects. They can also recruit annotators to a project. 
+ +### Admins + +On top of what managers can do, admins can also manage the users in the system and elevate them to +managers or admins. + +## Annotation Projects, Documents and Annotations + +Projects, documents and annotations form the core of the application. + +### Projects + +An annotation project contains a configuration of how annotations are to be captured, the documents and their +annotations and the recruited annotators. + + +### Documents + +A document in the application refers to an individual set of arbitrary text that's to be annotated. A document +is stored as an arbitrary JSON object and can represent various things such as a single post (e.g. a tweet +or a post from reddit), a pair of source post and reply or a part of an HTML web page. + + +### Annotations + +An annotation represents a single annotation task against a single document. Like the document, +an annotation is stored as an arbitrary JSON object and can have any arbitrary structure. + + diff --git a/docs/versioned/2.1.1/manageradminguide/annotators_management.md b/docs/versioned/2.1.1/manageradminguide/annotators_management.md new file mode 100644 index 00000000..cb4c79b0 --- /dev/null +++ b/docs/versioned/2.1.1/manageradminguide/annotators_management.md @@ -0,0 +1,13 @@ +# Annotators management + +The **Annotators** tab in the **Project management** page allows the viewing and management of annotators in the project. + +Add annotators to the project by clicking on the list of names in the right column. Current annotators +can be removed by clicking on the names in the left column. Removing annotators does not delete their +completed annotations but will stop their current pending annotation task. + +An annotator can only be recruited into **one project at a time**. + +Once an annotator has annotated a proportion of documents in the project (specified in project configuration), they will +be deemed to have completed all their annotation tasks and automatically be removed from the project. 
This frees them to be +recruited in another project. diff --git a/docs/versioned/2.1.1/manageradminguide/config_examples.js b/docs/versioned/2.1.1/manageradminguide/config_examples.js new file mode 100644 index 00000000..d6e40454 --- /dev/null +++ b/docs/versioned/2.1.1/manageradminguide/config_examples.js @@ -0,0 +1,332 @@ +export default { + config1: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + } + ], + config2: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + }, + { + "name": "opinion", + "type": "text", + "title": "What's your opinion of the above text?", + "optional": true + } + ], + configDisplay: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + } + ], + configDisplayHtmlNoHtml: [ + { + "name": "htmldisplay", + "type": "html", + "text": "No HTML: {{text}}
HTML: {{{text}}}" + } + ], + configDisplayCustomFieldnames: [ + { + "name": "htmldisplay", + "type": "html", + "text": "Custom field: {{customField}}
Another custom field: {{{anotherCustomField}}}
Subfield: {{{subfield.subfieldContent}}}" + } + ], + configDisplayPreserveNewlines: [ + { + "name": "htmldisplay", + "type": "html", + "text": "
{{text}}
" + } + ], + configTextInput: [ + { + "name": "mylabel", + "type": "text", + "optional": true, //Optional - Set if validation is not required + "regex": "regex string", //Optional - When specified, the regex pattern will used to validate the text + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + configTextarea: [ + { + "name": "mylabel", + "type": "textarea", + "optional": true, //Optional - Set if validation is not required + "regex": "regex string", //Optional - When specified, the regex pattern will used to validate the text + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + configRadio: [ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "orientation": "vertical", //Optional - default is "horizontal" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + configRadioHelpText: [ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "orientation": "vertical", //Optional - default is "horizontal" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1", "helptext": "Additional help text for 
option 1"}, + {"value": "value2", "label": "Text to show user 2", "helptext": "Additional help text for option 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } + ], + configCheckbox: [ + { + "name": "mylabel", + "type": "checkbox", + "optional": true, //Optional - Set if validation is not required + "orientation": "horizontal", //Optional - "horizontal" (default) or "vertical" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "minSelected": 1, //Optional - Specify the minimum number of options that must be selected + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } + ], + configSelector: [ + { + "name": "mylabel", + "type": "selector", + "optional": true, //Optional - Set if validation is not required + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } + ], + configRadioDict: [ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "options": { // The options can be specified as a
dictionary, ordering is not guaranteed + "value1": "Text to show user 1", + "value2": "Text to show user 2", + "value3": "Text to show user 3", + }, + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } + ], + + configDbpediaExample: [ + { + "name": "uri", + "type": "radio", + "title": "Select the most appropriate URI", + "options":[ + {"fromDocument": "candidates"}, + {"value": "none", "label": "None of the above"}, + {"value": "unknown", "label": "Cannot be determined without more context"} + ] + } + ], + docDbpediaExample: { + "text": "President Bush visited the air base yesterday...", + "candidates": [ + { + "value": "http://dbpedia.org/resource/George_W._Bush", + "label": "George W. Bush (Jnr)" + }, + { + "value": "http://dbpedia.org/resource/George_H._W._Bush", + "label": "George H. W. Bush (Snr)" + } + ] + }, + + configConditional1: [ + { + "name": "uri", + "type": "radio", + "title": "Select the most appropriate URI", + "options":[ + {"fromDocument": "candidates"}, + {"value": "other", "label": "Other"} + ] + }, + { + "name": "otherValue", + "type": "text", + "title": "Please specify another value", + "if": "annotation.uri == 'other'", + "regex": "^(https?|urn):", + "valError": "Please specify a URI (starting http:, https: or urn:)" + } + ], + configConditional2: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + }, + { + "name": "reason", + "type": "text", + "title": "Why do you disagree with the suggested value?", + "if": "annotation.sentiment !== 
document.preanno.sentiment" + } + ], + docsConditional2: [ + { + "text": "I love the thing!", + "preanno": { + "sentiment": "positive" + } + }, + { + "text": "I hate the thing!", + "preanno": { + "sentiment": "negative" + } + }, + { + "text": "The thing is ok, I guess...", + "preanno": { + "sentiment": "neutral" + } + } + ], + + + doc1: {text: "Sometext with html"}, + doc2: { + customField: "Content of custom field.", + anotherCustomField: "Content of another custom field.", + subfield: { + subfieldContent: "Content of a subfield." + } + }, + docPlainText: { + "text": "This is some text\n\nIt has line breaks that we want to preserve." + }, + configPreAnnotation: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "radio", + "type": "radio", + "title": "Test radio input", + "options": [ + {"value": "val1", "label": "Value 1"}, + {"value": "val2", "label": "Value 2"}, + {"value": "val3", "label": "Value 4"}, + {"value": "val4", "label": "Value 5"} + ], + "description": "Test radio description" + }, + { + "name": "checkbox", + "type": "checkbox", + "title": "Test checkbox input", + "options": [ + {"value": "val1", "label": "Value 1"}, + {"value": "val2", "label": "Value 2"}, + {"value": "val3", "label": "Value 4"}, + {"value": "val4", "label": "Value 5"} + ], + "description": "Test checkbox description" + }, + { + "name": "text", + "type": "text", + "title": "Test text input", + "description": "Test text description" + } + + ], + docPreAnnotation: { + "id": 12345, + "text": "Example document text", + "preannotation": { + "radio": "val1", + "checkbox": ["val1", "val3"], + "text": "Pre-annotation text value" + } + } + + +} diff --git a/docs/versioned/2.1.1/manageradminguide/documents_annotations_management.md b/docs/versioned/2.1.1/manageradminguide/documents_annotations_management.md new file mode 100644 index 00000000..ac010c9f --- /dev/null +++ b/docs/versioned/2.1.1/manageradminguide/documents_annotations_management.md @@ -0,0 
+1,272 @@ +# Documents & Annotations + +The **Documents & Annotations** tab in the **Project management** page allows the viewing and management of documents +and annotations related to the project. + +## Document & Annotation status + +### Annotation status + +Annotations can be in one of five states: + +* Annotation is completed - The annotator has completed this annotation task. +* Annotation is rejected - The annotator has chosen not to annotate the document. +* Annotation is timed out - The annotation task was not completed within the time specified in the project's configuration. The task is freed and can be assigned to another annotator. +* Annotation is aborted - The annotation task was aborted for a reason other than timing out, such as when an annotator with a pending task is removed from a project. +* Annotation is pending - The annotator has started the annotation task but has not completed it. + +### Document status + +Each document also displays a count of its annotations in each status: + +* 1 - Number of completed annotations in the document. +* 1 - Number of rejected annotations in the document. +* 1 - Number of timed out annotations in the document. +* 1 - Number of aborted annotations in the document. +* 1 - Number of pending annotations in the document. + +## Importing documents + +Documents can be imported using the **Import** button. The supported file types are: + +* `.json` - The app expects a list of documents (each represented as a dictionary object) + e.g. `[{"id": 1, "text": "Text1"}, ...]`. +* `.jsonl` - The app expects one document (represented as a dictionary object) per line. +* `.csv` - File must have a header row. It will be internally converted to JSON format. +* `.zip` - Can contain any number of `.json`, `.jsonl` and `.csv` files inside. + +### Importing documents with pre-annotation + +In the `Project Configuration` page, it is possible to set a field in which Teamware will look for pre-annotation. 
If +the field is found inside the document then the annotation form will be pre-filled with data provided in the document. + +The format for pre-annotation is exactly the same as the annotation output. You can see an example of generated +annotation by filling out the form in the `Annotation Preview` and observing the values in +the `Annotation Output Preview`. + + +For an example project configuration shown below, there are three captured labels named `radio`, `checkbox` and `text`: + +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "radio", + "type": "radio", + "title": "Test radio input", + "options": [ + {"value": "val1", "label": "Value 1"}, + {"value": "val2", "label": "Value 2"}, + {"value": "val3", "label": "Value 4"}, + {"value": "val4", "label": "Value 5"} + ], + "description": "Test radio description" + }, + { + "name": "checkbox", + "type": "checkbox", + "title": "Test checkbox input", + "options": [ + {"value": "val1", "label": "Value 1"}, + {"value": "val2", "label": "Value 2"}, + {"value": "val3", "label": "Value 4"}, + {"value": "val4", "label": "Value 5"} + ], + "description": "Test checkbox description" + }, + { + "name": "text", + "type": "text", + "title": "Test text input", + "description": "Test text description" + } +] +``` + +On the `Project Configuration` page, if the `Pre-annotation` field is set to `preannotation`, the annotation form will be pre-filled with +the content provided in the `preannotation` field of the document e.g.: + +```json +{ + "id": 12345, + "text": "Example document text", + "preannotation": { + "radio": "val1", + "checkbox": [ + "val1", + "val3" + ], + "text": "Pre-annotation text value" + } +} +``` + +The example of the pre-filled form can be seen by clicking on the `Preview` tab above. 
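The pre-annotation lookup described above can be sketched in a few lines of JavaScript. This is an illustration only, not Teamware's implementation: the function name `getPreAnnotation` is hypothetical, and the sketch assumes the configured pre-annotation field is a top-level key of the document.

```javascript
// Hypothetical helper: returns the values that would pre-fill the annotation form.
// Assumption of this sketch: `fieldName` is a top-level key of the document.
function getPreAnnotation(doc, fieldName) {
  if (!fieldName) return {};
  const value = doc[fieldName];
  // Only a plain object (label name -> value) can pre-fill the form.
  if (value === null || typeof value !== "object" || Array.isArray(value)) return {};
  return value;
}

const exampleDoc = {
  id: 12345,
  text: "Example document text",
  preannotation: {
    radio: "val1",
    checkbox: ["val1", "val3"],
    text: "Pre-annotation text value",
  },
};

console.log(getPreAnnotation(exampleDoc, "preannotation").radio); // val1
console.log(getPreAnnotation(exampleDoc, "missing")); // {}
```

A document without the configured field simply yields an empty object here, which corresponds to an annotation form with nothing pre-filled.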
+ + + + + + +### Importing Training and Test documents + +When importing documents for the training and testing phases, Teamware expects a field/column (called `gold` by default) +that contains the correct annotation response for each label and, only for training documents, an explanation. + +For example, if we're expecting a multi-choice label for doing sentiment classification with a widget named `sentiment` +and choices of `positive`, `negative` and `neutral`: + +```js +[ + { + "text": "What's my sentiment", + "gold": { + "sentiment": { + "value": "positive", // For this document, the correct value is positive + "explanation": "Because..." // Explanation is only given in the training phase and is optional in the test documents + } + } + } +] +``` + +In CSV: + +| text | gold.sentiment.value | gold.sentiment.explanation | +| --- | --- | --- | +| What's my sentiment | positive | Because... | + +### Guidance on CSV column headings + +It is recommended that: + +* Spaces are not used in column headings; use a dash (`-`), underscore (`_`) or camel case (e.g. fieldName) instead. +* The dot/full stop (`.`) is used to indicate hierarchical information, so don't use it if that's not what's intended. + This feature is explained below. + +Documents imported from a CSV file are converted to JSON for internal use in Teamware, and the reverse happens when +converting back to CSV. To allow a CSV to represent a hierarchical structure, a dot notation is used to indicate a +sub-field. + +In the following example, we can see that `gold` has a child field named `sentiment` which then has a child field +named `value`: + +| text | gold.sentiment.value | gold.sentiment.explanation | +| --- | --- | --- | +| What's my sentiment | positive | Because... 
| + +The above column headers will generate the following JSON: + +```js +[ + { + "text": "What's my sentiment", + "gold": { + "sentiment": { + "value": "positive", // For this document, the correct value is positive + "explanation": "Because..." // Explanation is only given in the training phase and is optional in the test documents + } + } + } +] +``` + +## Exporting documents + +Documents and annotations can be exported using the **Export** button. A zip file is generated containing files with 500 +documents each. You can choose how documents are exported: + +* `.json` & `.jsonl` - JSON or JSON Lines files can be generated in the format of: + * `raw` - Exports unmodified JSON. If you originally uploaded in GATE format then choose this option. + + An additional field named `annotation_sets` is added for storing annotations. The annotations are laid out in the + same way as GATE JSON format. For example if a document has been annotated by `user1` with labels and values + `text`:`Annotation text`, `radio`:`val3`, and `checkbox`:`["val2", "val4"]`: + + ```json + { + "id": 32, + "text": "Document text", + "text2": "Document text 2", + "feature1": "Feature text", + "annotation_sets":{ + "user1":{ + "name":"user1", + "annotations":[ + { + "type":"Document", + "start":0, + "end":10, + "id":0, + "features":{ + "label":{ + "text":"Annotation text", + "radio":"val3", + "checkbox":[ + "val2", + "val4" + ] + } + } + } + ], + "next_annid":1 + } + } + } + ``` + + * `gate` - Convert documents to GATE JSON format and export. A `name` field is added that takes the ID value from the + ID field specified in the project configuration. Fields apart from `text` and the ID field specified in the project + config are placed in the `features` field. An `annotation_sets` field is added for storing annotations. 
+ + For example in the case of this uploaded JSON document: + ```json + { + "id": 32, + "text": "Document text", + "text2": "Document text 2", + "feature1": "Feature text" + } + ``` + The generated output is as follows. The annotations are formatted the same as the `raw` output above: + ```json + { + "name": 32, + "text": "Document text", + "features": { + "text2": "Document text 2", + "feature1": "Feature text" + }, + "offset_type":"p", + "annotation_sets": {...} + } + ``` +* `.csv` - The JSON documents will be flattened into CSV's column-based format. Annotations are added as additional + columns with the header of `annotations.username.label`. + +## Deleting documents and annotations + +Click the top-left corner of a document or annotation to select it, then click the +**Delete** button to delete the selection. + +::: tip + +Selecting a document also selects all its associated annotations. + +::: + + + diff --git a/docs/versioned/2.1.1/manageradminguide/project_config.md b/docs/versioned/2.1.1/manageradminguide/project_config.md new file mode 100644 index 00000000..d3d4ea3d --- /dev/null +++ b/docs/versioned/2.1.1/manageradminguide/project_config.md @@ -0,0 +1,684 @@ +--- +sidebarDepth: 3 +--- + +# Project configuration + +The **Configuration** tab in the **Project management** page allows you to change project settings including what +annotations are captured. + +Project configurations can be imported and exported in the format of a JSON file. + +The project can also be cloned (have its configuration copied to a new project). Note that cloning does not copy +documents, annotations or annotators to the new project. + +## Configuration fields + +* **Name** - The name of this annotation project. +* **Description** - The description of this annotation project that will be shown to annotators. Supports markdown and + HTML. +* **Annotator guideline** - The guideline for this annotation project that will be shown to annotators. 
Supports + markdown and HTML. +* **Annotations per document** - The project completes when each document in this annotation project has this + number of valid annotations. When a project completes, all project annotators will be un-recruited and be allowed to + annotate other projects. +* **Maximum proportion of documents annotated per annotator (between 0 and 1)** - A single annotator cannot annotate + more than this proportion of documents. +* **Timeout for pending annotation tasks (minutes)** - Specify the number of minutes a user has to complete an + annotation task (i.e. annotating a single document). +* **Reject documents** - Switching this off means that annotators for this project will be unable to reject documents. +* **Document ID field** - The field in your uploaded documents that is used as a unique identifier. GATE's JSON format + uses the `name` field. You can use a dot-delimited key path to access subfields, e.g. enter `features.name` to get the ID + from the object `{'features':{'name':'nameValue'}}`. +* **Training stage enable/disable** - Enable or disable the training stage; when enabled, training documents can be uploaded to the project. +* **Test stage enable/disable** - Enable or disable the testing stage; when enabled, test documents can be uploaded to the project. +* **Auto elevate to annotator** - This option works in combination with the training and test stage options; see the table below for the behaviour: + + | Training stage | Testing stage | Auto elevate to annotator | Description | + | --- | --- | --- | --- | + | Disabled | Disabled | Enabled/Disabled | User allowed to annotate without manual approval. | + | Enabled | Disabled | Disabled | Manual approval required. | + | Disabled | Enabled | Disabled | " | + | Enabled | Disabled | Enabled | User always allowed to annotate after training phase completed | + | Disabled | Enabled | Enabled | User automatically allowed to annotate after passing test, if user fails test they have to be manually approved. 
| + | Enabled | Enabled | Enabled | " | + +* **Test pass proportion** - The proportion of test annotations a user must get correct to be automatically allowed to annotate documents. +* **Gold standard field** - The field/column in the document's JSON that contains the ideal annotation values and explanation for the annotation. +* **Pre-annotation** - Pre-fill the form with annotation provided in the specified field. See [Importing Documents with pre-annotation](./documents_annotations_management.md#importing-documents-with-pre-annotation) section for more detail. + +## Annotation configuration + +The annotation configuration takes a `json` string for configuring how the document is displayed to the user and what types +of annotation will be collected. Here's an example configuration and a preview of how it is shown to annotators: + + + + +```json +// Example configuration +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + } +] +``` + + + +Within the configuration, it is possible to specify how your documents will be displayed. The **Document input preview** +box can be used to provide a sample of your document for rendering of the preview. + +```json +// Example contents for the Document input preview +{ + "text": "Sometext with html" +} +``` + + + +The above configuration displays the value from the `text` field from the document to be annotated. It then shows a set +of 3 radio inputs that allows the user to select a Negative, Neutral, or Positive sentiment with the label +name `sentiment`. + + + +All fields **require** the properties **name** and **type**, which are used to name the label and to determine the type of +input/display to be shown to the user, respectively. 
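The requirement that every component has a `name` and a `type` can be checked before a configuration is pasted into Teamware. The following is a minimal sketch under stated assumptions: `findInvalidComponents` is a hypothetical helper, not part of Teamware, and it only checks that both properties are non-empty strings.

```javascript
// Hypothetical check: every component needs non-empty string `name` and `type` properties.
function findInvalidComponents(config) {
  const problems = [];
  config.forEach((component, index) => {
    for (const key of ["name", "type"]) {
      if (typeof component[key] !== "string" || component[key] === "") {
        problems.push(`component ${index} is missing "${key}"`);
      }
    }
  });
  return problems;
}

const sampleConfig = [
  { name: "htmldisplay", type: "html", text: "{{{text}}}" },
  { type: "radio", title: "Sentiment" }, // no "name": reported below
];

console.log(findInvalidComponents(sampleConfig));
// [ 'component 1 is missing "name"' ]
```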
+ +Another field can be added to collect more information, e.g. a text field for opinions: + + + +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + }, + { + "name": "opinion", + "type": "text", + "title": "What's your opinion of the above text?", + "optional": true + } + +] +``` + + + +Note that in the above case, the `optional` field is added so that the user does not have to enter a value. +This `optional` field can be used on all components. Any component may optionally have a field named `if`, containing an expression that is used to determine whether or not the component appears based on information in the document and/or the values entered in the other components. For example the user could be presented with a set of options that includes an "other" choice, and if the annotator chooses "other" then an additional free text field appears for them to fill in. The `if` option is described in more detail under the [conditional components](#conditional-components) section below. + +Some configuration fields are specific to certain components, e.g. the `options` field is only available for +the `radio`, `checkbox` and `selector` components. See details below on the usage of each specific component. + +The captured annotation results in a JSON dictionary; an example can be seen in the **Annotation output preview** box. +The annotation is linked to a Document and is converted to a GATE JSON annotation format when exported. 
+ +### Displaying text + + + +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" // The text that will be displayed + } +] +``` + + + +The `htmldisplay` widget allows you to display the text you want annotated. It accepts almost the full range of HTML +input, which gives full styling flexibility. + +Any field/column from the document can be inserted by surrounding a field/column name with double or +triple curly brackets. Double curly brackets render the text as-is and triple curly brackets render it as HTML: + + + +Input: + +```json +{ + "text": "Sometext with html" +} +``` + +Configuration, showing the same field/column in the document as-is or as HTML: +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "No HTML: {{text}}
HTML: {{{text}}}" + } +] +``` + +
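The difference between double and triple curly brackets can be sketched as follows. This is not Teamware's renderer: `renderTemplate`, `escapeHtml` and `lookup` are hypothetical helpers, the sample document is invented for illustration, and the sketch assumes the usual Mustache convention that double brackets HTML-escape the value (so markup is shown literally) while triple brackets insert it unescaped.

```javascript
// Escape the characters that are significant in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Resolve a dot-separated path such as "subfield.subfieldContent".
function lookup(doc, path) {
  return path
    .split(".")
    .reduce((current, key) => (current != null ? current[key] : undefined), doc);
}

// Hypothetical renderer: {{{field}}} inserts raw HTML, {{field}} inserts escaped text.
function renderTemplate(template, doc) {
  return template.replace(
    /\{\{\{\s*([\w.]+)\s*\}\}\}|\{\{\s*([\w.]+)\s*\}\}/g,
    (match, rawPath, escapedPath) => {
      const value = lookup(doc, rawPath !== undefined ? rawPath : escapedPath);
      if (value === undefined || value === null) return "";
      return rawPath !== undefined ? String(value) : escapeHtml(value);
    }
  );
}

const sampleDoc = { text: "Some <b>text</b>" }; // invented sample document
console.log(renderTemplate("No HTML: {{text}} HTML: {{{text}}}", sampleDoc));
// No HTML: Some &lt;b&gt;text&lt;/b&gt; HTML: Some <b>text</b>
```

Only use triple brackets for fields whose content you trust, since the value is placed into the page as markup.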
+ +The widget makes no assumption about your document structure and any field/column names can be used, +even sub-fields by using the dot notation e.g. `parentField.childField`: + + + +JSON input: + +```json +{ + "customField": "Content of custom field.", + "anotherCustomField": "Content of another custom field.", + "subfield": { + "subfieldContent": "Content of a subfield." + } +} +``` + +or in csv + +| customField | anotherCustomField | subfield.subfieldContent | +| --- | --- | --- | +| Content of custom field. | Content of another custom field. | Content of a subfield. | + + +Configuration, showing the same field/column in document as-is or as HTML: +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "Custom field: {{customField}}
Another custom field: {{{anotherCustomField}}}
Subfield: {{{subfield.subfieldContent}}}" + } +] +``` + +
+ +If your documents are plain text and include line breaks that need to be preserved when rendering, this can be achieved by using a special HTML wrapper which sets the [`white-space` CSS property](https://developer.mozilla.org/en-US/docs/Web/CSS/white-space). + + + +**Document** + +```json +{ + "text": "This is some text\n\nIt has line breaks that we want to preserve." +} +``` + +**Project configuration** + +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "
{{text}}
" + } +] +``` + +
+ +`white-space: pre-line` preserves line breaks but collapses other whitespace down to a single space. `white-space: pre-wrap` would preserve all whitespace, including indentation at the start of a line, but would still wrap lines that are too long for the available space. + +### Text input + + + +```json +[ + { + "name": "mylabel", + "type": "text", + "optional": true, //Optional - Set if validation is not required + "regex": "regex string", //Optional - When specified, the regex pattern will be used to validate the text + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } +] +``` + + + +### Textarea input + + + +```json +[ + { + "name": "mylabel", + "type": "textarea", + "optional": true, //Optional - Set if validation is not required + "regex": "regex string", //Optional - When specified, the regex pattern will be used to validate the text + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } +] +``` + + + +### Radio input + + + +```json +[ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "orientation": "vertical", //Optional - default is "horizontal" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } +] +``` + + + +### 
Checkbox input + + + +```json +[ + { + "name": "mylabel", + "type": "checkbox", + "optional": true, //Optional - Set if validation is not required + "orientation": "horizontal", //Optional - "horizontal" (default) or "vertical" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "minSelected": 1, //Optional - Overrides optional field. Specify the minimum number of options that must be selected + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } +] +``` + + + +### Selector input + + + +```json +[ + { + "name": "mylabel", + "type": "selector", + "optional": true, //Optional - Set if validation is not required + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } +] +``` + + + +### Optional help text + +Optionally, radio buttons and checkboxes can be given help text to provide additional per-choice context or information to help annotators. 
+ + + + +```json +[ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "orientation": "vertical", //Optional - default is "horizontal" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1", "helptext": "Additional help text for option 1"}, + {"value": "value2", "label": "Text to show user 2", "helptext": "Additional help text for option 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } +] +``` + + + +### Alternative way to provide options for radio, checkbox and selector + +A dictionary (key-value pairs) can also be provided to the `options` field of the radio, checkbox and selector widgets, +but note that the ordering of the options is **not guaranteed**, as JavaScript does not always iterate over a dictionary's keys in +the order in which they were added. Note that additional help text for radio buttons and checkboxes is not supported using this syntax. 
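One concrete way the order of dictionary options can change is JavaScript's own key ordering, sketched below. `dictToOptions` is a hypothetical helper written for this illustration, not Teamware's code, and storage or serialisation layers may impose their own ordering as well; that is outside the scope of this sketch.

```javascript
// Hypothetical conversion of a dictionary of options into the list form.
function dictToOptions(dict) {
  return Object.entries(dict).map(([value, label]) => ({ value, label }));
}

// String keys that are not integer-like keep their insertion order...
console.log(dictToOptions({ value2: "Second", value1: "First" }).map((o) => o.value));
// [ 'value2', 'value1' ]

// ...but integer-like keys are always iterated first, in ascending numeric order.
console.log(dictToOptions({ b: "B", 2: "Two", a: "A", 1: "One" }).map((o) => o.value));
// [ '1', '2', 'b', 'a' ]
```

If the order in which options are presented matters, the list form with explicit `value` and `label` entries is the safer choice.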
+ + + +```json +[ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "options": { // The options can be specified as a dictionary, ordering is not guaranteed + "value1": "Text to show user 1", + "value2": "Text to show user 2", + "value3": "Text to show user 3" + }, + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } +] +``` + + + +### Dynamic options for radio, checkbox and selector + +All the examples above have a "static" list of available options for the radio, checkbox and selector widgets, where the complete options list is enumerated in the project configuration and every document offers the same set of options. However it is also possible to take some or all of the options from the _document_ data rather than the _configuration_ data. For example: + + + +**Project configuration** + +```json +[ + { + "name": "uri", + "type": "radio", + "title": "Select the most appropriate URI", + "options":[ + {"fromDocument": "candidates"}, + {"value": "none", "label": "None of the above"}, + {"value": "unknown", "label": "Cannot be determined without more context"} + ] + } +] +``` + +**Document** + +```json +{ + "text": "President Bush visited the air base yesterday...", + "candidates": [ + { + "value": "http://dbpedia.org/resource/George_W._Bush", + "label": "George W. Bush (Jnr)" + }, + { + "value": "http://dbpedia.org/resource/George_H._W._Bush", + "label": "George H. W. 
Bush (Snr)" + } + ] +} +``` + + + +`"fromDocument"` is a dot-separated property path leading to the location within each document where the additional options can be found, for example `"fromDocument":"candidates"` looks for a top-level property named `candidates` in each document, `"fromDocument": "options.custom"` would look for a property named `options` which is itself an object with a property named `custom`. The target property in the document may be in any of the following forms: + +- an array _of objects_, each with `value` and `label` (and optionally `helptext`) properties, exactly as in the static configuration format - this is the format used in the example above +- an array _of strings_, where the same string will be used as both the value and the label for that option +- an arbitrary ["dictionary"](#options-as-dict) object mapping values to labels +- a _single string_, which is parsed into a list of options + +The "single string" alternative is designed to be easier to use when [importing documents](documents_annotations_management.md#importing-documents) from CSV files. It allows you to provide any number of options in a _single_ CSV column value. Within the column the options are separated by semicolons, and each option is of the form `value=label`. Whitespace around the delimiters is ignored, both between options and between the value and label of a single option. 
For example given CSV document data of + +| text | options | +|-----------------|---------------------------------------------------| +| Favourite fruit | `apple=Apples; orange = Oranges; kiwi=Kiwi fruit` | + +a `{"fromDocument": "options"}` configuration would produce the equivalent of + +```json +[ + {"value": "apple", "label": "Apples"}, + {"value": "orange", "label": "Oranges"}, + {"value": "kiwi", "label": "Kiwi fruit"} +] +``` + +If your values or labels may need to contain the default separator characters `;` or `=` you can select different separators by adding extra properties to the configuration: + +```json +{"fromDocument": "options", "separator": "~~", "valueLabelSeparator": "::"} +``` + +| text | options | +|-----------------|------------------------------------------------------| +| Favourite fruit | `apple::Apples ~~ orange::Oranges ~~ kiwi::Kiwi fruit` | + +The separators can be more than one character, and you can set `"valueLabelSeparator":""` to disable label splitting altogether and just use the value as its own label. + +### Mixing static and dynamic options + +Static and `fromDocument` options may be freely interspersed in any order, so you can have a fully-dynamic set of options by specifying _only_ a `fromDocument` entry with no static options, or you can have static options that are listed first followed by dynamic options, or dynamic options first followed by static, etc. + +### Conditional components + +By default all components listed in the project configuration will be shown for all documents. However this is not always appropriate, for example you may have some components that are only relevant to certain documents, or only relevant for particular combinations of values in _other_ components. To allow for these kinds of scenarios any component can have a field named `if` specifying the conditions under which that component should be shown. 
+ +The `if` field is an _expression_ that is able to refer to fields in both the current _document_ being annotated and the current state of the other annotation components. The expression language is largely based on a subset of the standard JavaScript expression syntax but with a few additional syntax elements to ease working with array data and regular expressions. + +The following simple example shows how you might implement an "Other (please specify)" pattern, where the user can select from a list of choices but also has the option to supply their own answer if none of the choices are appropriate. The free text field is only shown if the user selects the "other" choice. + + + +**Project configuration** + +```json +[ + { + "name": "uri", + "type": "radio", + "title": "Select the most appropriate URI", + "options":[ + {"fromDocument": "candidates"}, + {"value": "other", "label": "Other"} + ] + }, + { + "name": "otherValue", + "type": "text", + "title": "Please specify another value", + "if": "annotation.uri == 'other'", + "regex": "^(https?|urn):", + "valError": "Please specify a URI (starting http:, https: or urn:)" + } +] +``` + +**Document** + +```json +{ + "text": "President Bush visited the air base yesterday...", + "candidates": [ + { + "value": "http://dbpedia.org/resource/George_W._Bush", + "label": "George W. Bush (Jnr)" + }, + { + "value": "http://dbpedia.org/resource/George_H._W._Bush", + "label": "George H. W. Bush (Snr)" + } + ] +} +``` + + +Note that validation rules (such as `optional`, `minSelected` or `regex`) are not applied to components that are hidden by an `if` expression - hidden components will never be included in the annotation output, even if they would be considered "required" had they been visible. 
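As a rough illustration of the behaviour described above, the sketch below mimics the example's `if` condition in plain JavaScript. It does not use Teamware's actual expression evaluator, and the function names `showOtherValue` and `visibleOutput` are hypothetical, introduced only for this illustration.

```javascript
// Illustrative only: plain JavaScript standing in for Teamware's own
// expression evaluator, which is not shown in this document.

// Mirrors the example condition "annotation.uri == 'other'".
// An unset uri (null/undefined) simply compares as not equal.
function showOtherValue(annotation) {
  return annotation.uri == 'other';
}

// Hypothetical helper: builds the annotation output, leaving out the
// "otherValue" component whenever its `if` condition hides it.
function visibleOutput(annotation) {
  const output = { uri: annotation.uri };
  if (showOtherValue(annotation)) {
    output.otherValue = annotation.otherValue;
  }
  return output;
}

// "other" selected: the free text field is visible and included.
console.log(visibleOutput({ uri: 'other', otherValue: 'urn:example:1' }));

// A candidate selected: the free text field is hidden, so any text left
// in it is not part of the output and is not validated.
console.log(visibleOutput({
  uri: 'http://dbpedia.org/resource/George_W._Bush',
  otherValue: 'stale text'
}));
```

The point of the sketch is only that visibility is decided first, and that hidden components contribute nothing to the output regardless of their validation rules.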
+ +Components can also be made conditional on properties of the _document_, or a combination of the document and the annotation values, for example + + + +**Project configuration** + +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + }, + { + "name": "reason", + "type": "text", + "title": "Why do you disagree with the suggested value?", + "if": "annotation.sentiment !== document.preanno.sentiment" + } +] +``` + +**Documents** + +```json +[ + { + "text": "I love the thing!", + "preanno": { "sentiment": "positive" } + }, + { + "text": "I hate the thing!", + "preanno": { "sentiment": "negative" } + }, + { + "text": "The thing is ok, I guess...", + "preanno": { "sentiment": "neutral" } + } +] +``` + + + +The full list of supported constructions is as follows: + +- the `annotation` variable refers to the current state of the annotation components for this document + - the current value of a particular component can be accessed as `annotation.componentName` or `annotation['component name']` - the brackets version will always work, the dot version works if the component's `name` is a valid JavaScript identifier + - if a component has not been set since the form was last cleared the value may be `null` or `undefined` - the expression should be written to cope with both + - the value of a `text`, `textarea`, `radio` or `selector` component will be a single string (or null/undefined), the value of a `checkbox` component will be an _array_ of strings since more than one value may be selected. 
If no value is selected the array may be null, undefined or empty, the expression must be prepared to handle any of these +- the `document` variable refers to the current document that is being annotated + - again properties of the document can be accessed as `document.propertyName` or `document['property name']` + - continue the same pattern for nested properties e.g. `document.scores.label1` + - individual elements of array properties can be accessed by zero-based index (e.g. `document.options[0]`) +- various comparison operators are available: + - `==` and `!=` (equal and not-equal) + - `<`, `<=`, `>=`, `>` (less-than, less-or-equal, greater-or-equal, greater-than) + - these operators follow JavaScript rules, which are not always intuitive. Generally if both arguments are strings then they will be compared by lexicographic order, but if either argument is a number then the other one will also be converted to a number before comparing. So if the `score` component is set to the value "10" (a string of two digits) then `annotation.score < 5` would be _false_ (10 is converted to number and compared to 5) but `annotation.score < '5'` would be _true_ (the string "10" sorts before the string "5") + - `in` checks for the presence of an item in an array or a key in an object + - e.g. `'other' in annotation.someCheckbox` checks if the `other` option has been ticked in a checkbox component (whose value is an array) + - this is different from normal JavaScript rules, where `i in myArray` checks for the presence of an array _index_ rather than an array _item_ +- other operators + - `+` (concatenate strings, or add numbers) + - if either argument is a string then both sides are converted to strings and concatenated together + - otherwise both sides are treated as numbers and added + - `-`, `*`, `/`, `%` (subtraction, multiplication, division and remainder) + - `&&`, `||` (boolean AND and OR) + - `!` (prefix boolean NOT, e.g. 
`!annotation.selected` is true if `selected` is false/null/undefined and false otherwise) + - conditional operator `expr ? valueIfTrue : valueIfFalse` (exactly as in JavaScript, first evaluates the test `expr`, then either the `valueIfTrue` or `valueIfFalse` depending on the outcome of the test) +- `value =~ /regex/` tests whether the given string value contains any matches for the given [regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions#writing_a_regular_expression_pattern) + - use `^` and/or `$` to anchor the match to the start and/or end of the value, for example `annotation.example =~ /^a/i` checks whether the `example` annotation value _starts with_ "a" or "A" (the `/i` flag makes the expression case-insensitive) + - since the project configuration is entered as JSON, any backslash characters within the regex must be doubled to escape them from the JSON parser, i.e. `"if": "annotation.option =~ /\\s/"` would check if `option` contains any space characters (for which the regular expression literal is `/\s/`) +- _Quantifier_ expressions let you check whether `any` or `all` of the items in an array or key/value pairs in an object match a predicate expression. The general form is `any(x in expr, predicate)` or `all(x in expr, predicate)`, where `expr` is an expression that resolves to an array or object value, `x` is a new identifier, and `predicate` is the expression to test each item against. The `predicate` expression can refer to the `x` identifier + - `any(option in annotation.someCheckbox, option > 3)` + - `all(e in document.scores, e.value < 0.7)` (assuming `scores` is an object mapping labels to scores, e.g. 
`{"scores": {"positive": 0.5, "negative": 0.3}}`) + - when testing a predicate against an _object_ each entry has `.key` and `.value` properties giving the key and value of the current entry + - on a null, undefined or empty array/object, `any` will return _false_ (since there are no items that pass the test) and `all` will return _true_ (since there are no items that _fail_ the test) + - the predicate is optional - `any(arrayExpression)` resolves to `true` if any item in the array has a value that JavaScript considers to be "truthy", i.e. anything other than the number 0, the empty string, null or undefined. So `any(annotation.myCheckbox)` is a convenient way to check whether _at least one_ option has been selected in a `checkbox` component. + +If the `if` expression for a particular component is _syntactically invalid_ (missing operands, mis-matched brackets, etc.) then the condition will be ignored and the component will always be displayed as though it did not have an `if` expression at all. Conversely, if the expression is valid but an error occurs while _evaluating_ it, this will be treated the same as if the expression returned `false`, and the associated component will not be displayed. The behaviour is this way around as the most common reason for errors during evaluation is attempting to refer to annotation components that have not yet been filled in - if this is not appropriate in your use case you must account for the possibility within your expression. 
For example, suppose `confidence` is a `radio` or `selector` component with values ranging from 1 to 5, then another component that declares + +``` +"if": "annotation.confidence && annotation.confidence < 4" +``` + +will hide this component if `confidence` is unset, displaying it only if `confidence` is set to a value less than 4, whereas + +``` +"if": "!annotation.confidence || annotation.confidence < 4" +``` + +will hide this component only if `confidence` is actually _set_ to a value of 4 or greater - it will _show_ this component if `confidence` is unset. Either approach may be correct depending on your project's requirements. + +To assist managers in authoring project configurations with `if` conditions, the "preview" mode on the project configuration page will display details of any errors that occur when parsing the expressions, or when evaluating them against the **Document input preview** data. You are encouraged to test your expressions thoroughly against a variety of inputs to ensure they behave as intended, before opening your project to annotators. + + diff --git a/docs/versioned/2.1.1/manageradminguide/project_management.md b/docs/versioned/2.1.1/manageradminguide/project_management.md new file mode 100644 index 00000000..f1fd0cfe --- /dev/null +++ b/docs/versioned/2.1.1/manageradminguide/project_management.md @@ -0,0 +1,38 @@ +# Annotation Project Management + +## Project Listing +Clicking on the `Projects` link in the top navigation bar takes you to a page that contains a list of existing +projects. The project names are shown along with their summaries. Clicking on a project name will +take you to the project management page. + + +## Project Management Page + +The project management page contains all the functionality needed to manage an annotation project. The page +is composed of three main tabs: + +* [Configuration](project_config.md) - Configure project settings including what annotations are captured. 
+* [Documents & Annotation](documents_annotations_management.md) - Manage documents and annotations. Upload documents, see contents of a document's annotations and import/export documents. +* [Annotators](annotators_management.md) - Manage the recruitment of annotators. + +::: warning + +Annotators can only be recruited to an annotation project after it has been configured and documents +are uploaded to the project. + +::: + + +## Project status icons +In the **Project listing** and **Project management page**, icon badges are used to provide a quick overview of the project's status: + +* 1 - Number of completed annotations in the project. +* 1 - Number of rejected annotations in the project. +* 1 - Number of timed out annotations in the project. +* 1 - Number of aborted annotations in the project. +* 1 - Number of pending annotations in the project. +* 2/60 - Number of occupied annotation tasks over number of total tasks in the project. +* 20/5/10 - Number of documents, training documents and test documents in the project. +* 1 - Number of annotators recruited in the project. Annotators are removed from the project when they have completed all annotation tasks in their quota. 
+ + diff --git a/docs/versioned/2.2.0/.vuepress/components/AnnotationRendererPreview.vue b/docs/versioned/2.2.0/.vuepress/components/AnnotationRendererPreview.vue new file mode 100644 index 00000000..98ecc1af --- /dev/null +++ b/docs/versioned/2.2.0/.vuepress/components/AnnotationRendererPreview.vue @@ -0,0 +1,107 @@ + + + + + diff --git a/docs/versioned/2.2.0/.vuepress/components/DisplayVersion.vue b/docs/versioned/2.2.0/.vuepress/components/DisplayVersion.vue new file mode 100644 index 00000000..03ec07ed --- /dev/null +++ b/docs/versioned/2.2.0/.vuepress/components/DisplayVersion.vue @@ -0,0 +1,21 @@ + + + + + diff --git a/docs/versioned/2.2.0/.vuepress/config.js b/docs/versioned/2.2.0/.vuepress/config.js new file mode 100644 index 00000000..b78cc868 --- /dev/null +++ b/docs/versioned/2.2.0/.vuepress/config.js @@ -0,0 +1,42 @@ +const versionData = require("./versions.json") +const path = require("path"); +module.exports = context => ({ + title: 'GATE Teamware Documentation', + description: 'Documentation for GATE Teamware', + base: versionData.base, + themeConfig: { + nav: [ + {text: 'Home', link: '/'}, + {text: 'Annotators', link: '/annotatorguide/'}, + {text: 'Managers & Admins', link: '/manageradminguide/'}, + {text: 'Developer', link: '/developerguide/'} + ], + sidebar: { + '/manageradminguide/': [ + "", + "project_management", + "project_config", + "documents_annotations_management", + "annotators_management" + ], + '/developerguide/': [ + '', + 'frontend', + 'testing', + 'releases', + 'documentation', + "api_docs", + + ], + }, + }, + configureWebpack: { + resolve: { + alias: { + '@': path.resolve(__dirname, versionData.frontendSource) + } + } + }, + + +}) diff --git a/docs/versioned/2.2.0/.vuepress/enhanceApp.js b/docs/versioned/2.2.0/.vuepress/enhanceApp.js new file mode 100644 index 00000000..e7aadadf --- /dev/null +++ b/docs/versioned/2.2.0/.vuepress/enhanceApp.js @@ -0,0 +1,17 @@ +import Vue from 'vue' +import {BootstrapVue, BootstrapVueIcons, 
IconsPlugin} from 'bootstrap-vue' + +import 'bootstrap/dist/css/bootstrap.css' +import 'bootstrap-vue/dist/bootstrap-vue.css' + +Vue.use(BootstrapVue) +Vue.use(BootstrapVueIcons) + +export default ({ + Vue, // the version of Vue being used in the VuePress app + options, // the options for the root Vue instance + router, // the router instance for the app + siteData // site metadata +}) => { + +} diff --git a/docs/versioned/2.2.0/.vuepress/theme/components/Navbar.vue b/docs/versioned/2.2.0/.vuepress/theme/components/Navbar.vue new file mode 100644 index 00000000..c3b966db --- /dev/null +++ b/docs/versioned/2.2.0/.vuepress/theme/components/Navbar.vue @@ -0,0 +1,143 @@ + + + + + diff --git a/docs/versioned/2.2.0/.vuepress/theme/components/VersionSelector.vue b/docs/versioned/2.2.0/.vuepress/theme/components/VersionSelector.vue new file mode 100644 index 00000000..4cfb5eb9 --- /dev/null +++ b/docs/versioned/2.2.0/.vuepress/theme/components/VersionSelector.vue @@ -0,0 +1,33 @@ + + + + + diff --git a/docs/versioned/2.2.0/.vuepress/theme/index.js b/docs/versioned/2.2.0/.vuepress/theme/index.js new file mode 100644 index 00000000..b91b8a57 --- /dev/null +++ b/docs/versioned/2.2.0/.vuepress/theme/index.js @@ -0,0 +1,3 @@ +module.exports = { + extend: '@vuepress/theme-default' +} diff --git a/docs/versioned/2.2.0/.vuepress/versions.json b/docs/versioned/2.2.0/.vuepress/versions.json new file mode 100644 index 00000000..21ad6d41 --- /dev/null +++ b/docs/versioned/2.2.0/.vuepress/versions.json @@ -0,0 +1,35 @@ +{ + "current": "2.2.0", + "base": "/gate-teamware/2.2.0/", + "versions": [ + { + "text": "0.3.0", + "value": "/gate-teamware/0.3.0/" + }, + { + "text": "0.4.0", + "value": "/gate-teamware/0.4.0/" + }, + { + "text": "2.0.0", + "value": "/gate-teamware/2.0.0/" + }, + { + "text": "2.1.0", + "value": "/gate-teamware/2.1.0/" + }, + { + "text": "2.1.1", + "value": "/gate-teamware/2.1.1/" + }, + { + "text": "2.2.0", + "value": "/gate-teamware/2.2.0/" + }, + { + "text": 
"development", + "value": "/gate-teamware/development/" + } + ], + "frontendSource": "../../../../frontend/src" +} \ No newline at end of file diff --git a/docs/versioned/2.2.0/README.md b/docs/versioned/2.2.0/README.md new file mode 100644 index 00000000..d12b9b88 --- /dev/null +++ b/docs/versioned/2.2.0/README.md @@ -0,0 +1,138 @@ +# GATE Teamware + +![GATE Teamware logo](./img/gate-teamware-logo.svg "GATE Teamware logo") + +A web application for collaborative document annotation. + +This is a documentation for Teamware version: + +## Key Features +* Free and open source software. +* Configure annotation options using a highly flexible JSON config. +* Set limits on proportions of a task that annotators can annotate. +* Import existing annotations as CSV or JSON. +* Export annotations as CSV or JSON. +* Annotation instructions and document rendering supports markdown and HTML. + +## Getting started +A quickstart guide for annotators is [available here](annotatorguide). + +To use an existing instance of GATE Teamware as a project manager or admin, find instructions in the [Managers and Admins guide](manageradminguide). + +Documentation on deploying your own instance can be found in the [Developer Guide](developerguide). + +## Installation Guide + +### Quick Start + +The simplest way to deploy your own copy of GATE Teamware is to use Docker Compose on Linux or Mac. Installation on Windows is possible but not officially supported - you need to be able to run `bash` shell scripts for the quick-start installer. + +1. Install Docker - [Docker Engine](https://docs.docker.com/engine/) for Linux servers or [Docker Desktop](https://docs.docker.com/desktop/) for Mac. +2. Install [Docker Compose](https://github.com/docker/compose), if your Docker does not already include it (Compose is included by default with Docker Desktop) +3. Download the [installation script](https://gate.ac.uk/get-teamware.sh) into an empty directory, run it and follow the instructions. 
+ +``` +mkdir gate-teamware +cd gate-teamware +curl -LO https://gate.ac.uk/get-teamware.sh +bash ./get-teamware.sh +``` + +This will make the Teamware application available as `http://localhost:8076`, with the option to expose it as a public `https://` URL if your server is directly internet-accessible - for production use we recommend deploying Teamware with a suitable internet-facing reverse proxy, or use Kubernetes as described below. + +### Deployment using Kubernetes + +A Helm chart to deploy Teamware on Kubernetes is published to the GATE team public charts repository. The chart requires [Helm](https://helm.sh) version 3.7 or later, and is compatible with Kubernetes version 1.23 or later. Earlier Kubernetes versions back to 1.19 _may_ work provided autoscaling is not enabled, but these have not been tested. + +The following quick start instructions assume you have a compatible Kubernetes cluster and a working installation of `kubectl` and `helm` (3.7 or later) with permission to create all the necessary resource types in your target namespace. + +First generate a random "secret key" for the Django application. This must be at least 50 random characters, a quick way to do this is + +``` +# 42 random bytes base64 encoded becomes 56 random characters +kubectl create secret generic -n {namespace} django-secret \ + --from-literal="secret-key=$( openssl rand -base64 42 )" +``` + +Add the GATE charts repository to your Helm configuration: + +``` +helm repo add gate https://repo.gate.ac.uk/repository/charts +helm repo update +``` + +Create a `values.yaml` file with the key settings required for teamware. 
The following is a minimal set of values for a typical installation: + +```yaml +# Public-facing web hostname of the teamware application, the public +# URL will be https://{hostName} +hostName: teamware.example.com + +email: + # "From" address on emails sent by Teamware + adminAddress: admin@teamware.example.com + # Send email via an SMTP server - alternatively "gmail" to use GMail API + backend: "smtp" + smtp: + host: mail.example.com + # You will also need to set user and passwordSecret if your + # mail server requires authentication + +privacyPolicy: + # Contact details of the host and administrator of the teamware + # instance, if no admin defined, defaults to the host values. + host: + # Name of the host + name: "Service Host" + # Host's physical address + address: "123 Example Street, City. Country." + # A method of contacting the host, field supports HTML for e.g. linking to a form + contact: "Email" + admin: + name: "Dr. Service Admin" + address: "Department of Example Studies, University of Example, City. Country." + contact: "Email" + +backend: + # Name of the random secret you created above + djangoSecret: django-secret + +# Initial "super user" created on the first install. These are just +# the *initial* settings, you can (and should!) change the password +# once Teamware is up and running +superuser: + email: me@example.com + username: admin + password: changeme +``` + +Some of these may be omitted or others may be required depending on the setup of your specific cluster - see the [chart README](https://github.com/GateNLP/charts/blob/main/gate-teamware/README.md) and the chart's own values file (which you can retrieve with `helm show values gate/gate-teamware`) for full details. In particular these values assume: + +- your cluster has an ingress controller, with a default ingress class configured, and that controller has a default TLS certificate that is compatible with your chosen hostname (e.g. 
a `*.example.com` wildcard) +- your cluster has a default storageClass configured to provision PVCs, and at least 8 GB of available PV capacity +- you can send email via an SMTP server with no authentication +- the default GATE Teamware terms and privacy documents are suitable for your deployment and compliant with the laws of your location. If this is not the case you can supply your own custom policy documents in a ConfigMap +- you do not need to back up your PostgreSQL database - the chart does include the option to store backups in Amazon S3 or another compatible object store, see the full README for details + +Once you have created your values file, you can install the chart or upgrade an existing installation using + +``` +helm upgrade --install gate-teamware gate/gate-teamware \ + --namespace {namespace} --values teamware-values.yaml +``` + + +## Bug reports and feature requests +Please make bug reports and feature requests as Issues on the [GATE Teamware GitHub repo](https://github.com/GATENLP/gate-teamware). + +# Using Teamware +Teamware is developed by the [GATE](https://gate.ac.uk) team, an academic research group at The University of Sheffield. As a result, future funding relies on evidence of the impact that the software provides. If you use Teamware, please let us know using the contact form at [gate.ac.uk](https://gate.ac.uk/g8/contact). Please include details on grants, publications, commercial products etc. Any information that can help us to secure future funding for our work is greatly appreciated. + +## Citation +For published work that has used Teamware, please cite the [EACL23 demo paper](https://aclanthology.org/2023.eacl-demo.17/). One way is to include a citation such as: + +> Wilby, D., Karmakharm, T., Roberts, I., Song, X. & Bontcheva, K. (2023). GATE Teamware 2: An open-source tool for collaborative document classification annotation. 
In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 145–151, Dubrovnik, Croatia. Association for Computational Linguistics. https://aclanthology.org/2023.eacl-demo.17/ + +Please use the `Cite this repository` button at the top of the [project's GitHub repository](https://github.com/GATENLP/gate-teamware) to get an up to date citation. + +Permanent references to each version of the software are available from [Zenodo](https://doi.org/10.5281/zenodo.7899193). The Teamware version can be found on the 'About' page of your Teamware instance. diff --git a/docs/versioned/2.2.0/annotatorguide/README.md b/docs/versioned/2.2.0/annotatorguide/README.md new file mode 100644 index 00000000..4a3ddae2 --- /dev/null +++ b/docs/versioned/2.2.0/annotatorguide/README.md @@ -0,0 +1,27 @@ +# Annotators Quickstart + +Annotating a project: + +* After signing up to the site, notify the owner of the annotation project you've been recruited of + your username. This will allow them to add you as an annotator to a project. +* After you've been recruited to a project, click on the `Annotate` link on the navigation bar at the + top of the page to start annotating. +* You will be shown the details about the project you're annotating along with a set of form(s) to capture + your annotation. Ensure you've read the Annotator guideline fully before starting the annotation process. +* You can then start annotating documents one at a time. Click on `Submit` to confirm the completion of + annotation, `Clear` to start again or `Reject` to skip the particular document. Be aware some projects + do not allow you to skip documents. +* Once you've finished annotating a certain number of documents in a project (specified by the project + manager) your task will be deemed complete, and you will be able to be recruited into another annotation + project. 
+ +## Deleting your account + +At any time you can choose to stop participating and delete your account. You can do this by: + +* Click on your username in the top right corner and then `Account`. +* Click on `Delete my account`. +* When deleting your account, by default your personal information will be removed but your annotations will remain on the system. To completely remove all of your annotations, click on the checkbox next to `Also remove any annotations, projects and documents that I own:`. +* Click the `Unlock` button. +* Then click `Delete` to remove your account. + diff --git a/docs/versioned/2.2.0/developerguide/README.md b/docs/versioned/2.2.0/developerguide/README.md new file mode 100644 index 00000000..02e4494f --- /dev/null +++ b/docs/versioned/2.2.0/developerguide/README.md @@ -0,0 +1,286 @@ +# Developer guide + +## Architecture +``` +├── .github/workflows/ # github actions workflow files +├── teamware/ # Django project +│   └── settings/ +├── backend/ # Django app +├── cypress/ # integration test configurations +├── docs/ # documentation +├── examples/ # example data files +├── frontend/ # all frontend, in VueJS framework +├── nginx/ # Nginx configurations +| +# Top level directory contains scripts for management and deployment, +# main project package.json, python requirements, docker configs +├── build-images.sh +├── deploy.sh +├── create-django-db.sh +├── docker-compose.yml +├── Dockerfile +├── generate-docker-env.sh +├── manage.py +├── migrate-integration.sh +├── package.json +├── package-lock.json +├── pytest.ini +├── README.md +├── requirements-dev.txt +├── requirements.txt +└── run-server.sh + +``` + +## Installation for development + +The service depends on a combination of python and javascript libraries. We recommend developing inside a `conda` environment as it is able to install +python libraries and nodejs, which is used to install javascript libraries. 
+ +* Install anaconda/miniconda +* Create a blank virtual conda env + ```bash + $ conda create -n teamware python=3.9 + ``` +* Activate conda environment + ```bash + $ source activate teamware + # or + $ conda activate teamware + ``` +* Install python dependencies in conda environment using pip + ```bash + (teamware)$ pip install -r requirements.txt -r requirements-dev.txt + ``` +* Install nodejs, postgresql and openssl in the conda environment + ```bash + (teamware)$ conda install -y -c conda-forge postgresql=14.* + (teamware)$ conda install -y -c conda-forge nodejs=18.* + ``` +* Install nodejs dependencies + ```bash + (teamware)$ npm install + ``` + +Set up a new postgreSQL database and user for development: +``` +# Create a new directory for the db data and initialise +mkdir -p pgsql/data +initdb -D pgsql/data + +# Launch postgres in the background +postgres -p 5432 -D pgsql/data & + +# Create a DB user, you'll be prompted to input password, "password" is the default in teamware/settings/base.py for development +createuser -p 5432 -P user --createdb + +# Create the teamware_db database, owned by the user created above +createdb -p 5432 -O user teamware_db + +# Migrate & create database tables +python manage.py migrate + +# Create a new superuser - when prompted enter a username and password for the Django superuser account +python manage.py createsuperuser +``` + +## Updating packages +To update packages after a merge, run the following commands: + +```bash +# Activate the conda environment +source activate teamware +# Update any packages changed in the python requirements.txt and requirements-dev.txt files +pip install -r requirements.txt -r requirements-dev.txt +# Update any packages changed in package.json +npm install +``` + +## Development server +The application uses django's dev server to serve page contents and run the RPC API, it also uses Vue CLI's +development server to serve dynamic assets such as javascript or stylesheets allowing for hot-reloading +during development. 
+ +To run both servers together: + + ```bash + npm run serve + ``` + +To run separately: + +* Django server + ```bash + npm run serve:backend + ``` +* Vue CLI dev server + ```bash + npm run serve:frontend + ``` + +## Deploying a development version using Docker +Deployment is via [docker-compose](https://docs.docker.com/compose/), using [NGINX](https://www.nginx.com/) to serve static content, a separate [postgreSQL](https://hub.docker.com/_/postgres) service containing the database and a database backup service (see `docker-compose.yml` for details). Pre-built images can be run using most versions of Docker but _building_ images requires `docker buildx`, which means either Docker Desktop or version 19.03 or later of Docker Engine. + +1. Run `./generate-docker-env.sh` to create a `.env` file containing randomly generated secrets which are mounted as environment variables into the container. See [below](#env-config) for details. + +2. Then build the images via: + ```bash + ./build-images.sh + ``` + +3. Then deploy the stack with: + + ```bash + ./deploy.sh production # (or prod) to deploy with production settings + ./deploy.sh staging # (or stag) to deploy with staging settings + ``` + +To bring the stack down, run `docker-compose down`, using the `-v` flag to destroy the database volume (be careful with this). + +### Configuration using environment variables (.env file) + +To allow the app to be easily configured between instances, especially inside containers, much of the app's configuration can be done through environment variables. + +Run `./generate-docker-env.sh` to generate a `.env` file with all configurable environment parameters. + +To set values for your own deployment, add values to the variables in `.env`. Most existing values will be kept after running `generate-docker-env.sh`; see the comments in `.env` for specific details. Anything that is left blank will be filled with a default value. Passwords and keys are filled with auto-generated random values.
+ +Existing `.env` files are copied into a new file named `saved-env.` by `generate-docker-env.sh`. + +### Backups + +In a docker-compose based deployment, backups of the database are managed by the service `pgbackups` which uses the [`prodrigestivill/postgres-backup-local:12`](https://hub.docker.com/r/prodrigestivill/postgres-backup-local) image. +By default, backups are taken of the database daily, and the `docker-compose.yml` contains settings for the number of backups kept under the options for the `pgbackups` service. +Backups are stored as a gzipped SQL dump from the database. + +#### Taking a manual backup + +A shell script is provided for manually triggering a backup snapshot. +From the main project directory run + +```sh +$ ./backup_manual.sh +``` + +This uses the `pgbackups` service and all settings and environment variables it is configured with in `docker-compose.yml`, so backups will be taken to the same location as configured for the main backup schedule. + +#### Restoring from a backup +1. Locate the backup file (`*.sql.gz`) on your system that you would like to restore from. +2. Make sure that the stack is down: from the main project directory run `docker-compose down`. +3. Run the backup restore shell script, passing in the path to your backup file as the only argument: + +```sh +$ ./backup_restore.sh path/to/my/backup.sql.gz +``` + +This will first launch the database container, then via Django's `dbshell` command, running in the `backend` service, execute a number of SQL commands before and after running all the SQL from the backup file. + +4. Redeploy the stack, via `./deploy.sh staging`, `./deploy.sh production`, or simply `docker compose up -d`, whichever is the case. +5. The database *should* be restored. + +## Configuration + +### Django settings files + +Django settings are located in the `teamware/settings` folder. The app will use the `base.py` settings by default +and this must be overridden depending on use.
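
As a minimal sketch of how the default settings could be overridden, the snippet below uses Django's standard `DJANGO_SETTINGS_MODULE` environment variable. The helper `select_settings` and the `SETTINGS_MODULES` mapping are hypothetical and not part of Teamware; the module paths are assumed from the `teamware/settings/base.py` and `deployment.py` files mentioned in this guide.

```python
import os

# Assumed module paths, derived from the files under teamware/settings/.
# This mapping is illustrative only and is not defined anywhere in Teamware.
SETTINGS_MODULES = {
    "development": "teamware.settings.base",
    "deployment": "teamware.settings.deployment",
}


def select_settings(mode="development", environ=None):
    """Point Django at a settings module unless one is already configured.

    A value already present in DJANGO_SETTINGS_MODULE always wins, so an
    explicit setting in the shell or in .env is never overwritten.
    """
    environ = os.environ if environ is None else environ
    environ.setdefault("DJANGO_SETTINGS_MODULE", SETTINGS_MODULES[mode])
    return environ["DJANGO_SETTINGS_MODULE"]
```

Because `setdefault` is used, exporting `DJANGO_SETTINGS_MODULE` before starting the server takes precedence over whatever mode is passed to the helper.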
+ +### Database +A SQLite3 database is used during development and during integration testing. + +For staging and production, postgreSQL is used, running from a `postgres-14` docker container. Settings are found in `teamware/settings/base.py` and `deployment.py` as well as being set as environment variables by `./generate-docker-env.sh` and passed to the container as configured in `docker-compose.yml`. + +In Kubernetes deployments the PostgreSQL database is installed using the Bitnami `postgresql` public chart. + + +### Sending E-mail +It's recommended to specify e-mail configurations through environment variables (`.env`), as these settings will include usernames and passwords that should not be tracked by version control. + +#### E-mail using SMTP +SMTP is supported as standard in Django. Add the following configurations with your own details +to the list of environment variables: + +```bash +DJANGO_EMAIL_BACKEND='django.core.mail.backends.smtp.EmailBackend' +DJANGO_EMAIL_HOST='myserver.com' +DJANGO_EMAIL_PORT=25 +DJANGO_EMAIL_HOST_USER='username' +DJANGO_EMAIL_HOST_PASSWORD='password' +DJANGO_EMAIL_SECURITY=tls +# tls = STARTTLS, typically on port 25 or 587 +# ssl = TLS-on-connect, typically on port 465 +# none (or omitted) = no encryption +``` + +#### E-mail using Google API +The [django-gmailapi-backend](https://github.com/dolfim/django-gmailapi-backend) library +has been added to allow sending of mail through Google's API as sending through SMTP is disabled as standard. + +Unlike with SMTP, Google's API requires OAuth authentication, which means a project and a credential have to be +created through Google's cloud console.
+ +* More information on the Gmail API: [https://developers.google.com/gmail/api/guides/sending](https://developers.google.com/gmail/api/guides/sending) +* OAuth credentials for sending emails: [https://github.com/google/gmail-oauth2-tools/wiki/OAuth2DotPyRunThrough](https://github.com/google/gmail-oauth2-tools/wiki/OAuth2DotPyRunThrough) + +This package includes the script linked in the documentation above, which simplifies the setup of the API credentials. The following outlines the key steps: + +1. Create a project in the Google developer console, [https://console.cloud.google.com/](https://console.cloud.google.com/) +2. Enable the Gmail API +3. Create OAuth 2.0 credentials, you'll likely want to create a `Desktop` +4. Create a valid refresh_token using the helper script included in the package: + ```bash + gmail_oauth2 --generate_oauth2_token \ + --client_id="" \ + --client_secret="" \ + --scope="https://www.googleapis.com/auth/gmail.send" + ``` +5. Add the created credentials and tokens to the environment variable as shown below: + ```bash + DJANGO_EMAIL_BACKEND='gmailapi_backend.mail.GmailBackend' + DJANGO_GMAIL_API_CLIENT_ID='google_assigned_id' + DJANGO_GMAIL_API_CLIENT_SECRET='google_assigned_secret' + DJANGO_GMAIL_API_REFRESH_TOKEN='google_assigned_token' + ``` + + +#### Teamware Privacy Policy and Terms & Conditions + +Teamware includes a default privacy policy and terms & conditions, which are required for running the application. + +The default privacy policy is intended to be compliant with UK GDPR regulations, which may comply with the rights of users of your deployment, however it is your responsibility to ensure that this is the case. + +If the default privacy policy covers your use case, then you will need to include configuration for a few contact details. 
+ +Contact details are required for the **host** and the **administrator**: the **host** is the organisation or individual responsible for managing the deployment of the teamware instance and the **administrator** is the organisation or individual responsible for managing users, projects and data on the instance. In many cases these roles will be filled by the same organisation or individual, so in this case specifying just the **host** details is sufficient. + +For deployment from source, set the following environment variables: + +* `PP_HOST_NAME` +* `PP_HOST_ADDRESS` +* `PP_HOST_CONTACT` +* `PP_ADMIN_NAME` +* `PP_ADMIN_ADDRESS` +* `PP_ADMIN_CONTACT` + +For deployment using docker-compose, set these values in `.env`. + +If the host and administrator are the same, you can just set the `PP_HOST_*` variables above which will be used for both. + +##### Including a custom Privacy Policy and/or Terms & Conditions + +If the default privacy policy or terms & conditions do not cover your use case, you can easily replace these with your own documents. + +If deploying from source, include markdown (`.md`) files in a `custom-policies` directory in the project root with the exact names `custom-policies/privacy-policy.md` and/or `custom-policies/terms-and-conditions.md` which will be rendered at the corresponding pages on the running web app. If you are not familiar with the Markdown language there are a number of free WYSIWYG-style editor tools available including [StackEdit](https://stackedit.io/app) (browser based) and [Zettlr](https://www.zettlr.com) (desktop app). + +If deploying with docker compose, place the `custom-policies` directory at the same location as the `docker-compose.yml` file before running `./deploy.sh` as above. + +An example custom privacy policy file contents might look like: + +```md +# Organisation X Teamware Privacy Policy +... +... +## Definitions of Roles and Terminology +... +... 
+``` diff --git a/docs/versioned/2.2.0/developerguide/api_docs.md b/docs/versioned/2.2.0/developerguide/api_docs.md new file mode 100644 index 00000000..03964b37 --- /dev/null +++ b/docs/versioned/2.2.0/developerguide/api_docs.md @@ -0,0 +1,1086 @@ +--- +sidebarDepth: 3 +--- + +# API Documentation + +## Using the JSONRPC endpoints + +::: tip +A single endpoint is used for all API requests, located at `/rpc` +::: + +The API used in the app complies with the JSON-RPC 2.0 spec. Requests should always be sent with `POST` and +contain a JSON request object in the body. The response will also be in the form of a JSON object. + +For example, to call the method `subtract(a, b)`, send a `POST` request to `/rpc` with the following JSON +in the body: + +```json +{ + "jsonrpc":"2.0", + "method":"subtract", + "params":[ + 42, + 23 + ], + "id":1 +} +``` + +Variables are passed as a list to the `params` field, in this case `a=42` and `b=23`. The `id` field in the top +level of the request object is the message ID. This value is echoed back in the response; +it does not affect the method that is being called.
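
The request above could be built and sent from Python roughly as follows. This is a sketch using only the standard library: the helper names `build_rpc_request` and `call_rpc` are hypothetical, and the base URL is whatever host your own instance runs on. A real deployment may additionally require session cookies or a CSRF token, which this sketch does not handle.

```python
import json
import urllib.request


def build_rpc_request(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": list(params),  # positional parameters, passed as a list
        "id": request_id,        # echoed back in the response
    }


def call_rpc(base_url, method, params, request_id=1):
    """POST the request to <base_url>/rpc and return the decoded JSON response."""
    body = json.dumps(build_rpc_request(method, params, request_id)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/rpc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, `build_rpc_request("subtract", [42, 23])` produces the same object as the JSON shown above, and `call_rpc` wraps it in a `POST` to the single `/rpc` endpoint.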
+ +The response will be as follows: + +```json +{ + "jsonrpc":"2.0", + "result":19, + "id":1 +} +``` + +In the case of errors, the response will contain an `error` field with error `code` and error `message`: + +```json +{ + "jsonrpc":"2.0", + "error":{ + "code":-32601, + "message":"Method not found" + }, + "id":"1" +} +``` + +The following are error codes used in the app: + +```python +PARSE_ERROR = -32700 +INVALID_REQUEST = -32600 +METHOD_NOT_FOUND = -32601 +INVALID_PARAMS = -32602 +INTERNAL_ERROR = -32603 +AUTHENTICATION_ERROR = -32000 +UNAUTHORIZED_ERROR = -32001 +``` + +## API Listing + + + +### initialise() + + +::: tip Description +Provide the initial context information to initialise the Teamware app + + context_object: + user: + isAuthenticated: bool + isManager: bool + isAdmin: bool + configs: + docFormatPref: bool + global_configs: + allowUserDelete: bool +::: + + + + + + + + +### is_authenticated() + + +::: tip Description +Checks that the current user has logged in. +::: + + + + + + + + +### login(payload) + + + + +#### Parameters + +* payload + + + + + + + +### logout() + + + + + + + + + +### register(payload) + + + + +#### Parameters + +* payload + + + + + + + +### generate_user_activation(username) + + + + +#### Parameters + +* username + + + + + + + +### activate_account(username,token) + + + + +#### Parameters + +* username + +* token + + + + + + + +### generate_password_reset(username) + + + + +#### Parameters + +* username + + + + + + + +### reset_password(username,token,new_password) + + + + +#### Parameters + +* username + +* token + +* new_password + + + + + + + +### change_password(payload) + + + + +#### Parameters + +* payload + + + + + + + +### change_email(payload) + + + + +#### Parameters + +* payload + + + + + + + +### set_user_receive_mail_notifications(do_receive_notifications) + + + + +#### Parameters + +* do_receive_notifications + + + + + + + +### set_user_document_format_preference(doc_preference) + + + + +#### Parameters + +* 
doc_preference + + + + + + + +### get_user_details() + + + + + + + + + +### get_user_annotated_projects() + + +::: tip Description +Gets a list of projects that the user has annotated +::: + + + + + + + + +### get_user_annotations_in_project(project_id,current_page,page_size) + + +::: tip Description +Gets a list of documents in a project where the user has performed annotations in. + :param project_id: The id of the project to query + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* project_id + +* current_page + +* page_size + + + + + + + +### user_delete_personal_information() + + + + + + + + + +### user_delete_account() + + + + + + + + + +### create_project() + + + + + + + + + +### delete_project(project_id) + + + + +#### Parameters + +* project_id + + + + + + + +### update_project(project_dict) + + + + +#### Parameters + +* project_dict + + + + + + + +### get_project(project_id) + + + + +#### Parameters + +* project_id + + + + + + + +### clone_project(project_id) + + + + +#### Parameters + +* project_id + + + + + + + +### import_project_config(pk,project_dict) + + + + +#### Parameters + +* pk + +* project_dict + + + + + + + +### export_project_config(pk) + + + + +#### Parameters + +* pk + + + + + + + +### get_projects(current_page,page_size,filters) + + +::: tip Description +Gets the list of projects. Query result can be limited by using current_page and page_size and sorted + by using filters. 
+ + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :param filters: Filter option used to search project, currently only string is used to search + for project title + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* current_page + +* page_size + +* filters + + + + + + + +### get_project_documents(project_id,current_page,page_size,filters) + + +::: tip Description +Gets the list of documents and its annotations. Query result can be limited by using current_page and page_size + and sorted by using filters + + :param project_id: The id of the project that the documents belong to, is a required variable + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :param filters: Filter currently only searches for ID of documents + for project title + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* project_id + +* current_page + +* page_size + +* filters + + + + + + + +### get_project_test_documents(project_id,current_page,page_size,filters) + + +::: tip Description +Gets the list of documents and its annotations. 
Query result can be limited by using current_page and page_size + and sorted by using filters + + :param project_id: The id of the project that the documents belong to, is a required variable + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :param filters: Filter currently only searches for ID of documents + for project title + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* project_id + +* current_page + +* page_size + +* filters + + + + + + + +### get_project_training_documents(project_id,current_page,page_size,filters) + + +::: tip Description +Gets the list of documents and its annotations. Query result can be limited by using current_page and page_size + and sorted by using filters + + :param project_id: The id of the project that the documents belong to, is a required variable + :param current_page: A 1-indexed page count + :param page_size: The maximum number of items to return per query + :param filters: Filter currently only searches for ID of documents + for project title + :returns: Dictionary of items and total count after filter is applied {"items": [], "total_count": int} +::: + + + +#### Parameters + +* project_id + +* current_page + +* page_size + +* filters + + + + + + + +### add_project_document(project_id,document_data) + + + + +#### Parameters + +* project_id + +* document_data + + + + + + + +### add_project_test_document(project_id,document_data) + + + + +#### Parameters + +* project_id + +* document_data + + + + + + + +### add_project_training_document(project_id,document_data) + + + + +#### Parameters + +* project_id + +* document_data + + + + + + + +### add_document_annotation(doc_id,annotation_data) + + + + +#### Parameters + +* doc_id + +* annotation_data + + + + + + + +### get_annotations(project_id) + + +::: tip Description +Serialize project annotations as GATENLP format JSON using 
the python-gatenlp interface. +::: + + + +#### Parameters + +* project_id + + + + + + + +### delete_documents_and_annotations(doc_id_ary,anno_id_ary) + + + + +#### Parameters + +* doc_id_ary + +* anno_id_ary + + + + + + + +### get_possible_annotators(proj_id) + + + + +#### Parameters + +* proj_id + + + + + + + +### get_project_annotators(proj_id) + + + + +#### Parameters + +* proj_id + + + + + + + +### add_project_annotator(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### make_project_annotator_active(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### project_annotator_allow_annotation(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### remove_project_annotator(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### reject_project_annotator(proj_id,username) + + + + +#### Parameters + +* proj_id + +* username + + + + + + + +### get_annotation_timings(proj_id) + + + + +#### Parameters + +* proj_id + + + + + + + +### delete_annotation_change_history(annotation_change_history_id) + + + + +#### Parameters + +* annotation_change_history_id + + + + + + + +### get_annotation_task() + + +::: tip Description +Gets the annotator's current task, returns a dictionary about the annotation task that contains all the information + needed to render the Annotate view. 
+::: + + + + + + + + +### get_annotation_task_with_id(annotation_id) + + +::: tip Description +Get annotation task dictionary for a specific annotation_id, must belong to the annotator (or is a manager or above) +::: + + + +#### Parameters + +* annotation_id + + + + + + + +### complete_annotation_task(annotation_id,annotation_data,elapsed_time) + + +::: tip Description +Complete the annotator's current task +::: + + + +#### Parameters + +* annotation_id + +* annotation_data + +* elapsed_time + + + + + + + +### reject_annotation_task(annotation_id) + + +::: tip Description +Reject the annotator's current task +::: + + + +#### Parameters + +* annotation_id + + + + + + + +### change_annotation(annotation_id,new_data) + + +::: tip Description +Adds annotation data to history +::: + + + +#### Parameters + +* annotation_id + +* new_data + + + + + + + +### get_document(document_id) + + +::: tip Description +Obsolete: to be deleted +::: + + + +#### Parameters + +* document_id + + + + + + + +### get_annotation(annotation_id) + + +::: tip Description +Obsolete: to be deleted +::: + + + +#### Parameters + +* annotation_id + + + + + + + +### annotator_leave_project() + + +::: tip Description +Allow annotator to leave their currently associated project. 
+::: + + + + + + + + +### get_all_users() + + + + + + + + + +### get_user(username) + + + + +#### Parameters + +* username + + + + + + + +### admin_update_user(user_dict) + + + + +#### Parameters + +* user_dict + + + + + + + +### admin_update_user_password(username,password) + + + + +#### Parameters + +* username + +* password + + + + + + + +### admin_delete_user_personal_information(username) + + + + +#### Parameters + +* username + + + + + + + +### admin_delete_user(username) + + + + +#### Parameters + +* username + + + + + + + +### get_privacy_policy_details() + + + + + + + + + +### get_endpoint_listing() + + + + + + + + + + + diff --git a/docs/versioned/2.2.0/developerguide/documentation.md b/docs/versioned/2.2.0/developerguide/documentation.md new file mode 100644 index 00000000..3de09904 --- /dev/null +++ b/docs/versioned/2.2.0/developerguide/documentation.md @@ -0,0 +1,61 @@ +# Managing and versioning documentation + +Documentation versioning is managed by the custom node script located at `docs/manage_versions.js`. Versions of the documentation can be archived and the entire documentation site can be built using the script. + +Various configuration parameters used for management of documentation versioning can be found in `docs/docs.config.js`. + +## Installing dependencies required to serve the documentation site + +The documentation uses vuepress and other libraries, which have to be installed separately by running the following command from the root of the project: + +```bash +npm run install:docs +``` + +## Editing the documentation + +The latest version of the documentation is located at `/docs/docs`. The archived (versioned) documentation is located in `/docs/versioned/version_number`. + +Use the following command to live preview the latest version of the documentation: + +``` +npm run serve:docs +``` + +Note that this will not work with other versioned docs as they are managed as a separate site.
To live preview versioned documentation use the command (replace version_num with the version you'd like to preview): + +``` +vuepress dev docs/versioned/version_num +``` + +## Creating a new documentation version + +To create a version of the documentation, run the command: + +``` +npm run docs:create_version +``` + +This creates a copy of the current set of documentation in `/docs/docs` and places it at `/docs/versioned/version_num`. The version number in `package.json` is used for the documentation version. + +Each set of documentation can be considered as a separate vuepress site. Each one has a `.vuepress/versions.json` file that contains the listing of all versions, allowing them to link to each other. + +Note: Versions can also be created manually by running the command: + +``` +# Replace version_num with the version you'd like to create +node docs/manage_versions.js create version_num +``` + + +## Building documentation site + +To build the documentation site, the previous documentation build command is used: + +``` +npm run build:docs +``` + +## Implementation of the version selector UI + +A partial override of the default Vuepress theme was needed to add a custom component to the navigation bar. The modified version of the `NavBar` component can be found in `/docs/docs/.vuepress/theme/components/NavBar.vue`. The modified NavBar uses the `VersionSelector` (`/docs/docs/.vuepress/theme/components/VersionSelector.vue`) component which reads from the `.vuepress/versions.json` from each set of documentation. diff --git a/docs/versioned/2.2.0/developerguide/frontend.md b/docs/versioned/2.2.0/developerguide/frontend.md new file mode 100644 index 00000000..ae2aeff0 --- /dev/null +++ b/docs/versioned/2.2.0/developerguide/frontend.md @@ -0,0 +1,146 @@ +# Frontend + +The web GUI of Teamware is built with [vue.js](https://vuejs.org) version 2.7.x. + +[Bootstrap](https://getbootstrap.com/) (and [Bootstrap vue](https://bootstrap-vue.org/)) provides the visual styling.
+ +[Vite.js](https://vitejs.dev/) is used to bundle Vue code and other javascript dependencies for deployment and to serve as a frontend dev server (which runs alongside the django dev server) while testing or debugging. + +## Getting started + +### Installation +``` +npm install +``` + +### Compiles and hot-reloads for development +``` +npm run serve +``` + +### Compiles and minifies for production +``` +npm run build +``` + +### Testing + +**Tools used for testing:** +* [vitest](https://vitest.dev) - Used for unit testing (code without UI components) +* [cypress](https://docs.cypress.io) - Used for tests that contain (Vue) UI components +* [Vue test utils](https://vue-test-utils.vuejs.org) - Used for rendering Vue components so that they can be mounted for unit testing. Officially recommended by Vue.js. + +* Tests for the frontend are all located in the `/frontend/tests` folder. + * Unit test files should all be placed in the `/frontend/tests/unit/` folder and have the extension `.spec.js`. + * Component test files should all be placed in the `/frontend/tests/component` folder and have the extension `.cy.js` +* Test fixtures (data used in running the tests) are placed in the `/examples` folder; this folder is shared with the integration tests + +To run all frontend tests (unit and component tests): + +``` +npm run test +``` + +To run unit tests only: + +``` +npm run test:unit +``` + +To run component tests only: + +``` +npm run test:component +``` + +## Notes when coming from the previous version <=2.0.0 + +- The `@` alias can still be used when doing module imports but file extensions should now be used when importing `.vue` files e.g. + - Before: `import DeleteModal from "@/components/DeleteModal"` + - Now: `import DeleteModal from "@/components/DeleteModal.vue"` +- For code that is intended to run on the browser, e.g. in all `.vue` files, imports should use the ES 6 compliant `import` command and not node/commonjs's `require` + - **Exceptions for code that is run directly by node**, e.g.
scripts used in the build chain, config files and test files used by build tools that run on node (e.g. vuepress or cypress) + + +## Explanation of the frontend + +### Vue and Vite + +Instead of separating html, css and javascript files, Vue has its own `single-file component` format normally with `.vue` extension ([reason why this file format is used](https://vuejs.org/guide/scaling-up/sfc.html)). Here is an example `.vue` file: + +```vue + + + + + +``` + +This means that `.vue` files cannot be directly imported into a standard html page. A tool has to be used for converting `.vue` files into standard javascript and/or css files; this is where [Vite.js](https://vitejs.dev/) comes in. + +[Vite.js](https://vitejs.dev/) is a tool that, amongst many other things, provides a dev server allowing hot module replacement (ability to immediately see changes in the UI during development) and bundling of javascript modules and other resources (css, images, etc.) i.e. not having to individually import each javascript file and its dependencies from the main page. A [Vue plugin](https://github.com/vitejs/vite-plugin-vue2) is used to automatically convert `.vue` files into plain javascript as part of the bundling process. + +### App entrypoint (main.js) and routing + +The application's main entrypoint is `/frontend/src/main.js` which loads dependencies like Vue, Bootstrap Vue as well as loading the main component `AnnotationApp.vue` into a html page that contains a `
` tag. + +The `AnnotationApp.vue` component contains the special `` tag ([vue router](https://router.vuejs.org/)) which allows us to map url paths to specific vue components. The routing configuration can be found in `/frontend/src/router/index.js`, for example: + +```js +const routes = [ + { + path: '/', + name: 'Home', + component: Home, + meta: {guest: true}, + }, +... +``` + +The route shown above maps the root path e.g. `https://your-deployed-teamware-domain.com/` to the `Home.vue` component. Specifically, when pointing your browser to that path, the `Home.vue` component is inserted inside ``. + +### index.html, templates and bundling + +A html page is required to place our application in. Teamware uses Django to serve up the main html page which is located at `/frontend/templates/index.html` (see `MainView` class in `/backend/views.py`). This `index.html` page has to know where to load the generated javascript files. Where these files are located differs depending on whether you're running the vite development server or using vite's statically built files. + +#### Using vite's development server (Django's `settings.FRONTEND_DEV_SERVER_USE` is `True`) +During development we expect to be running the vite dev server alongside the django server (when running `npm run serve` from the root of the project). In this case `index.html` imports javascript directly from the vite dev server: + +```html + + +``` + +This applies when running the `base`, `test` and `integration` django configurations. + +#### Using vite's statically built assets (Django's `settings.FRONTEND_DEV_SERVER_USE` is `False`) +When deploying the application, vite converts `.vue` files into plain javascript and bundles them to the `/frontend/dist/static` directory. The `/frontend/src/main.js` becomes `/frontend/dist/static/assets/main-bb58d055.js`.
The scripts are imported as static assets instead of going through the vite server, for example: + +```html + + +``` + +This applies when running the `deployment`, `docker-test` and `docker-integration` django configurations. + +#### index.html generation + +You may have noticed that a hash is added to the generated asset files (e.g. `main-bb58d055.js`) and this hash changes every time Vite builds the code. This means the `index.html` must also be re-generated after every Vite build. + +A simple build script, `/frontend/build_template.js`, runs after every vite build and performs this generation by taking the base template `/frontend/base_index.html`, merging it with Vite's generated manifest `/frontend/dist/manifest.json` and writing the output, with the correct import paths, to `/frontend/templates/index.html`. + diff --git a/docs/versioned/2.2.0/developerguide/releases.md b/docs/versioned/2.2.0/developerguide/releases.md new file mode 100644 index 00000000..8a08d12b --- /dev/null +++ b/docs/versioned/2.2.0/developerguide/releases.md @@ -0,0 +1,18 @@ +# Managing Releases + +*These instructions are primarily intended for the maintainers of Teamware.* + +Note: Releases are always made from the `master` branch of the repository. + +## Steps to making a release + +1. **Update the changelog** - This has to be done manually; go through any pull requests to `dev` since the last release. + - In the github pull requests page, use the search term `is:pr merged:>=yyyy-mm-dd` to find all PRs merged since the date of the last version change. + - Include the changes in the `CHANGELOG.md` file; the changelog section _MUST_ begin with a level-two heading that starts with the relevant version number in square brackets (`## [N.M.P] Optional descriptive suffix`) as the GitHub workflow that creates a release from the eventual tag depends on this pattern to find the right release notes. Each main item within the changelog should have a link to the originating PR e.g.
\[#123\](https://github.com/GateNLP/gate-teamware/pull/123). +1. **Update and check the version numbers** - from the teamware directory run `python version.py check` to check whether all version numbers are up to date. If not, update the master `VERSION` file and run `python version.py update` to update all other version numbers and commit the result. Alternatively, run `python version.py update ` where `` is the version number to update to, e.g. `python version.py update 2.1.0`. Note that `version.py` requires `pyyaml` for reading `CITATION.cff`, `pyyaml` is included in Teamware's dependencies. +1. **Create a version of the documentation** - Run `npm run docs:create_version`, this will archive the current version of the documentation using the version number in `package.json`. +1. **Create a pull request from `dev` to `master`** including any changes to `CHANGELOG.md`, `VERSION`. +1. **Create a tag** - Once the dev-to-master pull request has been merged, create a tag from the resulting `master` branch named `vN.M.P` (i.e. the new version number prefixed with the letter `v`). This will trigger two GitHub workflows: + - one that builds versioned Docker images for this release and pushes them to `ghcr.io`, updating the `latest` image tag to point to the new release + - one that creates a "release" on GitHub with the necessary artifacts to make the `https://gate.ac.uk/get-teamware.sh` installation mechanism work correctly. The release notes for this release will be generated by extracting the matching section from `CHANGELOG.md`. +1. **Update the Helm chart** - Create a new branch on [https://github.com/GateNLP/charts](https://github.com/GateNLP/charts) to update the `appVersion` of the `gate-teamware` Helm chart to match the version that was just created by the tag workflow. You must also update the chart `version`, bumping the major version number if the new chart is not backwards-compatible with the old. 
Submit a pull request to the `main` branch, which will publish the new chart when it is merged. diff --git a/docs/versioned/2.2.0/developerguide/testing.md b/docs/versioned/2.2.0/developerguide/testing.md new file mode 100644 index 00000000..578501b2 --- /dev/null +++ b/docs/versioned/2.2.0/developerguide/testing.md @@ -0,0 +1,182 @@ +# Testing +All the tests can be run using the following command: + +```bash +npm run test +``` + +## Backend Testing +Pytest is used for testing the backend. + +```bash +npm run test:backend +``` + +### Backend test files + +* Unit test files are located in `/backend/tests` + +## Frontend testing +[Jest](https://jestjs.io/) is used for frontend testing. +The [Vue testing-library](https://testing-library.com/docs/vue-testing-library/intro/) is used for testing +Vue components. + +```bash +npm run test:frontend +``` + +### Frontend test files + +* Frontend test files are located in `/frontend/tests/unit` and should have the extension `.spec.js` + +### Testing JS functions + +```javascript +describe("Description of a group of tests to be run", () =>{ + + beforeAll(() =>{ + //The code here is run once, before all the tests in this group + }) + + it("A single test's description", async () =>{ + + // Assertions are done with the expect() function e.g. + let funcOutput = 30 + 10 + expect(funcOutput).toBe(40) + + + }) +}) + +``` + +### Mocking JS classes + +This is an example of a mock harness for the JRPCClient class. + +A mock file is created inside a ``__mock__`` directory placed next to the file that's being mocked, e.g. +for our JRPCClient class at `/frontend/src/jrpc/index.js`, the mock file is `/frontend/src/jrpc/__mock__/index.js`. 
+ + +Inside the mock file `/frontend/src/jrpc/__mock__/index.js`: +```javascript +// Mocking jrpc/index.js +//Mocking the JRPCClient class +//Replacing the call function with a custom mockCall function +export const mockCall = jest.fn(()=> 30); +const mock = jest.fn().mockImplementation(() => { + return {call: mockCall}; +}); + +export default mock; +``` + + +Inside the test file `*.spec.js`: +```javascript +import JRPCClient from "@/jrpc"; +jest.mock('@/jrpc') + +import store from '@/store' +//Example on how to mock the jrpc call + +describe("Vuex functions testing", () =>{ + + beforeAll(() =>{ + + //Re-implement custom mock call implementation if needed + JRPCClient.mockImplementation(()=>{ + return { + call(){ + return 50 + } + } + }) + + }) + + it("testfunc", async () =>{ + + const noutput = await store.dispatch("testnormal") + expect(noutput).toBe("Hello world") + + const aoutput = await store.dispatch("testasync") + expect(aoutput).toBe("Hello world") + + const rpc = new JRPCClient("/") + const result = await rpc.call("some param") + expect(result).toBe(50) + + }) +}) +``` + +### Testing Vue components + + +```javascript +//Example of how a component could be tested +import { render, fireEvent } from '@testing-library/vue' + + +import HelloWorld from '@/components/HelloWorld.vue' + +//Testing a component e.g. HelloWorld +describe('HelloWorld.vue', () => { + + it('renders props.msg when passed', () => { + const msg = 'new message' + const { getByText } = render(HelloWorld) + + getByText("Installed CLI Plugins") + }) +}) + +``` + + +## Integration testing +[Cypress](https://www.cypress.io/) is used for integration testing. 
+ +The integration settings are located at `teamware/settings/integration.py` + +To run the integration test: +```bash +npm run test:integration +``` + +The test can also be run in **interactive mode** using: + +```bash +npm run serve:cypressintegration +``` + +### Integration test files +Files related to integration testing are located in `/cypress` + +* Test files are located in the `/cypress/integration` directory and should have the extension `.spec.js`. + +### Re-seeding the database + +The command `npm run migrate:integration` resets the database and performs migration, use with `beforeEach` to run it +before every test case in a suite: + +```js +describe('Example test suite', () => { + + beforeEach(() => { + // Resets the database every time before + // the test is run + cy.exec('npm run migrate:integration') + }) + + it('Test case 1', () => { + // Test something + }) + + it('Test case 2', () => { + // Test something + }) +}) +``` + diff --git a/docs/versioned/2.2.0/img/gate-teamware-logo.svg b/docs/versioned/2.2.0/img/gate-teamware-logo.svg new file mode 100644 index 00000000..12385947 --- /dev/null +++ b/docs/versioned/2.2.0/img/gate-teamware-logo.svg @@ -0,0 +1,79 @@ + + + + diff --git a/docs/versioned/2.2.0/manageradminguide/README.md b/docs/versioned/2.2.0/manageradminguide/README.md new file mode 100644 index 00000000..7a70a19f --- /dev/null +++ b/docs/versioned/2.2.0/manageradminguide/README.md @@ -0,0 +1,45 @@ +# GATE Teamware Overview + +## User roles + +There are three types of users in GATE Teamware, [annotators](#annotators), [managers](#managers) +and [admins](#admins). + +### Annotators + +Annotator is the default role when signing up to Teamware. An annotator can be recruited into +annotation projects and annotate documents. + + +### Managers + +Managers can create, view and modify annotation projects. They can also recruit annotators to a project. 
+ +### Admins + +In addition to what managers can do, admins can also manage the users in the system and elevate them to +managers or admins. + +## Annotation Projects, Documents and Annotations + +Projects, documents and annotations form the core of the application. + +### Projects + +An annotation project contains a configuration of how annotations are to be captured, the documents and their +annotations and the recruited annotators. + + +### Documents + +A document in the application refers to an individual piece of arbitrary text that's to be annotated. A document +is stored as an arbitrary JSON object and can represent various things, such as a single post (e.g. a tweet +or a post from Reddit), a pair of source post and reply, or a part of an HTML web page. + + +### Annotations + +An annotation represents a single annotation task against a single document. Like the document, +an annotation is stored as an arbitrary JSON object and can have any arbitrary structure. + + diff --git a/docs/versioned/2.2.0/manageradminguide/annotators_management.md b/docs/versioned/2.2.0/manageradminguide/annotators_management.md new file mode 100644 index 00000000..cb4c79b0 --- /dev/null +++ b/docs/versioned/2.2.0/manageradminguide/annotators_management.md @@ -0,0 +1,13 @@ +# Annotators management + +The **Annotators** tab in the **Project management** page allows the viewing and management of annotators in the project. + +Add annotators to the project by clicking on the list of names in the right column. Current annotators +can be removed by clicking on the names in the left column. Removing annotators does not delete their +completed annotations but will stop their current pending annotation task. + +An annotator can only be recruited into **one project at a time**. + +Once an annotator has annotated a proportion of documents in the project (specified in project configuration), they will +be deemed to have completed all their annotation tasks and automatically be removed from the project. 
This frees them to be +recruited in another project. diff --git a/docs/versioned/2.2.0/manageradminguide/config_examples.js b/docs/versioned/2.2.0/manageradminguide/config_examples.js new file mode 100644 index 00000000..d6e40454 --- /dev/null +++ b/docs/versioned/2.2.0/manageradminguide/config_examples.js @@ -0,0 +1,332 @@ +export default { + config1: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + } + ], + config2: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + }, + { + "name": "opinion", + "type": "text", + "title": "What's your opinion of the above text?", + "optional": true + } + ], + configDisplay: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + } + ], + configDisplayHtmlNoHtml: [ + { + "name": "htmldisplay", + "type": "html", + "text": "No HTML: {{text}}
HTML: {{{text}}}" + } + ], + configDisplayCustomFieldnames: [ + { + "name": "htmldisplay", + "type": "html", + "text": "Custom field: {{customField}}
Another custom field: {{{anotherCustomField}}}
Subfield: {{{subfield.subfieldContent}}}" + } + ], + configDisplayPreserveNewlines: [ + { + "name": "htmldisplay", + "type": "html", + "text": "
{{text}}
" + } + ], + configTextInput: [ + { + "name": "mylabel", + "type": "text", + "optional": true, //Optional - Set if validation is not required + "regex": "regex string", //Optional - When specified, the regex pattern will used to validate the text + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + configTextarea: [ + { + "name": "mylabel", + "type": "textarea", + "optional": true, //Optional - Set if validation is not required + "regex": "regex string", //Optional - When specified, the regex pattern will used to validate the text + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + configRadio: [ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "orientation": "vertical", //Optional - default is "horizontal" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + configRadioHelpText: [ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "orientation": "vertical", //Optional - default is "horizontal" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1", "helptext": "Additional help text for 
option 1"}, + {"value": "value2", "label": "Text to show user 2", "helptext": "Additional help text for option 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message when the field is validated", //Optional + "valError": "Error message when the field fails validation" //Optional + } + ], + configCheckbox: [ + { + "name": "mylabel", + "type": "checkbox", + "optional": true, //Optional - Set if validation is not required + "orientation": "horizontal", //Optional - "horizontal" (default) or "vertical" + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "minSelected": 1, //Optional - Specify the minimum number of options that must be selected + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + configSelector: [ + { + "name": "mylabel", + "type": "selector", + "optional": true, //Optional - Set if validation is not required + "options": [ // The options that the user is able to select from + {"value": "value1", "label": "Text to show user 1"}, + {"value": "value2", "label": "Text to show user 2"}, + {"value": "value3", "label": "Text to show user 3"} + ], + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + configRadioDict: [ + { + "name": "mylabel", + "type": "radio", + "optional": true, //Optional - Set if validation is not required + "options": { // The options can be specified as a 
dictionary, ordering is not guaranteed + "value1": "Text to show user 1", + "value2": "Text to show user 2", + "value3": "Text to show user 3", + }, + "title": "Title string", //Optional + "description": "Description string", //Optional + "valSuccess": "Success message then field is validated", //Optional + "valError": "Error message when field fails is validation" //Optional + } + ], + + configDbpediaExample: [ + { + "name": "uri", + "type": "radio", + "title": "Select the most appropriate URI", + "options":[ + {"fromDocument": "candidates"}, + {"value": "none", "label": "None of the above"}, + {"value": "unknown", "label": "Cannot be determined without more context"} + ] + } + ], + docDbpediaExample: { + "text": "President Bush visited the air base yesterday...", + "candidates": [ + { + "value": "http://dbpedia.org/resource/George_W._Bush", + "label": "George W. Bush (Jnr)" + }, + { + "value": "http://dbpedia.org/resource/George_H._W._Bush", + "label": "George H. W. Bush (Snr)" + } + ] + }, + + configConditional1: [ + { + "name": "uri", + "type": "radio", + "title": "Select the most appropriate URI", + "options":[ + {"fromDocument": "candidates"}, + {"value": "other", "label": "Other"} + ] + }, + { + "name": "otherValue", + "type": "text", + "title": "Please specify another value", + "if": "annotation.uri == 'other'", + "regex": "^(https?|urn):", + "valError": "Please specify a URI (starting http:, https: or urn:)" + } + ], + configConditional2: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + }, + { + "name": "reason", + "type": "text", + "title": "Why do you disagree with the suggested value?", + "if": "annotation.sentiment !== 
document.preanno.sentiment" + } + ], + docsConditional2: [ + { + "text": "I love the thing!", + "preanno": { + "sentiment": "positive" + } + }, + { + "text": "I hate the thing!", + "preanno": { + "sentiment": "negative" + } + }, + { + "text": "The thing is ok, I guess...", + "preanno": { + "sentiment": "neutral" + } + } + ], + + + doc1: {text: "Sometext with html"}, + doc2: { + customField: "Content of custom field.", + anotherCustomField: "Content of another custom field.", + subfield: { + subfieldContent: "Content of a subfield." + } + }, + docPlainText: { + "text": "This is some text\n\nIt has line breaks that we want to preserve." + }, + configPreAnnotation: [ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "radio", + "type": "radio", + "title": "Test radio input", + "options": [ + {"value": "val1", "label": "Value 1"}, + {"value": "val2", "label": "Value 2"}, + {"value": "val3", "label": "Value 4"}, + {"value": "val4", "label": "Value 5"} + ], + "description": "Test radio description" + }, + { + "name": "checkbox", + "type": "checkbox", + "title": "Test checkbox input", + "options": [ + {"value": "val1", "label": "Value 1"}, + {"value": "val2", "label": "Value 2"}, + {"value": "val3", "label": "Value 4"}, + {"value": "val4", "label": "Value 5"} + ], + "description": "Test checkbox description" + }, + { + "name": "text", + "type": "text", + "title": "Test text input", + "description": "Test text description" + } + + ], + docPreAnnotation: { + "id": 12345, + "text": "Example document text", + "preannotation": { + "radio": "val1", + "checkbox": ["val1", "val3"], + "text": "Pre-annotation text value" + } + } + + +} diff --git a/docs/versioned/2.2.0/manageradminguide/documents_annotations_management.md b/docs/versioned/2.2.0/manageradminguide/documents_annotations_management.md new file mode 100644 index 00000000..7b852340 --- /dev/null +++ b/docs/versioned/2.2.0/manageradminguide/documents_annotations_management.md @@ -0,0 
+1,296 @@ +# Documents & Annotations + +The **Documents & Annotations** tab in the **Project management** page allows the viewing and management of documents +and annotations related to the project. + +## Document & Annotation status + +### Annotation status + +Annotations can be in 1 of 5 states: + +* Annotation is completed - The annotator has completed this annotation task. +* Annotation is rejected - The annotator has chosen to not annotate the document. +* Annotation is timed out - The annotation task was not completed within the time specified in the project's configuration. The task is freed and can be assigned to another annotator. +* Annotation is aborted - The annotation task was aborted due to reasons other than timing out, such as when an annotator with a pending task is removed from a project. +* Annotation is pending - The annotator has started the annotation task but has not completed it. + +### Document status + +Documents also display a list of its current annotation status: + +* 1 - Number of completed annotations in the document. +* 1 - Number of rejected annotations in the document. +* 1 - Number of timed out annotations in the document. +* 1 - Number of aborted annotations in the document. +* 1 - Number of pending annotations in the document. + +## Importing documents + +Documents can be imported using the **Import** button. The supported file types are: + +* `.json` - The app expects a list of documents (represented as a dictionary object) + e.g. `[{"id": 1, "text": "Text1"}, ...]`. +* `.jsonl` - The app expects one document (represented as a dictionary object) per line. +* `.csv` - File must have a header row. It will be internally converted to JSON format. +* `.zip` - Can contain any number of `.json,.jsonl and .csv` files inside. + +### Importing documents with pre-annotation + +In the `Project Configurations` page, it is possible to set a field in which Teamware will look for pre-annotation. 
If +the field is found inside the document then the annotation form will be pre-filled with data provided in the document. + +The format for pre-annotation is exactly the same as the annotation output. You can see an example of generated +annotation by filling out the form in the `Annotation Preview` and observing the values in +the `Annotation Output Preview`. + + +For an example project configuration shown below, there are three captured labels named `radio`, `checkbox` and `text`: + +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "radio", + "type": "radio", + "title": "Test radio input", + "options": [ + {"value": "val1", "label": "Value 1"}, + {"value": "val2", "label": "Value 2"}, + {"value": "val3", "label": "Value 4"}, + {"value": "val4", "label": "Value 5"} + ], + "description": "Test radio description" + }, + { + "name": "checkbox", + "type": "checkbox", + "title": "Test checkbox input", + "options": [ + {"value": "val1", "label": "Value 1"}, + {"value": "val2", "label": "Value 2"}, + {"value": "val3", "label": "Value 4"}, + {"value": "val4", "label": "Value 5"} + ], + "description": "Test checkbox description" + }, + { + "name": "text", + "type": "text", + "title": "Test text input", + "description": "Test text description" + } +] +``` + +On the `Project Configuration` page, if the `Pre-annotation` field is set to `preannotation`, the annotation form will be pre-filled with +the content provided in the `preannotation` field of the document e.g.: + +```json +{ + "id": 12345, + "text": "Example document text", + "preannotation": { + "radio": "val1", + "checkbox": [ + "val1", + "val3" + ], + "text": "Pre-annotation text value" + } +} +``` + +The example of the pre-filled form can be seen by clicking on the `Preview` tab above. 
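
Because the pre-annotation uses the same format as the annotation output, it can be checked against the project configuration before documents are imported. The following is a minimal sketch of such a check, not part of Teamware: the helper name `checkPreAnnotation` is hypothetical, and it assumes that options are given as a list of `{value, label}` objects and that the pre-annotation field is a top-level field of the document.

```javascript
// Hypothetical helper, not part of Teamware: reports pre-annotation values
// that do not match the widgets declared in the project configuration.
function checkPreAnnotation(config, doc, preAnnotationField) {
  const problems = [];
  const preAnnotation = doc[preAnnotationField];
  if (preAnnotation == null) {
    return problems; // No pre-annotation: the form simply starts empty.
  }

  // Index widgets by name so each pre-annotation key can be looked up.
  const widgets = new Map(config.map((widget) => [widget.name, widget]));

  for (const [name, value] of Object.entries(preAnnotation)) {
    const widget = widgets.get(name);
    if (!widget) {
      problems.push(`"${name}" is not a widget in the configuration`);
      continue;
    }
    if (!Array.isArray(widget.options)) {
      continue; // Widgets without an options list accept any value in this sketch.
    }
    const allowed = new Set(widget.options.map((option) => option.value));
    // Checkbox values are lists; radio values are single strings.
    const chosen = Array.isArray(value) ? value : [value];
    for (const item of chosen) {
      if (!allowed.has(item)) {
        problems.push(`"${item}" is not an option of "${name}"`);
      }
    }
  }
  return problems;
}

// Shortened version of the configuration and document shown above.
const config = [
  { name: "htmldisplay", type: "html", text: "{{{text}}}" },
  { name: "radio", type: "radio", options: [
    { value: "val1", label: "Value 1" },
    { value: "val2", label: "Value 2" },
  ] },
  { name: "checkbox", type: "checkbox", options: [
    { value: "val1", label: "Value 1" },
    { value: "val3", label: "Value 4" },
  ] },
  { name: "text", type: "text" },
];

const doc = {
  id: 12345,
  text: "Example document text",
  preannotation: {
    radio: "val1",
    checkbox: ["val1", "val3"],
    text: "Pre-annotation text value",
  },
};

// An empty list means every pre-annotation value matches the configuration.
console.log(checkPreAnnotation(config, doc, "preannotation"));
```

A check like this could be run over a document file before uploading it, so that misspelled widget names or option values are caught before annotators see a partially filled form.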
+ + + + + + +### Importing Training and Test documents + +When importing documents for the training and testing phase, Teamware expects a field/column (called `gold` by default) +that contains the correct annotation response for each label and, only for training documents, an explanation. + +For example, if we're expecting a multi-choice label for doing sentiment classification with a widget named `sentiment` +and a choice of `positive`, `negative` and `neutral`: + +```js +[ + { + "text": "What's my sentiment", + "gold": { + "sentiment": { + "value": "positive", // For this document, the correct value is positive + "explanation": "Because..." // The explanation is only shown in the training phase and is optional in test documents + } + } + } +] +``` + +in CSV: + +| text | gold.sentiment.value | gold.sentiment.explanation | +| --- | --- | --- | +| What's my sentiment | positive | Because... | + +### Guidance on CSV column headings + +It is recommended that: + +* Spaces are not used in column headings; use a dash (`-`), underscore (`_`) or camel case (e.g. fieldName) instead. +* The dot/full stop (`.`) is used to indicate hierarchical information so don't use it if that's not what's intended. + An explanation of this feature is given below. + +Documents imported from CSV files are converted to JSON for use internally in Teamware; the reverse is true when +converting back to CSV. To allow a CSV to represent a hierarchical structure, a dot notation is used to indicate a +sub-field. + +In the following example, we can see that `gold` has a child field named `sentiment` which then has a child field +named `value`: + +| text | gold.sentiment.value | gold.sentiment.explanation | +| --- | --- | --- | +| What's my sentiment | positive | Because... 
| + +The above column headers will generate the following JSON: + +```js +[ + { + "text": "What's my sentiment", + "gold": { + "sentiment": { + "value": "positive", // For this document, the correct value is positive + "explanation": "Because..." // The explanation is only shown in the training phase and is optional in test documents + } + } + } +] +``` + +## Exporting documents + +Documents and annotations can be exported using the **Export** button. A zip file is generated containing files with 500 +documents each. The option to "anonymize annotators" controls whether the individual annotators are identified with +their numeric ID or by their actual username. Since usernames are often personally identifiable information (e.g. an +email address), the anonymous mode is recommended if you intend to share the annotation data with third parties. Note +that the anonymous IDs are consistent within a single installation of Teamware, so even in anonymous mode it is still +possible to determine which documents were annotated by _the same person_, just not who that person was. + +You can choose how documents are exported: + +* `.json` & `.jsonl` - JSON or JSON Lines files can be generated in the format of: + * `raw` - Exports the original `JSON` combined with an additional field named `annotation_sets` for storing + annotations. The annotations are laid out in the same way as the GATE + [bdocjs](https://gatenlp.github.io/gateplugin-Format_Bdoc/bdoc_document.html) format. 
For example if a document + has been annotated by `user1` with labels and values `text`:`Annotation text`, `radio`:`val3`, and + `checkbox`:`["val2", "val4"]`, the non-anonymous export might look like this: + + ```json + { + "id": 32, + "text": "Document text", + "text2": "Document text 2", + "feature1": "Feature text", + "annotation_sets":{ + "user1":{ + "name":"user1", + "annotations":[ + { + "type":"Document", + "start":0, + "end":10, + "id":0, + "features":{ + "text":"Annotation text", + "radio":"val3", + "checkbox":[ + "val2", + "val4" + ] + } + } + ], + "next_annid":1 + } + }, + "teamware_status": { + "rejected_by": ["user2"], + "timed_out": ["user3"], + "aborted": [] + } + } + ``` + + In anonymous mode the name `user1` would instead be derived from the user's opaque numeric identifier (e.g. + `annotator105`). + + The field `teamware_status` gives the usernames or anonymous IDs (depending on the "anonymize" setting) of those annotators + who rejected the document, "timed out" because they did not complete their annotation in the time allowed by the + project, or "aborted" for some other reason (e.g. they were removed from the project). + + * `gate` - Convert documents to GATE [bdocjs](https://gatenlp.github.io/gateplugin-Format_Bdoc/bdoc_document.html) + format and export. A `name` field is added that takes the `ID` value from the `ID field` specified in the + **project configuration**. Any top-level fields apart from `text`, `features`, `offset_type`, `annotation_sets`, + and the ID field specified in the project config are placed in the `features` field, as is the `teamware_status` + information. An `annotation_sets` field is added for storing annotations if it doesn't already exist. + + For example in the case of this uploaded JSON document: + ```json + { + "id": 32, + "text": "Document text", + "text2": "Document text 2", + "feature1": "Feature text" + } + ``` + The generated output is as follows. 
The annotations and `teamware_status` are formatted the same as the `raw` output + above: + ```json + { + "name": 32, + "text": "Document text", + "features": { + "text2": "Document text 2", + "feature1": "Feature text", + "teamware_status": {...} + }, + "offset_type":"p", + "annotation_sets": {...} + } + ``` +* `.csv` - The JSON documents will be flattened to CSV's column-based format. Annotations are added as additional + columns with the header of `annotations.username.label` and the status information is in columns named + `teamware_status.rejected_by`, `teamware_status.timed_out` and `teamware_status.aborted`. + +**Note: Documents that contain existing annotations (i.e. the `annotation_sets` field for `JSON` or `annotations` for `CSV`) are merged with the new sets of annotations. Be aware that if the document has a new annotation from an annotator with the same +username, the previous annotation will be overwritten. Existing annotations are also not anonymized when exporting the document.** + +## Deleting documents and annotations + +It is possible to click on the top left corner of documents and annotations to select them, then click on the +**Delete** button to delete them. + +::: tip + +Selecting a document also selects all its associated annotations. + +::: + + + diff --git a/docs/versioned/2.2.0/manageradminguide/project_config.md b/docs/versioned/2.2.0/manageradminguide/project_config.md new file mode 100644 index 00000000..d3d4ea3d --- /dev/null +++ b/docs/versioned/2.2.0/manageradminguide/project_config.md @@ -0,0 +1,684 @@ +--- +sidebarDepth: 3 +--- + +# Project configuration + +The **Configuration** tab in the **Project management** page allows you to change project settings including what +annotations are captured. + +Project configurations can be imported and exported in the format of a JSON file. + +The project can also be cloned (have configurations copied to a new project). 
Note that cloning does not copy +documents, annotations or annotators to the new project. + +## Configuration fields + +* **Name** - The name of this annotation project. +* **Description** - The description of this annotation project that will be shown to annotators. Supports markdown and + HTML. +* **Annotator guideline** - The annotation guidelines for this project that will be shown to annotators. Supports + markdown and HTML. +* **Annotations per document** - The project completes when each document in this annotation project has this + number of valid annotations. When a project completes, all project annotators will be un-recruited and be allowed to + annotate other projects. +* **Maximum proportion of documents annotated per annotator (between 0 and 1)** - A single annotator cannot annotate + more than this proportion of documents. +* **Timeout for pending annotation tasks (minutes)** - Specify the number of minutes a user has to complete an + annotation task (i.e. annotating a single document). +* **Reject documents** - Switching this off will mean that annotators for this project will be unable to choose to reject documents. +* **Document ID field** - The field in your uploaded documents that is used as a unique identifier. GATE's JSON format + uses the `name` field. You can use a dot-delimited key path to access subfields e.g. enter `features.name` to get the id + from the object `{'features':{'name':'nameValue'}}` +* **Training stage enable/disable** - Enable or disable the training stage; when enabled, training documents can be uploaded to the project. +* **Test stage enable/disable** - Enable or disable the testing stage; when enabled, test documents can be uploaded to the project. 
+* **Auto elevate to annotator** - The option works in combination with the training and test stage options; see the table below for the behaviour: + + | Training stage | Testing stage | Auto elevate to annotator | Description | + | --- | --- | --- | --- | + | Disabled | Disabled | Enabled/Disabled | User allowed to annotate without manual approval. | + | Enabled | Disabled | Disabled | Manual approval required. | + | Disabled | Enabled | Disabled | Manual approval required. | + | Enabled | Disabled | Enabled | User always allowed to annotate after training phase completed. | + | Disabled | Enabled | Enabled | User automatically allowed to annotate after passing test; if user fails test they have to be manually approved. | + | Enabled | Enabled | Enabled | User automatically allowed to annotate after passing test; if user fails test they have to be manually approved. | + +* **Test pass proportion** - The proportion of correct test annotations required for a user to be automatically allowed to annotate documents. +* **Gold standard field** - The field in the document's JSON/column that contains the ideal annotation values and explanation for the annotation. +* **Pre-annotation** - Pre-fill the form with the annotation provided in the specified field. See the [Importing Documents with pre-annotation](./documents_annotations_management.md#importing-documents-with-pre-annotation) section for more detail. + +## Annotation configuration + +The annotation configuration takes a `json` string for configuring how the document is displayed to the user and what types +of annotation will be collected. 
Here's an example configuration and a preview of how it is shown to annotators: + + + + +```json +// Example configuration +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + } +] +``` + + + +Within the configuration, it is possible to specify how your documents will be displayed. The **Document input preview** +box can be used to provide a sample of your document for rendering of the preview. + +```json +// Example contents for the Document input preview +{ + "text": "Sometext with html" +} +``` + + + +The above configuration displays the value from the `text` field from the document to be annotated. It then shows a set +of 3 radio inputs that allow the user to select a Negative, Neutral, or Positive sentiment with the label +name `sentiment`. + + + +All fields **require** the properties **name** and **type**, which are used to name the label and to determine the type of +input/display to be shown to the user respectively. + +Another field can be added to collect more information, e.g. a text field for opinions: + + + +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + }, + { + "name": "opinion", + "type": "text", + "title": "What's your opinion of the above text?", + "optional": true + } + +] +``` + + + +Note that in the above case, the `optional` field is added so that the user does not have to input any value. 
+This `optional` field can be used on all components. Any component may optionally have a field named `if`, containing an expression that is used to determine whether or not the component appears based on information in the document and/or the values entered in the other components. For example the user could be presented with a set of options that includes an "other" choice, and if the annotator chooses "other" then an additional free text field appears for them to fill in. The `if` option is described in more detail under the [conditional components](#conditional-components) section below.
+
+Some fields are specific to particular components, e.g. the `options` field is only available for
+the `radio`, `checkbox` and `selector` components. See details below on the usage of each specific component.
+
+The captured annotation results in a JSON dictionary; an example can be seen in the **Annotation output preview** box.
+The annotation is linked to a Document and is converted to a GATE JSON annotation format when exported.
+
+### Displaying text
+
+
+
+```json
+[
+  {
+    "name": "htmldisplay",
+    "type": "html",
+    "text": "{{{text}}}" // The text that will be displayed
+  }
+]
+```
+
+
+
+The `htmldisplay` widget allows you to display the text you want annotated. It accepts almost the full range of HTML
+input, which gives full styling flexibility.
+
+Any field/column from the document can be inserted by surrounding a field/column name with double or
+triple curly brackets. Double curly brackets render the text as-is, while triple curly brackets accept an HTML string:
+
+
+
+Input:
+
+```json
+{
+  "text": "Sometext with html"
+}
+```
+
+Configuration, showing the same field/column in the document as-is or as HTML:
+```json
+[
+  {
+    "name": "htmldisplay",
+    "type": "html",
+    "text": "No HTML: {{text}}<br>HTML: {{{text}}}"
+  }
+]
+```
+
+
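The difference between the two bracket styles can be sketched in JavaScript. This is only an illustration that assumes Mustache-style semantics (double brackets escape HTML, triple brackets pass it through); it is not Teamware's rendering code. The helpers `escapeHtml` and `render` and the sample document are hypothetical, and the sketch handles top-level fields only.

```javascript
// Hypothetical helper: escape the characters that are special in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical renderer: substitutes {{field}} and {{{field}}} in one pass.
function render(template, doc) {
  // The triple-bracket alternative is listed first so it wins over the double form.
  const pattern = /\{\{\{\s*(\w+)\s*\}\}\}|\{\{\s*(\w+)\s*\}\}/g;
  return template.replace(pattern, (match, rawKey, escapedKey) => {
    if (rawKey !== undefined) {
      return String(doc[rawKey] ?? ""); // inserted unchanged, so markup is interpreted
    }
    return escapeHtml(doc[escapedKey] ?? ""); // escaped, so markup is shown literally
  });
}

const doc = { text: "Some <b>text</b>" }; // invented sample document
const escaped = render("{{text}}", doc);
const raw = render("{{{text}}}", doc);
console.log(escaped);
console.log(raw);
```

Under these assumptions, the double-bracket form would show the tags as literal text, while the triple-bracket form would let the browser render them as bold text.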
+
+The widget makes no assumption about your document structure, and any field/column name can be used,
+including sub-fields via dot notation, e.g. `parentField.childField`:
+
+
+
+JSON input:
+
+```json
+{
+  "customField": "Content of custom field.",
+  "anotherCustomField": "Content of another custom field.",
+  "subfield": {
+    "subfieldContent": "Content of a subfield."
+  }
+}
+```
+
+or in CSV:
+
+| customField | anotherCustomField | subfield.subfieldContent |
+| --- | --- | --- |
+| Content of custom field. | Content of another custom field. | Content of a subfield. |
+
+
+Configuration, showing fields/columns from the document as-is or as HTML:
+```json
+[
+  {
+    "name": "htmldisplay",
+    "type": "html",
+    "text": "Custom field: {{customField}}<br>Another custom field: {{{anotherCustomField}}}<br>Subfield: {{{subfield.subfieldContent}}}"
+  }
+]
+```
+
+
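The dot notation can be read as a path through nested objects. The following JavaScript sketch shows one way such a lookup could work; `resolvePath` is a hypothetical helper written for illustration and is not taken from the Teamware code base.

```javascript
// Hypothetical helper: follow a dot-separated path through nested objects.
// Returns undefined as soon as any step of the path is missing.
function resolvePath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Sample document mirroring the JSON input above.
const doc = {
  customField: "Content of custom field.",
  subfield: { subfieldContent: "Content of a subfield." },
};

const nested = resolvePath(doc, "subfield.subfieldContent");
const missing = resolvePath(doc, "subfield.noSuchField.deeper");
console.log(nested);
console.log(missing);
```

In this sketch a path that does not exist resolves to `undefined` rather than throwing an error; how Teamware itself treats missing fields is not stated here.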
+
+If your documents are plain text and include line breaks that need to be preserved when rendering, this can be achieved by using a special HTML wrapper which sets the [`white-space` CSS property](https://developer.mozilla.org/en-US/docs/Web/CSS/white-space).
+
+
+
+**Document**
+
+```json
+{
+  "text": "This is some text\n\nIt has line breaks that we want to preserve."
+}
+```
+
+**Project configuration**
+
+```json
+[
+  {
+    "name": "htmldisplay",
+    "type": "html",
+    "text": "<div style='white-space: pre-line'>{{text}}</div>"
+  }
+]
+```
+
+
+
+`white-space: pre-line` preserves line breaks but collapses other whitespace down to a single space; `white-space: pre-wrap` would preserve all whitespace including indentation at the start of a line, but would still wrap lines that are too long for the available space.
+
+### Text input
+
+
+
+```json
+[
+  {
+    "name": "mylabel",
+    "type": "text",
+    "optional": true, //Optional - Set if validation is not required
+    "regex": "regex string", //Optional - When specified, the regex pattern will be used to validate the text
+    "title": "Title string", //Optional
+    "description": "Description string", //Optional
+    "valSuccess": "Success message when the field is validated", //Optional
+    "valError": "Error message when the field fails validation" //Optional
+  }
+]
+```
+
+
+
+### Textarea input
+
+
+
+```json
+[
+  {
+    "name": "mylabel",
+    "type": "textarea",
+    "optional": true, //Optional - Set if validation is not required
+    "regex": "regex string", //Optional - When specified, the regex pattern will be used to validate the text
+    "title": "Title string", //Optional
+    "description": "Description string", //Optional
+    "valSuccess": "Success message when the field is validated", //Optional
+    "valError": "Error message when the field fails validation" //Optional
+  }
+]
+```
+
+
+
+### Radio input
+
+
+
+```json
+[
+  {
+    "name": "mylabel",
+    "type": "radio",
+    "optional": true, //Optional - Set if validation is not required
+    "orientation": "vertical", //Optional - default is "horizontal"
+    "options": [ // The options that the user is able to select from
+      {"value": "value1", "label": "Text to show user 1"},
+      {"value": "value2", "label": "Text to show user 2"},
+      {"value": "value3", "label": "Text to show user 3"}
+    ],
+    "title": "Title string", //Optional
+    "description": "Description string", //Optional
+    "valSuccess": "Success message when the field is validated", //Optional
+    "valError": "Error message when the field fails validation" //Optional
+  }
+]
+```
+
+
+
+### Checkbox input
+
+
+
+```json
+[
+  {
+    "name": "mylabel",
+    "type": "checkbox",
+    "optional": true, //Optional - Set if validation is not required
+    "orientation": "horizontal", //Optional - "horizontal" (default) or "vertical"
+    "options": [ // The options that the user is able to select from
+      {"value": "value1", "label": "Text to show user 1"},
+      {"value": "value2", "label": "Text to show user 2"},
+      {"value": "value3", "label": "Text to show user 3"}
+    ],
+    "minSelected": 1, //Optional - Overrides optional field. Specify the minimum number of options that must be selected
+    "title": "Title string", //Optional
+    "description": "Description string", //Optional
+    "valSuccess": "Success message when the field is validated", //Optional
+    "valError": "Error message when the field fails validation" //Optional
+  }
+]
+```
+
+
+
+### Selector input
+
+
+
+```json
+[
+  {
+    "name": "mylabel",
+    "type": "selector",
+    "optional": true, //Optional - Set if validation is not required
+    "options": [ // The options that the user is able to select from
+      {"value": "value1", "label": "Text to show user 1"},
+      {"value": "value2", "label": "Text to show user 2"},
+      {"value": "value3", "label": "Text to show user 3"}
+    ],
+    "title": "Title string", //Optional
+    "description": "Description string", //Optional
+    "valSuccess": "Success message when the field is validated", //Optional
+    "valError": "Error message when the field fails validation" //Optional
+  }
+]
+```
+
+
+
+### Optional help text
+
+Optionally, radio buttons and checkboxes can be given help text to provide additional per-choice context or information to help annotators.
+
+
+
+
+```json
+[
+  {
+    "name": "mylabel",
+    "type": "radio",
+    "optional": true, //Optional - Set if validation is not required
+    "orientation": "vertical", //Optional - default is "horizontal"
+    "options": [ // The options that the user is able to select from
+      {"value": "value1", "label": "Text to show user 1", "helptext": "Additional help text for option 1"},
+      {"value": "value2", "label": "Text to show user 2", "helptext": "Additional help text for option 2"},
+      {"value": "value3", "label": "Text to show user 3"}
+    ],
+    "title": "Title string", //Optional
+    "description": "Description string", //Optional
+    "valSuccess": "Success message when the field is validated", //Optional
+    "valError": "Error message when the field fails validation" //Optional
+  }
+]
+```
+
+
+
+### Alternative way to provide options for radio, checkbox and selector
+
+A dictionary (key-value pairs) can also be provided to the `options` field of the radio, checkbox and selector widgets,
+but note that the ordering of the options is **not guaranteed**, as JavaScript does not always preserve
+the order in which dictionary keys were added. Note that additional help texts for radio buttons and checkboxes are not supported using this syntax.
+
+
+
+```json
+[
+  {
+    "name": "mylabel",
+    "type": "radio",
+    "optional": true, //Optional - Set if validation is not required
+    "options": { // The options can be specified as a dictionary, ordering is not guaranteed
+      "value1": "Text to show user 1",
+      "value2": "Text to show user 2",
+      "value3": "Text to show user 3"
+    },
+    "title": "Title string", //Optional
+    "description": "Description string", //Optional
+    "valSuccess": "Success message when the field is validated", //Optional
+    "valError": "Error message when the field fails validation" //Optional
+  }
+]
+```
+
+
+
+### Dynamic options for radio, checkbox and selector
+
+All the examples above have a "static" list of available options for the radio, checkbox and selector widgets, where the complete options list is enumerated in the project configuration and every document offers the same set of options. However it is also possible to take some or all of the options from the _document_ data rather than the _configuration_ data. For example:
+
+
+
+**Project configuration**
+
+```json
+[
+  {
+    "name": "uri",
+    "type": "radio",
+    "title": "Select the most appropriate URI",
+    "options":[
+      {"fromDocument": "candidates"},
+      {"value": "none", "label": "None of the above"},
+      {"value": "unknown", "label": "Cannot be determined without more context"}
+    ]
+  }
+]
+```
+
+**Document**
+
+```json
+{
+  "text": "President Bush visited the air base yesterday...",
+  "candidates": [
+    {
+      "value": "http://dbpedia.org/resource/George_W._Bush",
+      "label": "George W. Bush (Jnr)"
+    },
+    {
+      "value": "http://dbpedia.org/resource/George_H._W._Bush",
+      "label": "George H. W. Bush (Snr)"
+    }
+  ]
+}
+```
+
+
+
+`"fromDocument"` is a dot-separated property path leading to the location within each document where the additional options can be found, for example `"fromDocument":"candidates"` looks for a top-level property named `candidates` in each document, `"fromDocument": "options.custom"` would look for a property named `options` which is itself an object with a property named `custom`. The target property in the document may be in any of the following forms:
+
+- an array _of objects_, each with `value` and `label` (and optionally `helptext`) properties, exactly as in the static configuration format - this is the format used in the example above
+- an array _of strings_, where the same string will be used as both the value and the label for that option
+- an arbitrary ["dictionary"](#options-as-dict) object mapping values to labels
+- a _single string_, which is parsed into a list of options
+
+The "single string" alternative is designed to be easier to use when [importing documents](documents_annotations_management.md#importing-documents) from CSV files. It allows you to provide any number of options in a _single_ CSV column value. Within the column the options are separated by semicolons, and each option is of the form `value=label`. Whitespace around the delimiters is ignored, both between options and between the value and label of a single option.
For example given CSV document data of + +| text | options | +|-----------------|---------------------------------------------------| +| Favourite fruit | `apple=Apples; orange = Oranges; kiwi=Kiwi fruit` | + +a `{"fromDocument": "options"}` configuration would produce the equivalent of + +```json +[ + {"value": "apple", "label": "Apples"}, + {"value": "orange", "label": "Oranges"}, + {"value": "kiwi", "label": "Kiwi fruit"} +] +``` + +If your values or labels may need to contain the default separator characters `;` or `=` you can select different separators by adding extra properties to the configuration: + +```json +{"fromDocument": "options", "separator": "~~", "valueLabelSeparator": "::"} +``` + +| text | options | +|-----------------|------------------------------------------------------| +| Favourite fruit | `apple::Apples ~~ orange::Oranges ~~ kiwi::Kiwi fruit` | + +The separators can be more than one character, and you can set `"valueLabelSeparator":""` to disable label splitting altogether and just use the value as its own label. + +### Mixing static and dynamic options + +Static and `fromDocument` options may be freely interspersed in any order, so you can have a fully-dynamic set of options by specifying _only_ a `fromDocument` entry with no static options, or you can have static options that are listed first followed by dynamic options, or dynamic options first followed by static, etc. + +### Conditional components + +By default all components listed in the project configuration will be shown for all documents. However this is not always appropriate, for example you may have some components that are only relevant to certain documents, or only relevant for particular combinations of values in _other_ components. To allow for these kinds of scenarios any component can have a field named `if` specifying the conditions under which that component should be shown. 
+ +The `if` field is an _expression_ that is able to refer to fields in both the current _document_ being annotated and the current state of the other annotation components. The expression language is largely based on a subset of the standard JavaScript expression syntax but with a few additional syntax elements to ease working with array data and regular expressions. + +The following simple example shows how you might implement an "Other (please specify)" pattern, where the user can select from a list of choices but also has the option to supply their own answer if none of the choices are appropriate. The free text field is only shown if the user selects the "other" choice. + + + +**Project configuration** + +```json +[ + { + "name": "uri", + "type": "radio", + "title": "Select the most appropriate URI", + "options":[ + {"fromDocument": "candidates"}, + {"value": "other", "label": "Other"} + ] + }, + { + "name": "otherValue", + "type": "text", + "title": "Please specify another value", + "if": "annotation.uri == 'other'", + "regex": "^(https?|urn):", + "valError": "Please specify a URI (starting http:, https: or urn:)" + } +] +``` + +**Document** + +```json +{ + "text": "President Bush visited the air base yesterday...", + "candidates": [ + { + "value": "http://dbpedia.org/resource/George_W._Bush", + "label": "George W. Bush (Jnr)" + }, + { + "value": "http://dbpedia.org/resource/George_H._W._Bush", + "label": "George H. W. Bush (Snr)" + } + ] +} +``` + + +Note that validation rules (such as `optional`, `minSelected` or `regex`) are not applied to components that are hidden by an `if` expression - hidden components will never be included in the annotation output, even if they would be considered "required" had they been visible. 
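The rule about hidden components can be sketched in JavaScript. This is a simplified illustration, not Teamware's implementation: each `if` expression is modelled as a plain predicate function named `visible`, a single hypothetical `required` flag stands in for the real validation rules, and the `collect` helper and sample values are invented for the example.

```javascript
// Hypothetical stand-ins: each component's `if` expression is modelled as a
// JavaScript predicate instead of Teamware's own expression language.
const components = [
  { name: "uri", required: true, visible: () => true },
  {
    name: "otherValue",
    required: true,
    visible: (annotation) => annotation.uri === "other",
  },
];

// Collect output and validation errors from visible components only.
function collect(componentList, annotation) {
  const output = {};
  const errors = [];
  for (const component of componentList) {
    if (!component.visible(annotation)) continue; // hidden: no validation, no output
    const value = annotation[component.name];
    if (component.required && (value == null || value === "")) {
      errors.push(component.name);
    } else if (value != null) {
      output[component.name] = value;
    }
  }
  return { output, errors };
}

// `otherValue` is hidden here, so its stale value is dropped and never validated.
const hidden = collect(components, { uri: "http://example.org/a", otherValue: "stale" });
// `otherValue` is visible here, so leaving it empty is reported as an error.
const shown = collect(components, { uri: "other" });
console.log(hidden, shown);
```

The point of the sketch is the early `continue`: a component whose condition is not met takes no part in validation and contributes nothing to the output.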
+ +Components can also be made conditional on properties of the _document_, or a combination of the document and the annotation values, for example + + + +**Project configuration** + +```json +[ + { + "name": "htmldisplay", + "type": "html", + "text": "{{{text}}}" + }, + { + "name": "sentiment", + "type": "radio", + "title": "Sentiment", + "description": "Please select a sentiment of the text above.", + "options": [ + {"value": "negative", "label": "Negative"}, + {"value": "neutral", "label": "Neutral"}, + {"value": "positive", "label": "Positive"} + ] + }, + { + "name": "reason", + "type": "text", + "title": "Why do you disagree with the suggested value?", + "if": "annotation.sentiment !== document.preanno.sentiment" + } +] +``` + +**Documents** + +```json +[ + { + "text": "I love the thing!", + "preanno": { "sentiment": "positive" } + }, + { + "text": "I hate the thing!", + "preanno": { "sentiment": "negative" } + }, + { + "text": "The thing is ok, I guess...", + "preanno": { "sentiment": "neutral" } + } +] +``` + + + +The full list of supported constructions is as follows: + +- the `annotation` variable refers to the current state of the annotation components for this document + - the current value of a particular component can be accessed as `annotation.componentName` or `annotation['component name']` - the brackets version will always work, the dot version works if the component's `name` is a valid JavaScript identifier + - if a component has not been set since the form was last cleared the value may be `null` or `undefined` - the expression should be written to cope with both + - the value of a `text`, `textarea`, `radio` or `selector` component will be a single string (or null/undefined), the value of a `checkbox` component will be an _array_ of strings since more than one value may be selected. 
If no value is selected the array may be null, undefined or empty, the expression must be prepared to handle any of these +- the `document` variable refers to the current document that is being annotated + - again properties of the document can be accessed as `document.propertyName` or `document['property name']` + - continue the same pattern for nested properties e.g. `document.scores.label1` + - individual elements of array properties can be accessed by zero-based index (e.g. `document.options[0]`) +- various comparison operators are available: + - `==` and `!=` (equal and not-equal) + - `<`, `<=`, `>=`, `>` (less-than, less-or-equal, greater-or-equal, greater-than) + - these operators follow JavaScript rules, which are not always intuitive. Generally if both arguments are strings then they will be compared by lexicographic order, but if either argument is a number then the other one will also be converted to a number before comparing. So if the `score` component is set to the value "10" (a string of two digits) then `annotation.score < 5` would be _false_ (10 is converted to number and compared to 5) but `annotation.score < '5'` would be _true_ (the string "10" sorts before the string "5") + - `in` checks for the presence of an item in an array or a key in an object + - e.g. `'other' in annotation.someCheckbox` checks if the `other` option has been ticked in a checkbox component (whose value is an array) + - this is different from normal JavaScript rules, where `i in myArray` checks for the presence of an array _index_ rather than an array _item_ +- other operators + - `+` (concatenate strings, or add numbers) + - if either argument is a string then both sides are converted to strings and concatenated together + - otherwise both sides are treated as numbers and added + - `-`, `*`, `/`, `%` (subtraction, multiplication, division and remainder) + - `&&`, `||` (boolean AND and OR) + - `!` (prefix boolean NOT, e.g. 
`!annotation.selected` is true if `selected` is false/null/undefined and false otherwise) + - conditional operator `expr ? valueIfTrue : valueIfFalse` (exactly as in JavaScript, first evaluates the test `expr`, then either the `valueIfTrue` or `valueIfFalse` depending on the outcome of the test) +- `value =~ /regex/` tests whether the given string value contains any matches for the given [regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions#writing_a_regular_expression_pattern) + - use `^` and/or `$` to anchor the match to the start and/or end of the value, for example `annotation.example =~ /^a/i` checks whether the `example` annotation value _starts with_ "a" or "A" (the `/i` flag makes the expression case-insensitive) + - since the project configuration is entered as JSON, any backslash characters within the regex must be doubled to escape them from the JSON parser, i.e. `"if": "annotation.option =~ /\\s/"` would check if `option` contains any space characters (for which the regular expression literal is `/\s/`) +- _Quantifier_ expressions let you check whether `any` or `all` of the items in an array or key/value pairs in an object match a predicate expression. The general form is `any(x in expr, predicate)` or `all(x in expr, predicate)`, where `expr` is an expression that resolves to an array or object value, `x` is a new identifier, and `predicate` is the expression to test each item against. The `predicate` expression can refer to the `x` identifier + - `any(option in annotation.someCheckbox, option > 3)` + - `all(e in document.scores, e.value < 0.7)` (assuming `scores` is an object mapping labels to scores, e.g. 
`{"scores": {"positive": 0.5, "negative": 0.3}}`) + - when testing a predicate against an _object_ each entry has `.key` and `.value` properties giving the key and value of the current entry + - on a null, undefined or empty array/object, `any` will return _false_ (since there are no items that pass the test) and `all` will return _true_ (since there are no items that _fail_ the test) + - the predicate is optional - `any(arrayExpression)` resolves to `true` if any item in the array has a value that JavaScript considers to be "truthy", i.e. anything other than the number 0, the empty string, null or undefined. So `any(annotation.myCheckbox)` is a convenient way to check whether _at least one_ option has been selected in a `checkbox` component. + +If the `if` expression for a particular component is _syntactically invalid_ (missing operands, mis-matched brackets, etc.) then the condition will be ignored and the component will always be displayed as though it did not have an `if` expression at all. Conversely, if the expression is valid but an error occurs while _evaluating_ it, this will be treated the same as if the expression returned `false`, and the associated component will not be displayed. The behaviour is this way around as the most common reason for errors during evaluation is attempting to refer to annotation components that have not yet been filled in - if this is not appropriate in your use case you must account for the possibility within your expression. 
For example, suppose `confidence` is a `radio` or `selector` component with values ranging from 1 to 5, then another component that declares
+
+```
+"if": "annotation.confidence && annotation.confidence < 4"
+```
+
+will hide this component if `confidence` is unset, displaying it only if `confidence` is set to a value less than 4, whereas
+
+```
+"if": "!annotation.confidence || annotation.confidence < 4"
+```
+
+will hide this component only if `confidence` is actually _set_ to a value of 4 or greater - it will _show_ this component if `confidence` is unset. Either approach may be correct depending on your project's requirements.
+
+To assist managers in authoring project configurations with `if` conditions, the "preview" mode on the project configuration page will display details of any errors that occur when parsing the expressions, or when evaluating them against the **Document input preview** data. You are encouraged to test your expressions thoroughly against a variety of inputs to ensure they behave as intended, before opening your project to annotators.
+
+
diff --git a/docs/versioned/2.2.0/manageradminguide/project_management.md b/docs/versioned/2.2.0/manageradminguide/project_management.md
new file mode 100644
index 00000000..f1fd0cfe
--- /dev/null
+++ b/docs/versioned/2.2.0/manageradminguide/project_management.md
@@ -0,0 +1,38 @@
+# Annotation Project Management
+
+## Project Listing
+Clicking on the `Projects` link in the top navigation bar takes you to a page that contains a list of existing
+projects. The project names are shown along with their summaries. Clicking on a project name will
+take you to the project management page.
+
+
+## Project Management Page
+
+The project management page contains all the functionality needed to manage an annotation project. The page
+is composed of three main tabs:
+
+* [Configuration](project_config.md) - Configure project settings including what annotations are captured.
+* [Documents & Annotation](documents_annotations_management.md) - Manage documents and annotations. Upload documents, see contents of a document's annotations and import/export documents. +* [Annotators](annotators_management.md) - Manage the recruitment of annotators. + +::: warning + +Annotators can only be recruited to an annotation project after it has been configured and documents +are uploaded to the project. + +::: + + +## Project status icons +In the **Project listing** and **Project management page**, icon badges are used to provide a quick overview of the project's status: + +* 1 - Number of completed annotations in the project. +* 1 - Number of rejected annotations in the project. +* 1 - Number of timed out annotations in the project. +* 1 - Number of aborted annotations in the project. +* 1 - Number of pending annotations in the project. +* 2/60 - Number of occupied annotation tasks over number of total tasks in the project. +* 20/5/10 - Number of documents, training documents and test documents in the project. +* 1 - Number of annotators recruited in the project. Annotators are removed from the project when they have completed all annotation tasks in their quota. + + diff --git a/package.json b/package.json index 590f2cf0..940daf3b 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gate-teamware", - "version": "2.1.1", + "version": "2.2.0", "description": "A service for collaborative document annotation.", "main": "index.js", "scripts": {