Release 2.2.0 #410

Merged 28 commits on May 8, 2024

Commits
a71d6ae
Upgrade a number of dependencies
ianroberts Jan 21, 2024
9c0491e
Vitest upgrade changes coverage plugin from "-c8" to "-v8"
ianroberts Jan 21, 2024
412d1ed
See if adding a couple of seconds delay gives the doc format preferen…
ianroberts Jan 21, 2024
45b26e2
Put the waits either side of clicking "CSV"
ianroberts Jan 21, 2024
0e64b88
Merge pull request #397 from GateNLP/dependency-upgrades
ianroberts Jan 21, 2024
641eeaf
Include rejected/aborted/timed out annotations in export
ianroberts Feb 23, 2024
e915ee5
Copy the self.data dict when generating exports
ianroberts Feb 23, 2024
41fd491
Documentation for the teamware_status information
ianroberts Feb 23, 2024
834af01
Merge pull request #399 from GateNLP/export-rejected
ianroberts Feb 26, 2024
0cfed45
Resolves #345 fixed username anonymization
twinkarma May 25, 2023
d91a554
Use the same ANONYMIZATION_PREFIX in teamware_status section
ianroberts Feb 26, 2024
f650d87
#346 Prevent double nesting of features field
twinkarma May 25, 2023
b94e22d
#348 merge existing and new annotation fields
twinkarma May 26, 2023
d34021d
Fixed implementation of export tests
twinkarma May 26, 2023
e289227
Updated docs on how the documents and annotation are now exported
twinkarma May 26, 2023
6993266
Merge pull request #377 from GateNLP/various-export-issues
ianroberts Feb 26, 2024
53b9a56
Allow an explicit "none" for no email security rather than just relyi…
ianroberts Mar 11, 2024
ce57783
Make project search case insensitive
freddyheppell Mar 13, 2024
b67e615
Merge pull request #406 from GateNLP/case-insensitive-project-search
ianroberts Mar 13, 2024
24b025b
Make "search by username or email" on user admin page case insensitive
ianroberts Mar 13, 2024
4585c71
Merge pull request #407 from GateNLP/user-search-case-insensitive
ianroberts Mar 13, 2024
fdafb21
Merge pull request #402 from GateNLP/smtp-no-tls
ianroberts May 8, 2024
5d8dcb2
Version number update to 2.2.0
ianroberts Feb 26, 2024
d073786
CHANGELOG section for v2.2.0
ianroberts Feb 26, 2024
1546530
Versioned docs for 2.1.1
ianroberts Feb 26, 2024
96f2886
Documentation for the new email security setting
ianroberts May 8, 2024
64d7e12
Versioned documentation for 2.2.0
ianroberts May 8, 2024
fe1790d
Merge pull request #409 from GateNLP/release-2.2.0
ianroberts May 8, 2024
37 changes: 37 additions & 0 deletions CHANGELOG.md
@@ -27,6 +27,43 @@ docker compose run --rm -it pgbackups /backup.sh

(or `docker-compose` if your version of Docker does not support compose v2).

## [2.2.0] 2024-05-08

### Changed
- **Breaking change**: When exporting annotations as JSON, the "features" that the annotator entered are no longer nested under `label` ([#347](https://github.com/GateNLP/gate-teamware/issues/347)). Where previously the export would have been
```json
{
  "features": {
    "label": {
      "field1": "value1"
    }
  }
}
```

it is now
```json
{
  "features": {
    "field1": "value1"
  }
}
```
- Include details of failed annotations in export formats ([#399](https://github.com/GateNLP/gate-teamware/pull/399))
  - When exporting annotation data from projects (both via the web UI and using the command line tool),
    each document includes details of which users _rejected_, _timed out_ or _aborted_ annotation of
    that document, as well as the annotation data from the users who completed the document successfully.
    This can be useful for the project manager to identify documents that are particularly difficult
    to annotate, perhaps suggesting that the annotation guidelines need to be extended or clarified.
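
Scripts that consume the JSON export and were written against 2.1.x may need adjusting for the un-nested `features` layout described in the breaking change above. The following is a minimal sketch of one way to convert an old export to the new shape, assuming the GATE-style `annotation_sets` layout. The function name `flatten_label_features` and the sample document are illustrative and not part of Teamware.

```python
import copy


def flatten_label_features(doc):
    """Return a copy of an exported document with the pre-2.2.0 "label" nesting removed.

    Hypothetical helper for illustration only; it is not part of Teamware.
    """
    migrated = copy.deepcopy(doc)
    for annotation_set in migrated.get("annotation_sets", {}).values():
        for annotation in annotation_set.get("annotations", []):
            features = annotation.get("features", {})
            # Old exports wrapped the annotator's answers in a single "label" key.
            if set(features) == {"label"} and isinstance(features["label"], dict):
                annotation["features"] = features["label"]
    return migrated


# Sample document in the old (pre-2.2.0) layout; the user name is made up.
old_doc = {
    "annotation_sets": {
        "alice": {
            "name": "alice",
            "annotations": [
                {
                    "type": "Document",
                    "start": 0,
                    "end": 0,
                    "id": 0,
                    "features": {"label": {"field1": "value1"}},
                }
            ],
            "next_annid": 1,
        }
    }
}

new_doc = flatten_label_features(old_doc)
```

One caveat with this heuristic: a 2.2.0 export from a project whose only field happens to be called `label` and holds a dictionary value would also be flattened, so it is safest to run it only on files known to come from an older version.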

### Fixed
- Upgraded a number of third-party dependencies to close various vulnerabilities ([#397](https://github.com/GateNLP/gate-teamware/pull/397))
- Fixed several issues relating to the export of annotated data ([#377](https://github.com/GateNLP/gate-teamware/pull/377))
  - "Anonymous" export was not properly anonymous ([#345](https://github.com/GateNLP/gate-teamware/issues/345))
  - Teamware now does a better job of preserving the GATE BDOC JSON structure when exporting documents that were originally uploaded in that format ([#346](https://github.com/GateNLP/gate-teamware/issues/346), [#348](https://github.com/GateNLP/gate-teamware/issues/348))
- Added an explicit setting for "no email security", as an alternative to the implicit setting when the relevant environment variable is omitted. This is because the implicit setting was lost on upgrades, whereas an explicit "none" will be preserved ([#402](https://github.com/GateNLP/gate-teamware/pull/402))
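
The BDOC-preservation fix listed above can be illustrated outside Django. This is a simplified sketch based on the `backend/models.py` change in this pull request: an existing `features` dictionary is kept, other non-GATE top-level fields are folded into it, and the original `offset_type` is retained. The function name `to_gate_doc`, the `id_field` parameter and the sample data are assumptions for illustration. The real code resolves the document ID through a key path and handles `annotation_sets` separately.

```python
def to_gate_doc(data, id_field="id"):
    """Build a GATE-style document dict from uploaded document data.

    Illustrative sketch only; `id_field` stands in for the project's
    configured document ID field and is looked up directly here.
    """
    # Start from an existing "features" dict if the upload already had one.
    features = dict(data["features"]) if isinstance(data.get("features"), dict) else {}

    # Fold any other top-level fields into "features" rather than nesting
    # the existing "features" dict inside itself.
    reserved = {"text", "features", "offset_type", "annotation_sets", id_field}
    features.update({key: value for key, value in data.items() if key not in reserved})

    return {
        "text": data["text"],
        "features": features,
        # Keep the uploaded offset type, falling back to "p" when absent.
        "offset_type": data.get("offset_type", "p"),
        "name": data[id_field],
    }


uploaded = {
    "id": "doc1",
    "text": "hello",
    "features": {"source": "web"},
    "offset_type": "j",
    "lang": "en",
}
gate_doc = to_gate_doc(uploaded)
```

In this sketch `lang` ends up alongside `source` inside `features`, and `offset_type` stays `"j"` instead of being overwritten with `"p"`.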


## [2.1.1] 2023-10-02

### Added
26 changes: 14 additions & 12 deletions CITATION.cff
@@ -1,6 +1,6 @@
abstract: A web application for collaborative document annotation. GATE teamware provides
a flexible web app platform for managing classification of documents by human annotators.
authors:
authors:
- affiliation: The University of Sheffield
email: [email protected]
family-names: Karmakharm
@@ -33,13 +33,7 @@ keywords:
- document annotation
license: AGPL-3.0
message: If you use this software, please cite it using the metadata from this file.
repository-code: https://github.com/GateNLP/gate-teamware
title: GATE Teamware
type: software
url: https://gatenlp.github.io/gate-teamware/
version: 2.1.1
preferred-citation:
type: conference-paper
authors:
- affiliation: The University of Sheffield
email: [email protected]
@@ -66,14 +60,22 @@ preferred-citation:
family-names: Bontcheva
given-names: Kalina
orcid: https://orcid.org/0000-0001-6152-9600
collection-title: 'Proceedings of the 17th Conference of the European Chapter of
the Association for Computational Linguistics: System Demonstrations'
doi: 10.18653/v1/2023.eacl-demo.17
title: "GATE Teamware 2: An open-source tool for collaborative document classification annotation"
collection-title: "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations"
end: 151
location:
name: Dubrovnik, Croatia
year: 2023
month: 5
start: 145
end: 151
publisher:
name: Association for Computational Linguistics
start: 145
title: 'GATE Teamware 2: An open-source tool for collaborative document classification
annotation'
type: conference-paper
year: 2023
repository-code: https://github.com/GateNLP/gate-teamware
title: GATE Teamware
type: software
url: https://gatenlp.github.io/gate-teamware/
version: 2.2.0
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
2.1.1
2.2.0
51 changes: 38 additions & 13 deletions backend/models.py
@@ -978,25 +978,28 @@ def get_doc_annotation_dict(self, json_format="raw", anonymize=True):
# Create dictionary for document
doc_dict = None
if json_format == "raw" or json_format == "csv":
doc_dict = self.data
doc_dict = self.data.copy()
elif json_format == "gate":
# GATE json format are expected to have an existing "features" field
features_dict = dict(self.data["features"]) if "features" in self.data and isinstance(self.data["features"], dict) else {}

ignore_keys = {"text", self.project.document_id_field}
features_dict = {key: value for key, value in self.data.items() if key not in ignore_keys}
# Add any non-compliant top-level fields into the "features" field instead
ignore_keys = {"text", "features", "offset_type", "annotation_sets", self.project.document_id_field}
features_dict.update({key: value for key, value in self.data.items() if key not in ignore_keys})

doc_dict = {
"text": self.data["text"],
"features": features_dict,
"offset_type": "p",
"offset_type": self.data["offset_type"] if "offset_type" in self.data else "p", # Use original offset type
"name": get_value_from_key_path(self.data, self.project.document_id_field)
}
pass

# Insert annotation sets into the doc dict
annotations = self.annotations.filter(status=Annotation.COMPLETED)
if json_format == "csv":
# Gets pre-existing annotations
annotation_sets = dict(self.data["annotations"]) if "annotations" in self.data else {}
# Format annotations for CSV export
annotation_sets = {}
for annotation in annotations:
a_data = annotation.data
annotation_dict = {}
@@ -1009,36 +1012,58 @@ def get_doc_annotation_dict(self, json_format="raw", anonymize=True):
annotation_dict["duration_seconds"] = annotation.time_to_complete

if anonymize:
annotation_sets[str(annotation.user.id)] = annotation_dict
annotation_sets[f"{settings.ANONYMIZATION_PREFIX}{annotation.user.id}"] = annotation_dict
else:
annotation_sets[annotation.user.username] = annotation_dict

doc_dict["annotations"] = annotation_sets

else:
# Gets pre-existing annotations
annotation_sets = dict(self.data["annotation_sets"]) if "annotation_sets" in self.data else {}
# Format for JSON in line with GATE formatting
annotation_sets = {}
for annotation in annotations:
a_data = annotation.data
anonymized_name = f"{settings.ANONYMIZATION_PREFIX}{annotation.user.id}"
annotation_set = {
"name": annotation.user.id if anonymize else annotation.user.username,
"name": anonymized_name if anonymize else annotation.user.username,
"annotations": [
{
"type": "Document",
"start": 0,
"end": 0,
"id": 0,
"duration_seconds": annotation.time_to_complete,
"features": {
"label": a_data
}
"features": a_data
}
],
"next_annid": 1,
}
annotation_sets[annotation.user.username] = annotation_set
annotation_sets[anonymized_name if anonymize else annotation.user.username] = annotation_set

doc_dict["annotation_sets"] = annotation_sets

# Add to the export the lists (possibly empty) of users who rejected,
# timed out or aborted annotation of this document
teamware_status = {}
for key, status in [
("rejected_by", Annotation.REJECTED),
("timed_out", Annotation.TIMED_OUT),
("aborted", Annotation.ABORTED),
]:
teamware_status[key] = [
f"{settings.ANONYMIZATION_PREFIX}{annotation.user.id}" if anonymize else annotation.user.username
for annotation in self.annotations.filter(status=status)
]
if json_format == "csv":
# Flatten list if exporting as CSV
teamware_status[key] = ",".join(str(val) for val in teamware_status[key])

if json_format == "gate":
doc_dict["features"]["teamware_status"] = teamware_status
else:
doc_dict["teamware_status"] = teamware_status

return doc_dict
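
For readers who want to try the `teamware_status` logic without a Django environment, here is a standalone sketch of the same idea. It assumes plain records in place of `Annotation` model instances. The `AnnotationRecord` class, the status strings and the `"anon_"` prefix value are hypothetical stand-ins for the model, its status constants and `settings.ANONYMIZATION_PREFIX`.

```python
from dataclasses import dataclass

# Hypothetical value; the real prefix comes from settings.ANONYMIZATION_PREFIX.
ANONYMIZATION_PREFIX = "anon_"


@dataclass
class AnnotationRecord:
    """Stand-in for an Annotation row: who worked on the document and the outcome."""
    user_id: int
    username: str
    status: str


def build_teamware_status(records, anonymize=True, as_csv=False):
    """Group users by non-completed outcome, mirroring the export logic above."""
    status_keys = [
        ("rejected_by", "REJECTED"),
        ("timed_out", "TIMED_OUT"),
        ("aborted", "ABORTED"),
    ]
    result = {}
    for key, status in status_keys:
        names = [
            f"{ANONYMIZATION_PREFIX}{record.user_id}" if anonymize else record.username
            for record in records
            if record.status == status
        ]
        # CSV export flattens each list into a comma-separated string.
        result[key] = ",".join(names) if as_csv else names
    return result


records = [
    AnnotationRecord(1, "alice", "REJECTED"),
    AnnotationRecord(2, "bob", "TIMED_OUT"),
    AnnotationRecord(3, "carol", "REJECTED"),
    AnnotationRecord(4, "dan", "COMPLETED"),
]
json_status = build_teamware_status(records)
csv_status = build_teamware_status(records, anonymize=False, as_csv=True)
```

Every key is always present, so a document nobody aborted gets an empty list (or an empty string in CSV) rather than a missing field, which keeps the export columns stable.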


2 changes: 1 addition & 1 deletion backend/rpc.py
@@ -510,7 +510,7 @@ def get_projects(request, current_page=1, page_size=None, filters=None):
# Perform filtering
if isinstance(filters, str):
# Search project title if is filter is a string only
projects_query = Project.objects.filter(name__contains=filters.strip())
projects_query = Project.objects.filter(name__icontains=filters.strip())
total_count = projects_query.count()
else:
projects_query = Project.objects.all()
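
The switch from Django's `contains` lookup to `icontains` makes the project title search ignore letter case. The effect can be approximated in plain Python as below. This is only an analogy: in Teamware the comparison is performed by the database, and the helper names and project titles here are made up for illustration.

```python
def matches_contains(name, query):
    """Case-sensitive substring match, analogous to a `contains` lookup."""
    return query in name


def matches_icontains(name, query):
    """Case-insensitive substring match, analogous to an `icontains` lookup."""
    return query.casefold() in name.casefold()


# Hypothetical project titles.
project_names = ["Sentiment Study", "NER pilot", "sentiment-v2"]
query = " sentiment ".strip()  # the RPC method strips surrounding whitespace too

case_sensitive_hits = [n for n in project_names if matches_contains(n, query)]
case_insensitive_hits = [n for n in project_names if matches_icontains(n, query)]
```

With the case-sensitive match only `sentiment-v2` is found, whereas the case-insensitive match also returns `Sentiment Study`.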