Commit
* WIP layout updates * recent changes * copy * small doc updates * copy change * Release 0.7.0 * Fix encoding to utf8 in label config and completions. (#300) * Pre release 0.7.0 fixes (#302) * fix template, add exception treatment * update converter * fix error handling on export Co-authored-by: nik <[email protected]> * Feature/cloudstorage (#299) * fix context * fix multi session context * source/target storages, s3 support * fix predictions * uri resolver & simple background jobs * fix None prefix * set daemon thread * add storages forms * add gcs support * register gcs storage * Tasks pagination (#298) * Task pagination. * Some. * Fixes. * Fix. * Fixes. * update requirements * fix base/s3 storages * fix s3 * fix FormMeta * signed urls for gs * hide manage buttons on GCS * Fixes. * Storage settings endpoint. * api storage settings update * More. * Fixes * fix typo * get available storage, move api settings * fix can manage completions * Some. * Some. * Fix. * Fix. * filesystem bucket, fix sync * fix exception' * Some. * Errors in UI. * changing the URL * get/post api for forms * Fixes. * handle target storage * form field autobound * Some. * Some. * build form with request.json * fix redundant imports * Some. * Some. * Fixes. * fix blobs, purify code * prepend storage class name * comment error handling on get * Some. * Some. * Before cache. * Stable. * load next task * fix update after regex changed * fix keys * tasks json & completions storages * separate source/target storage names * make dict names * validate cloud storage connections * Fixes. * extend filters * validate connection on sync * fix target filters * Common error print. * fix path for BaseForm * validate storage before create * Some. * Project dict to tasks template. * Some. * Some. * dont remove unexisted completion * fix can manage tasks permissions * Some. * can delete tasks flag * draft doc * doc params fix * add error handler for get_value within thread * Some. 
* add logging, can delete tasks, don't return dict on error * Some. * deepcopy tasks on saving completion * Some. * Some. * Some. * Some. * Fix. * Timer to 5 sec. * Fix. * includes fix. * polyfill. * polyfill.js * ie * small changes to the layout and copy * fix invalid completions, reduce forms, add blob url as parameter * add local copy option target storage,fix completion list * Some. * no local copy for source storage * Some. * some updates for words * before remove select in storages. * fix init old projects * fix old config load on start * Some. * dont output creation times when created_at is absent * Some. * Some. * Some. * toast in errors. * fix docs commands * Fixes. * fix perms * Pretty. * Version check. Co-authored-by: nik <[email protected]> Co-authored-by: Max <[email protected]> Co-authored-by: Mikhail Maluyk <[email protected]> * Encoding fixes. * New LS build. * Fixes with welcome page. * 0.7.0rc1 * images, blog, copy * doc updates * Update README.md * fix create local copy,enhance cmd docstrings,skip syncing on empty regex * v0.7.0rc2 * some UI & doc fixes * add storage.is_syncing flag * Fixes. * Error treatment on all pages. * Paths in storages. * copy_local checkbox hide. * supported formats fix * Storage sync in progress. * Default data_key in BaseStorage. * is_syncing for BaseStorage. * use_blob_urls=True as default. * completed_at to undefined if not defined. * fix force option while running docker * v0.7.0 Co-authored-by: Mikhail Maluyk <[email protected]> Co-authored-by: niklub <[email protected]> Co-authored-by: nik <[email protected]>
1 parent
64b6cc8
commit aa417df
Showing 63 changed files with 3,683 additions and 841 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
--- | ||
title: Label Studio Release Notes 0.7.0 - Cloud Storage Enablement | ||
type: blog | ||
order: 100 | ||
--- | ||
|
||
Just a couple of weeks after our 0.6.0 release, we’re happy to announce another big release. We started discussing cloud support months ago, and as the first step in simplifying the integration, we’re introducing cloud storage connectors, such as AWS S3. | ||
|
||
We’d also like to learn more from you about your ML pipelines. If you’re interested in having a conversation, please ping us on [Slack](https://join.slack.com/t/label-studio/shared_invite/zt-cr8b7ygm-6L45z7biEBw4HXa5A2b5pw). | ||
|
||
<br/> | ||
<img src="/images/release-070/s3-mascot-04.png" /> | ||
|
||
## Connecting cloud storage | ||
|
||
You can configure Label Studio to synchronize labeling tasks with your S3 or GCS bucket, optionally filtering by a specific prefix or file extension. Label Studio takes the resulting list of objects and generates pre-signed URLs each time a task is shown to the annotator. | ||
|
||
<br/> | ||
<img src="/images/release-070/configure-s3.gif" class="gif-border" /> | ||
|
||
Label Studio can load a file in two ways, either as a URL or as a blob. You can therefore store either the list of tasks or the assets themselves in the bucket and load them from there. | ||
|
||
<br/> | ||
<img src="/images/release-070/s3-config.png" class="gif-border" /> | ||
|
||
You can also configure it to store the results back to S3 or GCS, making Label Studio a part of your data processing pipeline. Read more about the configuration in the [docs](/guide/storage.html). | ||
|
||
## Frontend package updates | ||
|
||
Finally, thanks to a lot of [work](https://github.com/heartexlabs/label-studio-frontend/pull/75) from [Andrew](https://github.com/hlomzik), the frontend now has automated tests. This helps make sure that we don’t break things when we introduce new features. Another important part is the improved build and publishing process with CI configured. The npm frontend package will now be published along with the pip package. | ||
|
||
## Labeling Paragraphs and Dialogues | ||
|
||
Introducing a new object tag called “Paragraphs”. A paragraph is a piece of text with optional metadata such as the author and the timestamp. With this tag, we’re also experimenting with the idea of providing predefined layouts. For example, to label a dialogue you can use the following config: `<Paragraphs name="conversation" value="$conv" layout="dialogue" />` | ||
|
||
<br/> | ||
<img src="/images/release-070/dialogues.png" class="gif-border" /> | ||
|
||
This feature is available in the [enterprise version](https://heartex.ai/) only. | ||
|
||
## Different shapes on the same image | ||
|
||
One limitation Label Studio had was that you could use only one shape type on the same image: for example, either bounding boxes or polygons. This limitation has now been lifted, and you can define different label groups and connect them to the same image. | ||
|
||
<br/> | ||
<img src="/images/release-070/multiple-tools.gif" class="gif-border" /> | ||
|
||
## maxUsages | ||
|
||
There are a couple of ways to make sure that an annotation is performed in full. One of them is the `required` flag, and we’ve added a new one called `maxUsages`. For some datasets you know how many objects of a particular type there are, so you can limit how many times a specific label is used. | ||
|
||
## Bugfixes and Enhancements | ||
- Allow different types of shapes to be used in the same image. For example you can label the same image using both rectangles and ellipses. | ||
- Fix double text deserialization https://github.com/heartexlabs/label-studio-frontend/pull/85 | ||
- Fix bug with groups of required choices https://github.com/heartexlabs/label-studio-frontend/pull/74 | ||
- Several fixes for NER labeling — empty captured text, double clicks, labels appearance |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
--- | ||
title: Cloud storages | ||
type: guide | ||
order: 101 | ||
--- | ||
|
||
You can integrate popular cloud storage services with Label Studio, collect new tasks uploaded to your buckets, and sync annotation results back to use them in your machine learning pipelines. | ||
|
||
The cloud storage type and bucket need to be configured when the server starts, and can be further configured at runtime via the UI. | ||
|
||
You can configure one or both: | ||
|
||
- _source storage_ (where tasks are stored) | ||
- _target storage_ (where completions are stored) | ||
|
||
The connection to both storages is synced, so you can see new tasks after uploading them to the bucket without restarting Label Studio. | ||
|
||
Parameters such as the prefix or the filename-matching regex can be changed at any time from the web app interface. | ||
|
||
## Amazon S3 | ||
|
||
To connect your [S3](https://aws.amazon.com/s3) bucket with Label Studio, be sure you have programmatic access enabled. [Check this link](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#configuration) to learn more about how to set up access to your S3 bucket. | ||
|
||
### Create connection on startup | ||
|
||
The following commands launch Label Studio, configure the connection to your S3 bucket, scan for existing tasks, and load them into the labeling app. | ||
|
||
#### Read bucket with JSON-formatted tasks | ||
|
||
```bash | ||
label-studio start --init --source s3 --source-path my-s3-bucket | ||
``` | ||
|
||
|
||
#### Write completions to bucket | ||
|
||
```bash | ||
label-studio start --init --target s3-completions --target-path my-s3-bucket | ||
``` | ||
|
||
### Working with Binary Large OBjects (BLOBs) | ||
|
||
When you store BLOBs (like images or audio files) in your S3 bucket, you might want to use them as is by generating URLs that point to those objects (e.g. `s3://my-s3-bucket/image.jpg`). | ||
Label Studio can generate input tasks with the corresponding URLs automatically on the fly. You can do this by specifying `--source-params` when launching the app: | ||
|
||
```bash | ||
label-studio start --init --source s3 --source-path my-s3-bucket --source-params "{\"data_key\": \"my-object-tag-\$value\", \"use_blob_urls\": true}" | ||
``` | ||
|
||
You can leave `"data_key"` empty (or omit it altogether), in which case Label Studio generates it automatically from the first task key in the label config (this is useful when only one object tag is exposed). | ||
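Hand-escaping the JSON passed to `--source-params` is easy to get wrong. The sketch below is one possible way to assemble the command from Python instead. It only reuses the flags shown above; the helper name `build_start_command` and the data key `"image"` are illustrative assumptions, not part of Label Studio.

```python
import json
import shlex


def build_start_command(source, bucket, params):
    """Return a shell-safe `label-studio start` command line as a string.

    Hypothetical helper: it only assembles the flags documented above.
    """
    args = [
        "label-studio", "start", "--init",
        "--source", source,
        "--source-path", bucket,
        # json.dumps produces valid JSON (double quotes, lowercase true/false).
        "--source-params", json.dumps(params),
    ]
    # shlex.quote wraps any argument containing special characters in single
    # quotes, so the shell does not expand `$` or strip the double quotes.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_start_command(
    "s3",
    "my-s3-bucket",
    {"data_key": "image", "use_blob_urls": True},  # "image" is an assumed key
)
print(command)
```

The printed line can be pasted into a POSIX shell; the JSON argument arrives at the program unchanged.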
|
||
|
||
### Optional parameters | ||
|
||
You can specify additional parameters as an escaped JSON string on the command line via `--source-params` / `--target-params`, or from the UI. | ||
|
||
#### prefix | ||
|
||
Bucket prefix (typically used to specify internal folder/container) | ||
|
||
#### regex | ||
|
||
A regular expression for filtering bucket objects | ||
|
||
#### create_local_copy | ||
|
||
If set to true, a local copy of the remote storage is created. | ||
|
||
#### use_blob_urls | ||
|
||
Generate task data with URLs pointing to your bucket objects (for resources like jpg, mp3, etc.). If not selected, bucket objects are interpreted as tasks in Label Studio JSON format, one object per task. | ||
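To make the interplay of `prefix`, `regex`, and `use_blob_urls` more concrete, here is a minimal sketch of how a list of object keys could be narrowed down and turned into URL-based tasks. It is not Label Studio's implementation: the function `build_blob_tasks`, the simplified task shape, and the choice to match the regex from the start of the full key are all assumptions made for illustration.

```python
import re


def build_blob_tasks(keys, bucket, data_key, prefix="", regex=None, scheme="s3"):
    """Filter object keys by prefix and regex, then emit one task per blob.

    Hypothetical helper illustrating the parameters described above.
    """
    pattern = re.compile(regex) if regex else None
    tasks = []
    for key in keys:
        # prefix: keep only objects inside the given "folder".
        if prefix and not key.startswith(prefix):
            continue
        # regex: assumed here to be matched from the start of the full key.
        if pattern and not pattern.match(key):
            continue
        # use_blob_urls: the task data holds a URL to the object, not its content.
        tasks.append({data_key: f"{scheme}://{bucket}/{key}"})
    return tasks


keys = ["images/cat.jpg", "images/notes.txt", "audio/clip.mp3"]
tasks = build_blob_tasks(
    keys, "my-s3-bucket", "image", prefix="images/", regex=r".*\.jpg$"
)
print(tasks)
```

With these inputs, only `images/cat.jpg` passes both filters, so a single task pointing at that object is produced.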
|
||
|
||
## Google Cloud Storage | ||
|
||
To connect your [GCS](https://cloud.google.com/storage) bucket with Label Studio, be sure you have enabled programmatic access. [Check this link](https://cloud.google.com/storage/docs/reference/libraries) to learn more about how to set up access to your GCS bucket. | ||
|
||
|
||
### Create connection on startup | ||
|
||
The following commands launch Label Studio, configure the connection to your GCS bucket, scan for existing tasks, and load them into the labeling app. | ||
|
||
#### Read bucket with JSON-formatted tasks | ||
|
||
```bash | ||
label-studio start --init --source gcs --source-path my-gcs-bucket | ||
``` | ||
|
||
#### Write completions to bucket | ||
|
||
```bash | ||
label-studio start --init --target gcs-completions --target-path my-gcs-bucket | ||
``` | ||
|
||
### Working with Binary Large OBjects (BLOBs) | ||
|
||
When you store BLOBs (like images or audio files) in your GCS bucket, you might want to use them as is by generating URLs that point to those objects (e.g. `gs://my-gcs-bucket/image.jpg`). | ||
Label Studio can generate input tasks with the corresponding URLs automatically on the fly. You can do this by specifying `--source-params` when launching the app: | ||
|
||
```bash | ||
label-studio start --init --source gcs --source-path my-gcs-bucket --source-params "{\"data_key\": \"my-object-tag-\$value\", \"use_blob_urls\": true}" | ||
``` | ||
|
||
You can leave `"data_key"` empty (or omit it altogether), in which case Label Studio generates it automatically from the first task key in the label config (this is useful when only one object tag is exposed). | ||
|
||
|
||
### Optional parameters | ||
|
||
You can specify additional parameters as an escaped JSON string on the command line via `--source-params` / `--target-params`, or from the UI. | ||
|
||
#### prefix | ||
|
||
Bucket prefix (typically used to specify internal folder/container) | ||
|
||
#### regex | ||
|
||
A regular expression for filtering bucket objects | ||
|
||
#### create_local_copy | ||
|
||
If set to true, a local copy of the remote storage is created. | ||
|
||
#### use_blob_urls | ||
|
||
Generate task data with URLs pointing to your bucket objects (for resources like jpg, mp3, etc.). If not selected, bucket objects are interpreted as tasks in Label Studio JSON format, one object per task. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,16 @@ | ||
<div id="header"> | ||
<div class="header"> | ||
<a id="logo" href="<%- url_for("/") %>"> | ||
<!-- <img src="<%- url_for("/images/opossum/heartex_icon_opossum_green.svg") %>" alt="label studio logo" --> | ||
<!-- height="180"/> --> | ||
|
||
<img src="<%- url_for("/images/ls_logo.png") %>" alt="label studio logo" /> | ||
<span style="font-size: 1.2em;">Label Studio</span> | ||
</a> | ||
<ul id="nav" style="display: flex; align-items: center"> | ||
<%- partial('partials/main_menu', { context: 'nav' }) %> | ||
</ul> | ||
</div> | ||
</div> | ||
|
||
|