Commit

Release 0.6.0 post fixes (#290)
* some fixes on ml

* modify dockerfile

* adding Procfile

* Fix heroku deployment

* fixes

* fix docker compose

* change default port/host

* rollback readme

* add web page classification example

* add bin bash to fix gcloud deploy

* add cite

* remove task button

* disable serving local files by default

* add prediction score sampling

* some docs refinements

* fix anaconda compat & add docs

* fixed conll export

* audio overlay fix

* fix ui task deletion

* update js/css scripts

* small readme fixes

* rc0

* update conda installation readme

* change setup

Co-authored-by: nik <[email protected]>
niklub and nik authored May 18, 2020
1 parent 9661db8 commit 64b6cc8
Showing 32 changed files with 346 additions and 125 deletions.
8 changes: 5 additions & 3 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -8,10 +8,12 @@ COPY requirements.txt /label-studio
RUN pip install -r requirements.txt

ENV PORT="8080"
ENV collect_analytics=0
ENV PROJECT_NAME=my_project

EXPOSE ${PORT}

COPY . /label-studio

RUN pip install -e .
CMD ["label-studio", "start", "my_project", "--init", "--no-browser", "--port", "8080"]
RUN python setup.py develop

CMD ["./tools/run.sh"]
3 changes: 2 additions & 1 deletion MANIFEST.in
@@ -3,4 +3,5 @@ recursive-include label_studio/static *
include label_studio/templates/*.html
include label_studio/utils/schema/*.json
include label_studio/logger.json
include label_studio/config.json
include label_studio/config.json
include label_studio/ml/default_configs/*
66 changes: 38 additions & 28 deletions README.md
@@ -61,6 +61,20 @@ pip install lxml‑4.5.0‑cp38‑cp38‑win_amd64.whl
pip install label-studio
```

#### Install from Anaconda

```bash
conda create --name label-studio python=3.8
conda activate label-studio
pip install label-studio
```

If you see any errors during installation, try rerunning the installation with the `--ignore-installed` flag:

```bash
pip install --ignore-installed label-studio
```

#### Local development
To run the latest Label Studio version locally without installing the package from pip:
```bash
@@ -75,7 +89,7 @@ python label-studio/server.py start labeling_project --init
## Run docker
You can also start serving at `http://localhost:8080` by using docker:
```bash
docker run --rm -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init --host 0.0.0.0
docker run --rm -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init
```

By default, it starts blank project in `./my_project` directory.
@@ -85,7 +99,7 @@ By default, it starts blank project in `./my_project` directory.
You can override the default startup command by appending:

```bash
docker run -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init --force --template image_mixedlabel --host 0.0.0.0
docker run -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init --force --template text_classification
```

If you want to build a local image, run:
@@ -161,37 +175,17 @@ The list of supported use cases for data annotation. Please contribute your own

## Machine Learning Integration

You can easily connect your favorite machine learning framework with Label Studio by using [Heartex SDK](https://github.com/heartexlabs/pyheartex).
You can easily connect your favorite machine learning framework to Label Studio with the Label Studio Machine Learning SDK. It takes two simple steps:
1. Start your own ML backend server ([see the detailed instructions](label_studio/ml/README.md)).
2. Connect Label Studio to the running ML backend on the [/model](http://localhost:8080/model.html) page.

That gives you the opportunities to use:
- **Pre-labeling**: Use model predictions for pre-labeling
- **Pre-labeling**: Use model predictions for pre-labeling (e.g. use on-the-fly model predictions to create rough image segmentations for further manual refinement)
- **Autolabeling**: Create automatic annotations
- **Online Learning**: Simultaneously update (retrain) your model as new annotations come in
- **Active Learning**: Perform labeling in active learning mode
- **Active Learning**: Perform labeling in active learning mode, selecting only the most complex examples
- **Prediction Service**: Instantly create a running, production-ready prediction service

There is a quick example tutorial on how to do that with simple image classification:

1. Clone pyheartex, and start serving example image classifier ML backend at `http://localhost:9090`
```bash
git clone https://github.com/heartexlabs/pyheartex.git
cd pyheartex/examples/docker
docker-compose up -d
```

2. Run Label Studio project specifying ML backend URLs:

```bash
label-studio start imgcls --init --template image_classification \
--ml-backend-url http://localhost:9090 --ml-backend-name my_model
```

Once you're satisfied with pre-labeling results, you can immediately send prediction requests via REST API:
```bash
curl -X POST -H 'Content-Type: application/json' -d '{"image_url": "https://go.heartex.net/static/samples/sample.jpg"}' http://localhost:8080/predict
```
Feel free to play around any other models & frameworks apart from image classifiers! (see instructions [here](https://github.com/heartexlabs/pyheartex#advanced-usage))
## Label Studio for Teams, Startups, and Enterprises :office:

Label Studio for Teams is our enterprise edition (cloud & on-prem), that includes a data manager, high-quality baseline models, active learning, collaborators support, and more. Please visit the [website](https://www.heartex.ai/) to learn more.
@@ -205,6 +199,22 @@ Label Studio for Teams is our enterprise edition (cloud & on-prem), that include
| [label-studio-converter](https://github.com/heartexlabs/label-studio-converter) | Encode labels into the format of your favorite machine learning library |
| [label-studio-transformers](https://github.com/heartexlabs/label-studio-transformers) | Transformers library connected and configured for use with label studio |

## Citation

```tex
@misc{Label Studio,
title={{Label Studio}: A Swiss Army Knife of Data Labeling and Annotation Tools},
url={https://github.com/heartexlabs/label-studio},
note={Open source software available from https://github.com/heartexlabs/label-studio},
author={
Maxim Tkachenko and
Mikhail Malyuk and
Nikita Shevchenko and
Nikolai Liubimov},
year={2020},
}
```

## License

This software is licensed under the [Apache 2.0 LICENSE](/LICENSE) © [Heartex](https://www.heartex.ai/). 2020
4 changes: 3 additions & 1 deletion app.json
@@ -1,7 +1,9 @@
{
"name": "Label Studio",
"description": "Multi-type data labeling, annotation and exploration tool",
"keywords": ["data annotation", "data labeling"],
"website": "https://labelstud.io",
"repository": "https://github.com/heartexlabs/label-studio",
"logo": "https://labelstud.io/images/opossum/heartex_icon_opossum_green.svg"
"logo": "https://labelstud.io/images/opossum/heartex_icon_opossum_green.svg",
"stack": "container"
}
2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -7,7 +7,7 @@ services:
working_dir: /label-studio
volumes:
- ./my_project:/label-studio/my_project
command: "label-studio start my_project ${INIT_COMMAND} "
command: "label-studio start my_project ${INIT_COMMAND} --host 0.0.0.0"
ports:
- "8080:8080"
restart: always
24 changes: 17 additions & 7 deletions docs/source/guide/ml.md
@@ -4,12 +4,13 @@ type: guide
order: 906
---

You can easily connect your favorite machine learning framework with Label Studio by using [Heartex SDK](https://github.com/heartexlabs/pyheartex).
You can easily connect your favorite machine learning framework to Label Studio with the Label Studio Machine Learning SDK.

That gives you the opportunities to use:
- **Pre-labeling**: Use model predictions for pre-labeling
- **Pre-labeling**: Use model predictions for pre-labeling (e.g. use on-the-fly model predictions to create rough image segmentations for further manual refinement)
- **Autolabeling**: Create automatic annotations
- **Online Learning**: Simultaneously update (retrain) your model as new annotations come in
- **Active Learning**: Perform labeling in active learning mode
- **Active Learning**: Perform labeling in active learning mode, selecting only the most complex examples
- **Prediction Service**: Instantly create a running, production-ready prediction service


@@ -21,28 +22,37 @@ That gives you the opportunities to use:

## Quickstart

Here is a quick example tutorial on how to do that with simple text classification:
Here is a quick example tutorial on how to run the ML backend with a simple text classifier:

0. Clone repo
```bash
git clone https://github.com/heartexlabs/label-studio
```

1. Create new ML backend
1. Set up the environment
```bash
cd label-studio
pip install -e .
cd label_studio/ml/examples
pip install -r requirements.txt
```

2. Create new ML backend
```bash
label-studio-ml init my_ml_backend --script label-studio/ml/examples/simple_text_classifier.py
```

2. Start ML backend server
3. Start ML backend server
```bash
label-studio-ml start my_ml_backend
```

3. Run Label Studio connecting it to the running ML backend:
4. Run Label Studio connecting it to the running ML backend:
```bash
label-studio start text_classification_project --init --template text_sentiment --ml-backend-url http://localhost:9090
```


## Create your own ML backend

Check the examples in the `label-studio/ml/examples` directory.
33 changes: 29 additions & 4 deletions docs/source/guide/tasks.md
@@ -78,6 +78,7 @@ Here is an example of a config and tasks list composed of one element, for text
"choices": ["Neutral"]
}
}],
# score is used for active learning sampling mode
"score": 0.95
}]
}]
@@ -146,27 +147,31 @@ You can split your input data into several plain text files, and specify the dir
### Directory with image files

```bash
label-studio init --input-path=dir/with/images --input-format=image-dir --label-config=config.xml
label-studio init --input-path=dir/with/images --input-format=image-dir --label-config=config.xml --allow-serving-local-files
```

> WARNING: `--allow-serving-local-files` is intended only for locally running instances. Avoid using it on remote servers unless you are sure of what you're doing.

You can point to a local directory, which is scanned recursively for image files. Each file is used to create one task. Since Label Studio works only with URLs, a web link is created for each task, pointing to your local directory as follows:

```
http://<host:port>/static/filename?d=<path/to/the/local/directory>
http://<host:port>/data/filename?d=<path/to/the/local/directory>
```

Supported formats are: `.png` `.jpg` `.jpeg` `.tiff` `.bmp` `.gif`

### Directory with audio files

```bash
label-studio init --input-path=my/audios/dir --input-format=audio-dir --label-config=config.xml
label-studio init --input-path=my/audios/dir --input-format=audio-dir --label-config=config.xml --allow-serving-local-files
```

> WARNING: `--allow-serving-local-files` is intended only for locally running instances. Avoid using it on remote servers unless you are sure of what you're doing.

You can point to a local directory, which is scanned recursively for audio files. Each file is used to create one task. Since Label Studio works only with URLs, a web link is created for each task, pointing to your local directory as follows:

```
http://<host:port>/static/filename?d=<path/to/the/local/directory>
http://<host:port>/data/filename?d=<path/to/the/local/directory>
```

Supported formats are: `.wav` `.aiff` `.mp3` `.au` `.flac`
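
To make the link format above concrete, here is a minimal Python sketch that builds such a URL for one local file. It is an illustration only, not Label Studio's own code: the helper names `is_supported_audio` and `local_file_url` are hypothetical, and it assumes that `d` carries the directory containing the file and that the server runs on `localhost:8080`.

```python
from pathlib import PurePosixPath
from urllib.parse import quote, urlencode

# Extensions listed above as supported audio formats.
AUDIO_EXTENSIONS = {".wav", ".aiff", ".mp3", ".au", ".flac"}


def is_supported_audio(file_path):
    """Return True if the file extension is one of the supported audio formats."""
    return PurePosixPath(file_path).suffix.lower() in AUDIO_EXTENSIONS


def local_file_url(file_path, host="localhost", port=8080):
    """Build http://<host:port>/data/filename?d=<directory> for a local file.

    Assumption: `d` is the directory that contains the file.
    """
    path = PurePosixPath(file_path)
    # urlencode percent-encodes the directory, including its slashes.
    query = urlencode({"d": str(path.parent)})
    return f"http://{host}:{port}/data/{quote(path.name)}?{query}"


print(local_file_url("my/audios/dir/clip.wav"))
```

The same scheme applies to image directories; only the set of accepted extensions differs.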
@@ -180,3 +185,23 @@ Use API to import tasks in [Label Studio basic format](tasks.html#Basic-format)
curl -X POST -H Content-Type:application/json http://localhost:8080/api/import \
--data "[{\"my_key\": \"my_value_1\"}, {\"my_key\": \"my_value_2\"}]"
```
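
The same import request can be sketched in Python using only the standard library. This mirrors the `curl` call above and nothing more: the helper names `build_import_request` and `import_tasks` are hypothetical, and the base URL assumes a locally running instance on port 8080.

```python
import json
from urllib.request import Request, urlopen


def build_import_request(tasks, base_url="http://localhost:8080"):
    """Prepare a POST to /api/import carrying the tasks as a JSON list."""
    body = json.dumps(tasks).encode("utf-8")
    return Request(
        f"{base_url}/api/import",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def import_tasks(tasks, base_url="http://localhost:8080"):
    """Send the request; requires a running Label Studio instance."""
    with urlopen(build_import_request(tasks, base_url)) as response:
        return response.status


request = build_import_request([{"my_key": "my_value_1"}, {"my_key": "my_value_2"}])
print(request.get_method(), request.full_url)
```

Call `import_tasks(...)` only when the server is up; building the request separately makes it easy to inspect the payload first.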

## Sampling

You can define how imported tasks are presented to annotators. Several options are available; to enable one of them, pass `--sampling=<option>` on the command line.

#### sequential

Tasks are ordered by their `"id"` fields in ascending order. This is the default mode.

#### uniform

Tasks are sampled with equal probabilities.

#### prediction-score-min

The task with the minimum average prediction score is taken. When this option is set, each task must include a `task["predictions"]` list, and each prediction must contain a `"score"` field.

#### prediction-score-max

The task with the maximum average prediction score is taken. When this option is set, each task must include a `task["predictions"]` list, and each prediction must contain a `"score"` field.
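
The four modes above can be sketched in a few lines of Python. This is a simplified illustration, not the code Label Studio runs: `average_score` and `next_task` are hypothetical names, and "average prediction score" is assumed to mean the arithmetic mean of the `"score"` fields in `task["predictions"]`.

```python
import random
from statistics import mean


def average_score(task):
    """Mean of the "score" fields across task["predictions"] (assumed definition)."""
    return mean(prediction["score"] for prediction in task["predictions"])


def next_task(tasks, sampling="sequential", rng=random):
    """Pick the next task to show, following the given sampling mode."""
    if not tasks:
        return None
    if sampling == "sequential":
        return min(tasks, key=lambda task: task["id"])
    if sampling == "uniform":
        return rng.choice(tasks)
    if sampling == "prediction-score-min":
        return min(tasks, key=average_score)
    if sampling == "prediction-score-max":
        return max(tasks, key=average_score)
    raise ValueError(f"Unknown sampling mode: {sampling}")


tasks = [
    {"id": 3, "predictions": [{"score": 0.95}]},
    {"id": 1, "predictions": [{"score": 0.40}, {"score": 0.60}]},
    {"id": 2, "predictions": [{"score": 0.10}]},
]
print(next_task(tasks, "prediction-score-min")["id"])
```

Taking the lowest-scoring task first is the usual active learning choice, since it sends the examples the model is least sure about to annotators.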
3 changes: 1 addition & 2 deletions heroku.yml
@@ -1,6 +1,5 @@
build:
docker:
web: Dockerfile

run:
web: "/app/scripts/run-demo.sh image_bbox"
web: ./tools/run.sh
17 changes: 17 additions & 0 deletions label_studio/examples/html_classification/config.xml
@@ -0,0 +1,17 @@
<!-- {"title": "Web page classification", "category": "html", "complexity": "basic", "order": "!"} -->
<View>
<Choices name="toxicity" toName="web_page" choice="multiple" showInline="true">
<Choice value="Toxic" background="red"/>
<Choice value="Severe Toxic" background="brown"/>
<Choice value="Obscene" background="green"/>
<Choice value="Threat" background="blue"/>
<Choice value="Insult" background="orange"/>
<Choice value="Identity Hate" background="grey"/>
</Choices>

<View style="border: 1px solid #CCC;
border-radius: 10px;
padding: 5px">
<HyperText name="web_page" value="$text"/>
</View>
</View>
16 changes: 12 additions & 4 deletions label_studio/ml/README.md
@@ -1,23 +1,31 @@
## Quickstart

Here is a quick example tutorial on how to do that with simple text classification:
Here is a quick example tutorial on how to run the ML backend with a simple text classifier:

0. Clone repo
```bash
git clone https://github.com/heartexlabs/label-studio
```

1. Create new ML backend
1. Set up the environment
```bash
cd label-studio
pip install -e .
cd label_studio/ml/examples
pip install -r requirements.txt
```

2. Create new ML backend
```bash
label-studio-ml init my_ml_backend --script label-studio/ml/examples/simple_text_classifier.py
```

2. Start ML backend server
3. Start ML backend server
```bash
label-studio-ml start my_ml_backend
```

3. Run Label Studio connecting it to the running ML backend:
4. Run Label Studio connecting it to the running ML backend:
```bash
label-studio start text_classification_project --init --template text_sentiment --ml-backend-url http://localhost:9090
```
7 changes: 5 additions & 2 deletions label_studio/ml/default_configs/_wsgi.py.tmpl
@@ -9,8 +9,11 @@ from {script} import {model_class}
if __name__ == "__main__":
parser = argparse.ArgumentParser(description='Label studio')
parser.add_argument(
'--port', dest='port', type=int, default=9090,
'-p', '--port', dest='port', type=int, default=9090,
help='Server port')
parser.add_argument(
'--host', dest='host', type=str, default='0.0.0.0',
help='Server host')
parser.add_argument(
'--kwargs', dest='kwargs', metavar='KEY=VAL', nargs='+', type=lambda kv: kv.split('='),
help='Additional LabelStudioMLBase model initialization kwargs')
@@ -70,4 +73,4 @@ if __name__ == "__main__":
**kwargs
)

app.run(host='localhost', port=args.port, debug=args.debug)
app.run(host=args.host, port=args.port, debug=args.debug)
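
As a standalone illustration of the options changed in this template, the sketch below rebuilds only the three arguments visible in the hunk and parses a sample command line. The `build_parser` helper and the conversion of the `--kwargs` pairs into a dict are assumptions made for the example; the elided parts of the template may handle them differently.

```python
import argparse


def build_parser():
    """Recreate the --port, --host and --kwargs options shown in the hunk above."""
    parser = argparse.ArgumentParser(description="Label studio")
    parser.add_argument(
        "-p", "--port", dest="port", type=int, default=9090,
        help="Server port")
    parser.add_argument(
        "--host", dest="host", type=str, default="0.0.0.0",
        help="Server host")
    parser.add_argument(
        "--kwargs", dest="kwargs", metavar="KEY=VAL", nargs="+",
        type=lambda kv: kv.split("="),
        help="Additional LabelStudioMLBase model initialization kwargs")
    return parser


args = build_parser().parse_args(
    ["-p", "9091", "--kwargs", "hostname=localhost", "batch_size=8"])

# Each KEY=VAL item was split into a [key, value] pair; values stay strings.
# Assumption: the pairs are collected into a plain dict.
kwargs = dict(args.kwargs or [])
print(args.host, args.port, kwargs)
```

Because each item is split on every `=`, a value that itself contains `=` yields more than two parts and cannot be turned into a key/value pair this way.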