Commit

Merge branch 'main' into alpha
gary-Shen authored Aug 12, 2024
2 parents c053ddb + 3052904 commit 7a327e0
Showing 2 changed files with 49 additions and 67 deletions.
65 changes: 32 additions & 33 deletions README.md
@@ -10,17 +10,29 @@

</div>

## Introduction
## Product Introduction

LabelU is a comprehensive data annotation platform designed for handling multimodal data. It offers a range of advanced annotation tools and efficient workflows, making it easier for users to tackle annotation tasks involving images, videos, and audio. LabelU is tailored to meet the demands of complex data analysis and model training.

## Key Features

### Versatile Image Annotation Tools
LabelU provides a comprehensive set of tools for image annotation, including 2D bounding boxes, semantic segmentation, polylines, and keypoints. These tools can flexibly address a variety of image processing tasks, such as object detection, scene analysis, image recognition, and machine translation, helping users efficiently identify, annotate, and analyze images.

### Powerful Video Annotation Capabilities
For video annotation, LabelU supports video segmentation, video classification, and video information extraction. It is well suited to applications such as video retrieval, video summarization, and action recognition: users can handle long videos, extract key information accurately, and analyze complex scenes, producing high-quality annotated data for subsequent model training.

### Efficient Audio Annotation Tools
Audio annotation tools are another key feature of LabelU. These tools possess efficient and precise audio analysis capabilities, supporting audio segmentation, audio classification, and audio information extraction. By visualizing complex sound information, LabelU simplifies the audio data processing workflow, aiding in the development of more accurate models.

#### Artificial Intelligence Assisted Labelling
LabelLLM supports one-click loading of pre-annotated data, which can be refined and adjusted according to actual needs. This feature improves the efficiency and accuracy of annotation.


https://github.com/user-attachments/assets/0fa5bc39-20ba-46b6-9839-379a49f692cf

LabelU offers a variety of annotation tools and features, supporting image, video, and audio annotation.

- Image: Multifunctional image annotation tools, including 2D bounding boxes, cuboids, semantic segmentation, polylines, and keypoints, help users identify, annotate, and analyze images.
- Video: Video annotation tools support video segmentation, video classification, and video information extraction, providing high-quality annotated data for model training.
- Audio: Efficient, accurate audio analysis tools support audio segmentation, audio classification, and audio information extraction, and present complex sound information visually.

<p align="center">
<img style="width: 600px" src="https://user-images.githubusercontent.com/25022954/209318236-79d3a5c3-2700-46c3-b59a-62d9c132a6c3.gif">
</p>

## Features

@@ -99,31 +111,6 @@ uvicorn labelu.main:app --reload
git submodule update --remote --merge
```

## Supported Scenarios

### Image

- Label Classification: Helps users quickly classify objects in images; useful for image retrieval and object detection tasks.
- Text Description: Text transcription helps users quickly extract text from images; useful for text retrieval and machine translation tasks.
- Bounding Box: Helps users quickly select objects in images; useful for image recognition and object tracking tasks.
- Point Annotation: Points help users accurately mark key information in an image; useful for object recognition and scene analysis tasks.
- Polygon: Helps users accurately label irregular shapes; useful for object recognition and scene analysis tasks.
- Line Annotation: Lines help users accurately label edges and contours in an image; useful for object recognition and scene analysis tasks.
- Cuboid: Cuboids help users accurately label the size, shape, and location of objects in images; useful for object recognition and scene analysis tasks.

### Video

- Label Classification: Classifies and labels videos; useful for video retrieval, recommendation, and classification tasks.
- Text Description: Converts speech in videos into text; useful for speech recognition, transcription, and translation tasks.
- Segment Segmentation: Extracts specific clips or scenes from a video for annotation; useful for video object detection, action recognition, and video summarization tasks.
- Timestamps: Mark specific points in the video; users can click a timestamp to jump directly to that point.

### Audio

- Label Classification: Annotators listen to the audio and select the appropriate class; useful for audio retrieval, recommendation, and classification tasks.
- Text Description: Converts speech in audio into text so that it can be analyzed and processed as text; useful for speech recognition and transcription tasks.
- Segment Segmentation: Extracts specific clips from audio for annotation; useful for audio event detection, speech recognition, and audio editing tasks.
- Timestamps: Mark specific points in the audio; users can click a timestamp to jump directly to that point.
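
The segment and timestamp annotations described above for video and audio can be pictured as time intervals and time points on a shared timeline. The following is a minimal sketch under that assumption; `Segment`, `Timeline`, and `segments_at` are hypothetical names chosen for illustration and are not LabelU's actual data model or export schema.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A labeled clip of audio or video, in seconds (hypothetical structure)."""
    start: float
    end: float
    label: str

    def __post_init__(self) -> None:
        # Reject empty or reversed intervals early.
        if self.start < 0 or self.end <= self.start:
            raise ValueError("segment needs 0 <= start < end")

    @property
    def duration(self) -> float:
        return self.end - self.start


@dataclass
class Timeline:
    """Segments and point timestamps for one media file (hypothetical structure)."""
    segments: list[Segment] = field(default_factory=list)
    timestamps: list[float] = field(default_factory=list)

    def segments_at(self, t: float) -> list[Segment]:
        """Return every segment whose half-open interval [start, end) contains t."""
        return [s for s in self.segments if s.start <= t < s.end]


timeline = Timeline(
    segments=[Segment(0.0, 2.5, "speech"), Segment(2.0, 4.0, "music")],
    timestamps=[1.0, 3.5],
)

# Jumping to a timestamp amounts to looking up what is annotated at that time.
labels_at_marks = {
    t: [s.label for s in timeline.segments_at(t)] for t in timeline.timestamps
}
```

Half-open intervals are used here so that a time point on the boundary between two back-to-back segments belongs to exactly one of them; overlapping segments remain allowed.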

## Quick start

@@ -133,6 +120,17 @@ git submodule update --remote --merge

- [Documentation](https://opendatalab.github.io/labelU/#/schema)

## Citation

```bibtex
@article{he2024opendatalab,
title={Opendatalab: Empowering general artificial intelligence with open datasets},
author={He, Conghui and Li, Wei and Jin, Zhenjiang and Xu, Chao and Wang, Bin and Lin, Dahua},
journal={arXiv preprint arXiv:2407.13773},
year={2024}
}
```

## Communication

Welcome to the OpenDataLab official WeChat group!
@@ -141,6 +139,7 @@ Welcome to the OpenDataLab official WeChat group!
<img style="width: 400px" src="https://user-images.githubusercontent.com/25022954/208374419-2dffb701-321a-4091-944d-5d913de79a15.jpg">
</p>


## Links

- [LabelU-kit](https://github.com/opendatalab/labelU-Kit) Web front-end annotation kit (LabelU is based on this JavaScript kit)
51 changes: 17 additions & 34 deletions README_zh-CN.md
@@ -10,17 +10,26 @@

</div>

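## Introduction
## Product Introduction

LabelU offers a variety of annotation tools and features, supporting image, video, and audio annotation.
LabelU is a comprehensive data annotation platform designed for handling multimodal data. By providing a rich set of annotation tools and efficient workflows, it helps users handle annotation tasks for image, video, and audio data more easily, and meets the needs of complex data analysis and model training.

- Image: Multifunctional image annotation tools, including 2D bounding boxes, semantic segmentation, polylines, and keypoints, help users identify, annotate, and analyze images.
- Video: Video annotation tools support video segmentation, video classification, and video information extraction, providing high-quality annotated data for model training.
- Audio: Efficient, accurate audio analysis tools support audio segmentation, audio classification, and audio information extraction, and present complex sound information visually.
## Key Features

### Versatile Image Annotation Tools
LabelU provides a comprehensive set of tools for image annotation, including 2D bounding boxes, semantic segmentation, polylines, and keypoints. These tools can flexibly address a variety of image processing tasks, such as object detection, scene analysis, image recognition, and machine translation, helping users efficiently identify, annotate, and analyze images.

### Powerful Video Annotation Capabilities
For video annotation, LabelU supports video segmentation, video classification, and video information extraction. It is well suited to tasks such as video retrieval, video summarization, and action recognition: users can handle long videos, extract key information accurately, and analyze complex scenes, producing high-quality annotated data for subsequent model training.

### Efficient Audio Annotation Tools
Audio annotation tools are another key feature of LabelU. They offer efficient and precise audio analysis, supporting audio segmentation, audio classification, and audio information extraction. By presenting complex sound information visually, LabelU simplifies the audio data processing workflow and supports the development of more accurate models.

#### Artificial Intelligence Assisted Labelling
LabelLLM supports one-click loading of pre-annotated data, which users can refine and adjust according to actual needs. This feature improves the efficiency and accuracy of annotation.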

https://github.com/user-attachments/assets/f90e5a66-ab4d-456e-af4d-e6408a623812

<p align="center">
<img style="width: 600px" src="https://user-images.githubusercontent.com/25022954/209318236-79d3a5c3-2700-46c3-b59a-62d9c132a6c3.gif">
</p>

## Features

@@ -99,32 +108,6 @@ uvicorn labelu.main:app --reload
git submodule update --remote --merge
```

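## Supported Scenarios

### Image

- Label Classification: Helps users quickly classify objects in images; useful for image retrieval and object detection tasks.
- Text Description: Text transcription helps users quickly extract text from images; useful for text retrieval and machine translation tasks.
- Bounding Box: Helps users quickly select objects in images; useful for image recognition and object tracking tasks.
- Point Annotation: Points help users accurately mark key information in an image; useful for object recognition and scene analysis tasks.
- Polygon: Helps users accurately label irregular shapes; useful for object recognition and scene analysis tasks.
- Line Annotation: Lines help users accurately label edges and contours in an image; useful for object recognition and scene analysis tasks.
- Cuboid: Cuboids help users accurately label the size, shape, and location of objects in images; useful for object recognition and scene analysis tasks.

### Video

- Label Classification: Classifies and labels videos; useful for video retrieval, recommendation, and classification tasks.
- Text Description: Converts speech in videos into text; useful for speech recognition, speech transcription, and speech translation tasks.
- Segment Segmentation: Extracts specific clips or scenes from a video for annotation; useful for video object detection, action recognition, and video summarization tasks.
- Timestamps: Mark specific points in the video; users can click a timestamp to jump directly to that point.

### Audio

- Label Classification: Annotators listen to the audio and select the appropriate class; useful for audio retrieval, audio recommendation, and audio classification tasks.
- Text Description: Converts speech in audio into text so that it can be analyzed and processed as text; useful for speech recognition and speech transcription tasks.
- Segment Segmentation: Extracts specific clips from audio for annotation; useful for audio event detection, speech recognition, and audio editing tasks.
- Timestamps: Mark specific points in the audio; users can click a timestamp to jump directly to that point.

## Quick start

- [User guide](https://opendatalab.github.io/labelU)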