To fuel Spoken Language Processing (SLP) research on meetings and tackle its key challenges, the Speech Lab, the Language Technology Lab, and the ModelScope Community of Alibaba Group, together with the Alibaba Cloud Tianchi Platform and Zhejiang University, launched the General Meeting Understanding and Generation (MUG) challenge as an ICASSP 2023 Signal Processing Grand Challenge. To facilitate the MUG challenge, we construct and release a meeting dataset, the AliMeeting4MUG Corpus (AMC).
To the best of our knowledge, AMC is so far the largest meeting corpus in scale and supports the largest number of SLP tasks. The MUG challenge includes five tracks: Track 1 Topic Segmentation (TS), Track 2 Topic-level and Session-level Extractive Summarization (ES), Track 3 Topic Title Generation (TTG), Track 4 Keyphrase Extraction (KPE), and Track 5 Action Item Detection (AID).
git clone https://github.com/alibaba-damo-academy/SpokenNLP.git
Register on ModelScope and obtain your access token from the personal center page, then modify the configuration files accordingly.
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
conda create -n modelscope python=3.7
conda activate modelscope
- CUDA 10.2
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=10.2 -c pytorch
- CUDA 11.3
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
- CUDA 11.6
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
For more versions, please see https://pytorch.org/get-started/locally/
pip install "modelscope[nlp]==1.1.0" -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
pip install -r requirements.txt
sh run_ponet_topic_segmentation.sh
The baseline model is available at nlp_ponet_document-segmentation_topic-level_chinese-base.
The AMC dev set results from the baseline system and systems based on other backbone models on topic segmentation are as follows:
Model | Backbone | Positive F1 |
---|---|---|
Longformer | IDEA-CCNL/Erlangshen-Longformer-110M | 23.24±1.35 |
PoNet(baseline system) | damo/nlp_ponet_fill-mask_chinese-base | 25.10±0.55 |
Note: The mean and standard deviation (e.g., for 25.10±0.55, 25.10 is the mean, 0.55 is the std) are reported based on results from 5 runs with different random seeds.
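The Positive F1 metric above can be understood as an F1 score computed only on the positive (topic-boundary) class. The following is a minimal illustrative sketch, assuming each sentence carries a binary boundary label; it is not the official scorer (see utils/challenge_evaluate.py for that).

```python
# Illustrative Positive F1 for topic segmentation, assuming each sentence
# carries a binary label (1 = a topic boundary follows this sentence).
# Not the official scorer; see utils/challenge_evaluate.py.

def positive_f1(gold, pred):
    """F1 computed on the positive (boundary) class only."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = [0, 1, 0, 0, 1, 0]
pred = [0, 1, 0, 1, 0, 0]
print(positive_f1(gold, pred))  # tp=1, fp=1, fn=1 -> P=R=0.5 -> F1=0.5
```

Because true negatives (the many non-boundary sentences) are ignored, this metric is much stricter than accuracy on such an imbalanced task.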
sh run_ponet_topic_extractive_summarization.sh
sh run_ponet_doc_extractive_summarization.sh
python ./utils/extractive_summarization_submit_file_generation.py topic_ES_submit_file_path doc_ES_submit_file_path output_file_path
The baseline model of topic-level ES is available at nlp_ponet_extractive-summarization_topic-level_chinese-base.
The AMC dev set results from the baseline system and systems based on other backbone models on topic-level ES are as follows:
Model | Backbone | Ave. R1 | Ave. R2 | Ave. RL | Max R1 | Max R2 | Max RL |
---|---|---|---|---|---|---|---|
Longformer | IDEA-CCNL/Erlangshen-Longformer-110M | 50.45±0.30 | 34.15±0.48 | 44.62±0.58 | 63.22±0.25 | 50.72±0.31 | 60.36±0.37 |
PoNet | damo/nlp_ponet_fill-mask_chinese-base | 52.52±0.41 | 35.50±0.36 | 45.87±0.44 | 66.43±0.26 | 53.77±0.43 | 63.03±0.27 |
Note: The mean and standard deviation (e.g., for 25.10±0.55, 25.10 is the mean, 0.55 is the std) are reported based on results from 5 runs with different random seeds. We report both average and best Rouge-1,2,L scores based on the three references.
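The "Ave." and "Max" columns aggregate one score per reference across the three references. As a rough sketch of that aggregation, the snippet below uses a toy unigram-overlap F1 as a stand-in for ROUGE-1; the official evaluation uses a proper ROUGE implementation.

```python
# Sketch of averaging/maximizing a score over multiple references,
# with a toy unigram-overlap F1 standing in for ROUGE-1.
from collections import Counter

def unigram_f1(candidate, reference):
    c, r = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((c & r).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    p, rec = overlap / sum(c.values()), overlap / sum(r.values())
    return 2 * p * rec / (p + rec)

def avg_and_max(candidate, references):
    scores = [unigram_f1(candidate, ref) for ref in references]
    return sum(scores) / len(scores), max(scores)

cand = "discuss the project timeline"
refs = ["discuss the timeline", "review the project plan", "talk about deadlines"]
avg_score, max_score = avg_and_max(cand, refs)
```

Reporting both the average and the best score against the references gives a sense of robustness (average) and upper-bound agreement (max) when annotators disagree on the summary.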
The baseline model of session-level ES is available at nlp_ponet_extractive-summarization_doc-level_chinese-base.
The dev results of baselines on session-level ES are as follows:
Model | Backbone | Ave. R1 | Ave. R2 | Ave. RL | Max R1 | Max R2 | Max RL |
---|---|---|---|---|---|---|---|
Longformer | IDEA-CCNL/Erlangshen-Longformer-110M | 56.17±0.33 | 29.52±0.65 | 38.20±1.51 | 61.75±0.45 | 36.84±0.61 | 47.06±1.20 |
PoNet(baseline system) | damo/nlp_ponet_fill-mask_chinese-base | 56.82±0.22 | 29.73±0.25 | 37.52±0.74 | 61.66±0.37 | 36.89±0.57 | 46.20±0.56 |
Note: The mean and standard deviation are reported based on results from 5 runs with different random seeds. We report both average and best Rouge-1,2,L scores based on the three references.
sh run_palm_topic_title_generation.sh
The baseline model of topic title generation is available at nlp_palm2.0_text-generation_meeting_title_chinese-base.
The AMC dev set results from the baseline system and systems based on other backbone models on topic title generation are as follows:
Model | Backbone | Rouge-1 | Rouge-L |
---|---|---|---|
BART | fnlp/bart-base-chinese | 31.06 | 28.92 |
PALM2.0(baseline system) | damo/nlp_palm2.0_pretrained_chinese-base | 31.28 | 29.43 |
- Note: set batch size to 4 if GPU memory is 16 GB
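Unlike Rouge-1, Rouge-L is based on the longest common subsequence (LCS) between the generated title and the reference. The sketch below shows the LCS-based F1 for intuition only; the official scoring uses a standard ROUGE toolkit.

```python
# Illustrative ROUGE-L F1 via longest common subsequence (LCS).
# For intuition only; the official scoring uses a standard ROUGE toolkit.

def lcs_len(a, b):
    # Classic dynamic-programming LCS length over token sequences.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l_f1(candidate, reference):
    c, r = candidate.split(), reference.split()
    l = lcs_len(c, r)
    if l == 0:
        return 0.0
    p, rec = l / len(c), l / len(r)
    return 2 * p * rec / (p + rec)
```

Because the LCS need not be contiguous, Rouge-L rewards titles that preserve the reference's word order even when extra words are inserted in between.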
Please refer to src/keyphrase_extraction/readme
The baseline model of keyphrase extraction is available at nlp_structbert_keyphrase-extraction_base-icassp2023-mug-track4-baseline.
The AMC dev set results from the baseline system and systems based on other backbone models on keyphrase extraction are as follows:
Model | Backbone | Exact/Partial F1 @10 | Exact/Partial F1 @15 | Exact/Partial F1 @20 |
---|---|---|---|---|
YAKE | - | 15.0/24.3 | 19.8/30.4 | 20.4/32.1 |
Bert-CRF | sijunhe/nezha-cn-base | 35.6/43.2 | 38.1/49.5 | 37.2/48.1 |
Bert-CRF(baseline system) | damo/nlp_structbert_backbone_base_std | 35.9/47.7 | 40.1/52.2 | 39.4/51.1 |
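The "@K" in the table means that only the top-K ranked keyphrase predictions are scored. A minimal sketch of the exact-match variant is below; Partial F1, which also credits partial string overlap, is not shown here (see src/keyphrase_extraction/readme for the official evaluation).

```python
# Sketch of exact-match F1@K for keyphrase extraction: the top-K predicted
# keyphrases are compared against the gold set by exact string match.
# Not the official scorer; see src/keyphrase_extraction/readme.

def exact_f1_at_k(predicted, gold, k):
    """predicted: ranked list of keyphrases; gold: set of reference keyphrases."""
    topk = predicted[:k]
    tp = sum(1 for kp in topk if kp in gold)
    if tp == 0:
        return 0.0
    precision = tp / len(topk)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

pred = ["meeting schedule", "budget", "deadline", "venue"]
gold = {"budget", "deadline", "travel"}
score = exact_f1_at_k(pred, gold, 3)  # tp=2, P=R=2/3 -> F1=2/3
```

Note that increasing K trades precision for recall, which is why the table's F1 does not grow monotonically from @15 to @20.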
sh run_structbert_action_item_detection.sh
The baseline model of action item detection is available at nlp_structbert_alimeeting_action-classification_chinese-base.
The AMC dev set results from the baseline system and systems based on other backbone models on action item detection are as follows:
Model | Backbone | Positive F1 |
---|---|---|
BERT | mengzi-bert-base | 68.15 |
StructBERT(baseline system) | damo/nlp_structbert_backbone_base_std | 69.43 |
If you want to run evaluation and compute the rank score on the dev set, please refer to utils/challenge_evaluate.py
- Task Evaluation and Rank Score Computation: metrics/* and utils/challenge_evaluate.py
- Dataset download and parsing for each task: please read the alimeeting4mug_data_download function in each task's baseline.
If you have any questions about AliMeeting4MUG, please contact us via:
- Email: [email protected]
- DingTalk group:
- This project borrows substantial code from ModelScope and Transformers.
This project is licensed under the Apache License 2.0. This project also contains various third-party components and some code modified from other repos under other open source licenses.