Resolve most skipped unittests #559

HYLcool · 2025-01-22T02:42:13Z

unit test optimization:
- re-enable most skipped unit tests
  - 20 unit tests for API-based OPs are re-enabled by supplying the API info through environment variables.
  - 1 unit test for video_motion_score_raft_filter is re-enabled by setting more appropriate thresholds that tolerate the different model outputs produced on different hardware, which is expected behavior.
  - 7 unit tests for model-based OPs are re-enabled by decreasing the number of processes (np) to avoid running out of GPU memory.
- increase shm_size to 128G to avoid an implicit OOM issue that surfaces as exit code 137 (ref)
- for now, 9 unit tests remain skipped and are left to be resolved in future work
  - 2 unit tests are skipped due to randomness from system resource utilization (adapter & monitor)
  - 1 unit test is skipped due to an unsolved encoding problem (nlpcda_zh_mapper)
  - 1 unit test is skipped due to GPU OOM (video_captioning_from_summarizer_mapper, which requires more than 50GB of memory)
  - 5 unit tests are skipped due to an unsolved vllm error in distributed inference, similar to vllm-project/vllm#9727 "[Bug]: vllm.LLM does not seem to re-initialize for distributed inference with subsequent models with Offline Inference" (generate_*_mapper & optimize_*_mapper)
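A minimal sketch of how a test for an API-based OP can depend on environment variables, as described in the first item above. The variable names EXAMPLE_API_KEY and EXAMPLE_BASE_URL and the test class are hypothetical. This PR does not state which variables the CI sets or whether the tests guard on them, so read this as an illustration of the pattern rather than the code that was merged.

```python
import os
import unittest

# Hypothetical variable names; the PR does not say which ones the CI defines.
REQUIRED_ENV_VARS = ("EXAMPLE_API_KEY", "EXAMPLE_BASE_URL")


def missing_api_env(environ=None):
    """Return the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not environ.get(name)]


class ApiBasedOpTest(unittest.TestCase):

    @unittest.skipIf(missing_api_env(),
                     "API info not provided via environment variables")
    def test_api_call(self):
        api_key = os.environ["EXAMPLE_API_KEY"]
        base_url = os.environ["EXAMPLE_BASE_URL"]
        # A real test would construct the OP here and call the remote API.
        self.assertTrue(api_key and base_url)
```

With this shape the test runs wherever the CI exports the variables and is reported as skipped, not failed, on machines that lack them.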
others:
- add a default mem_required for some model-based OPs
- update sampling_params for vllm-based OPs, because vllm and transformers use different names for similar inference parameters
- update the device_map specification for the latest diffusers library
- fix some bugs & typos
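The sampling_params item above concerns the naming difference between the two libraries. The sketch below shows one way such a translation could look. The function name to_vllm_sampling_params and both tables are assumptions made for illustration and are not taken from this PR. The name pairs themselves follow the public parameter names of transformers' generate() and vllm's SamplingParams, and the tables are not exhaustive.

```python
# transformers-style name -> vllm SamplingParams name (illustrative subset).
HF_TO_VLLM = {
    "max_new_tokens": "max_tokens",
    "min_new_tokens": "min_tokens",
    "num_return_sequences": "n",
}

# transformers-only switches with no direct vllm counterpart.
HF_ONLY = {"do_sample"}


def to_vllm_sampling_params(hf_params):
    """Return a copy of hf_params with transformers-style keys renamed."""
    vllm_params = {}
    for name, value in hf_params.items():
        if name in HF_ONLY:
            continue  # drop keys that vllm would reject
        # Names shared by both libraries (temperature, top_p, ...) pass through.
        vllm_params[HF_TO_VLLM.get(name, name)] = value
    return vllm_params
```

The resulting dictionary could then be unpacked into vllm.SamplingParams(**params), provided every remaining key is one that the installed vllm version accepts.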

# Conflicts:
#   configs/config_all.yaml
#   data_juicer/ops/aggregator/entity_attribute_aggregator.py
#   data_juicer/ops/aggregator/most_relavant_entities_aggregator.py
#   data_juicer/ops/aggregator/nested_aggregator.py
#   data_juicer/ops/mapper/video_captioning_from_summarizer_mapper.py
#   tests/ops/mapper/test_image_tagging_mapper.py

yxdyc

LGTM

HYLcool added 30 commits December 24, 2024 17:01

* fix missing attribute error and duplicate test funcs

20b1ffe

* fix unexpected keyword argument error

c57ca12

* review for OOM tests

38bfd06

* review for OOM tests

8fb34b3

- use 2 np instead of 4 np for unittest

f63a80e

- use 2 np instead of 4 np for unittest to resolve OOM problem

8a4e822

* update doc of analyzer

13460e5

* fix undefined device_map: use balanced by default, or use the to() method to move models to specified devices
* fix unrecognized dtype: only need torch.dtype instead of strings like 'fp16'
* open unittest for image_diffusion_mapper

4110a1a

* fix bugs in video_captioning_from_summarizer_mapper due to reorganization of meta and tags

829a0c7

- remove unused imports

3aee531

* open unittest for nlpcda_en_mapper

4805928

* set the default encoding of stdout to utf-8

6020a41

* set the default encoding of stdout to utf-8

d464b79

* set the default encoding of stdout to utf-8

8981b63

* test for raft

2025a2e

* change the thresholds

85e757d

+ add mem_required for generate_qa_from_text_mapper

b41833e

* fix typos

+ add mem_required for two ops

fb503db

- open unittest for generate_qa_from_text_mapper

432de0f

- open unittest for generate_qa_from_examples_mapper

06267ba

Merge branch 'main' into resolve/unittest_skipping

c56a629

Merge branch 'main' into resolve/unittest_skipping

684666e

* fix skip_op_error & update_sampling_params

bece0d1

* update vllm version requirement for generation_config param

88e9aa4

* skip vllm ops

26d0f84

* open unittests for api-related ops

6874191

Merge branch 'main' into resolve/unittest_skipping

42e54bc

* fix wrong attr name

f52106d

* increase shm_size to avoid OOM

d2b0064
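Three of the commits above are titled "set the default encoding of stdout to utf-8". A minimal sketch of one standard-library way to do that is shown below. The helper name force_utf8 is hypothetical, and the PR does not show how the change was actually made, so this is an assumption about the approach and not the merged code. Note that the description still lists the nlpcda_zh_mapper encoding problem as unsolved.

```python
import io
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8 if it supports reconfigure()."""
    # TextIOWrapper.reconfigure() exists since Python 3.7; replaced or
    # captured streams may not provide it, so guard the call.
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is not None:
        reconfigure(encoding="utf-8")
    return stream


# Typical use at the start of a test entry point.
force_utf8(sys.stdout)

# Demonstration on an in-memory stream that starts out as ASCII-only.
demo = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
force_utf8(demo)
demo.write("中文")
demo.flush()
```

Without the reconfigure call, the write to the ASCII stream would raise UnicodeEncodeError, which is the kind of failure a non-UTF-8 default stdout can cause when tests print Chinese text.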

HYLcool added the bug, enhancement, dj:ci/cd, and environment labels Jan 22, 2025

HYLcool requested review from BeachWang, chenyushuo, pan-x-c and yxdyc January 22, 2025 02:42

HYLcool self-assigned this Jan 22, 2025

HYLcool temporarily deployed to Testing January 22, 2025 02:42 — with GitHub Actions Inactive

yxdyc approved these changes Jan 22, 2025

HYLcool merged commit dbf880c into main Jan 22, 2025
4 checks passed

HYLcool deleted the resolve/unittest_skipping branch January 22, 2025 12:30
