MBVAT: Most Basic Vision Agent Testing

This project provides a most basic test suite for vision agents. It tests the basic grounding ability of a vision agent, covering the following aspects:

Recognition: how well can the model recognize an object at the pixel level?
Localization: how well can the model localize an object at the pixel level?

Run

Here is a sample command that tests the model's color recognition:

python -m mbvat.main --test color --model qwen2vl --base-url http://localhost:8000/v1 --save-path results

Here is a sample command that tests the model's localization:

python -m mbvat.main --test localization --model qwen2vl --base-url http://localhost:8000/v1 --save-path results

Here is a sample command that tests the model's localization on colorful images:

python -m mbvat.main --test localization_colorful --model qwen2vl --base-url http://localhost:8000/v1 --save-path results
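
The three commands above differ only in the value passed to --test, so a small wrapper can run them in sequence. The following is a minimal sketch and not part of MBVAT: the helper names build_command and run_all are hypothetical, and the sketch uses only the flags and example values shown above.

```python
import subprocess
import sys

# Test names taken from the sample commands above.
TESTS = ["color", "localization", "localization_colorful"]


def build_command(test, model="qwen2vl",
                  base_url="http://localhost:8000/v1",
                  save_path="results"):
    """Build the argument list for one MBVAT run (hypothetical helper)."""
    return [
        sys.executable, "-m", "mbvat.main",
        "--test", test,
        "--model", model,
        "--base-url", base_url,
        "--save-path", save_path,
    ]


def run_all(**kwargs):
    """Run every test in sequence, stopping at the first failing command."""
    for test in TESTS:
        subprocess.run(build_command(test, **kwargs), check=True)
```

Calling run_all() launches the three tests one after another with the default values; keyword arguments such as save_path="results" are forwarded to build_command.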

Results Example

Localization Test Results

Here is a sample of the results you will find in your localization test results. The title shows three parameters:

Windows Ratio: the size of the object to be localized relative to the whole image.
Resolution: the resolution of the tested image.
Correctness: the number of correctly localized images / the total number of tested images.
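
To illustrate how the three title values relate to the underlying data, here is a minimal sketch. It assumes that the window ratio is an area ratio and that each tested image yields a single pass/fail outcome; neither detail is confirmed by this README, and the function names and title layout are hypothetical.

```python
def window_ratio(window_size, image_size):
    """Object window area divided by image area (assumed definition)."""
    win_w, win_h = window_size
    img_w, img_h = image_size
    return (win_w * win_h) / (img_w * img_h)


def correctness(outcomes):
    """Return (correct, total) for a list of per-image pass/fail booleans."""
    return sum(bool(o) for o in outcomes), len(outcomes)


def format_title(window_size, image_size, outcomes):
    """Assemble a title string with the three parameters (hypothetical layout)."""
    correct, total = correctness(outcomes)
    img_w, img_h = image_size
    ratio = window_ratio(window_size, image_size)
    return (f"Windows Ratio: {ratio:.2f} | "
            f"Resolution: {img_w}x{img_h} | "
            f"Correctness: {correct}/{total}")
```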

The color of each grid cell shows whether the object in that cell was localized correctly. The number in each cell is the minimum Delta E (CIE 2000) between the foreground and the background.
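
For reference, the Delta E (CIE 2000) color difference between two CIE L*a*b* colors can be computed as sketched below. This is a standalone implementation of the published CIEDE2000 formula with all weighting factors set to 1, not MBVAT's own code. The function names, and the idea of taking the minimum over a list of foreground/background pairs, are illustrative assumptions.

```python
import math


def _hue_deg(a, b):
    """Hue angle in degrees, in [0, 360)."""
    return math.degrees(math.atan2(b, a)) % 360.0


def ciede2000(lab1, lab2):
    """Delta E (CIE 2000) between two (L*, a*, b*) tuples, with kL = kC = kH = 1."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2

    # Step 1: rescale a* to get the primed chroma and hue values.
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2.0
    g = 0.5 * (1.0 - math.sqrt(c_bar**7 / (c_bar**7 + 25.0**7)))
    a1p, a2p = (1.0 + g) * a1, (1.0 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p, h2p = _hue_deg(a1p, b1), _hue_deg(a2p, b2)

    # Step 2: lightness, chroma, and hue differences.
    d_l = L2 - L1
    d_c = c2p - c1p
    if c1p * c2p == 0.0:
        d_h_deg = 0.0
    else:
        d_h_deg = h2p - h1p
        if d_h_deg > 180.0:
            d_h_deg -= 360.0
        elif d_h_deg < -180.0:
            d_h_deg += 360.0
    d_h = 2.0 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_h_deg / 2.0))

    # Step 3: means, weighting functions, and the rotation term.
    l_bar = (L1 + L2) / 2.0
    c_bar_p = (c1p + c2p) / 2.0
    if c1p * c2p == 0.0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        h_bar = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        h_bar = (h1p + h2p + 360.0) / 2.0
    else:
        h_bar = (h1p + h2p - 360.0) / 2.0

    t = (1.0
         - 0.17 * math.cos(math.radians(h_bar - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * h_bar))
         + 0.32 * math.cos(math.radians(3.0 * h_bar + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * h_bar - 63.0)))
    d_theta = 30.0 * math.exp(-(((h_bar - 275.0) / 25.0) ** 2))
    r_c = 2.0 * math.sqrt(c_bar_p**7 / (c_bar_p**7 + 25.0**7))
    s_l = 1.0 + 0.015 * (l_bar - 50.0) ** 2 / math.sqrt(20.0 + (l_bar - 50.0) ** 2)
    s_c = 1.0 + 0.045 * c_bar_p
    s_h = 1.0 + 0.015 * c_bar_p * t
    r_t = -math.sin(math.radians(2.0 * d_theta)) * r_c

    return math.sqrt((d_l / s_l) ** 2 + (d_c / s_c) ** 2 + (d_h / s_h) ** 2
                     + r_t * (d_c / s_c) * (d_h / s_h))


def min_delta_e(pairs):
    """Smallest Delta E over (foreground_lab, background_lab) pairs."""
    return min(ciede2000(fg, bg) for fg, bg in pairs)
```

A small value means the foreground is perceptually close to the background, which presumably makes the object harder to localize.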

Color Test Results

Here is a sample of the results you will find in your color test results. This picture shows the failed color tests. The big "triangle" shows the CIE 1976 color space, with a gray mask laid on top of it. The brighter a color appears, the better the model recognizes it.
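
To relate a tested color to a position in the plot, the sketch below maps an 8-bit sRGB color to CIE 1976 u'v' chromaticity coordinates. It assumes that the "triangle" is the sRGB gamut drawn on the u'v' chromaticity diagram, which this README does not confirm. The function name is hypothetical, and the conversion uses the standard sRGB transfer function and D65 matrix rather than anything taken from MBVAT.

```python
def srgb_to_uv(rgb):
    """Map an 8-bit sRGB triple to CIE 1976 (u', v'); returns None for black."""

    def linearize(channel):
        # Undo the sRGB transfer function to get linear light in [0, 1].
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)

    # Linear sRGB to CIE XYZ (D65 white point).
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    denom = x + 15.0 * y + 3.0 * z
    if denom == 0.0:
        return None  # chromaticity is undefined for pure black
    return 4.0 * x / denom, 9.0 * y / denom
```

Under this assumption, the three sRGB primaries (255, 0, 0), (0, 255, 0), and (0, 0, 255) give the corners of the triangle.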

Citation

If you find this project useful, please kindly cite it in your publications.

@misc{mbvat,
  author = {Yaxi Lu},
  title = {MBVAT: Most Basic Vision Agent Testing},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  url = {https://github.com/luyaxi/MostBasicVAgentTesting}
}
