
TRIDENT: Efficient Triple-Task Learning of Dehazing, Depth, and Uncertainty Estimation for Underwater 3-D Robot Visual Perception

IEEE Sensors Journal 2024

This repository represents the official implementation of the paper titled "TRIDENT: Efficient Triple-Task Learning of Dehazing, Depth, and Uncertainty Estimation for Underwater 3-D Robot Visual Perception".


Geonmo Yang, Younggun Cho


In this paper, we introduce a novel learning-based sensing system that tackles multidimensional vision tasks in underwater environments; concretely, we address image enhancement, depth estimation, and uncertainty estimation for 3-D visual systems. We also propose TRIDENT, a fast and lightweight model consisting of three parallelized decoders and one backbone structure for efficient feature sharing; it is designed to be trained to express complex parameterization. In experimental evaluation on several standard datasets, we demonstrate that TRIDENT significantly outperforms existing methods on image enhancement and depth estimation. Despite performing three tasks, our model is more efficient than the others in both memory size and inference time. Finally, our joint learning approach demonstrates robustness in feature matching and extends seamlessly from 2-D to 3-D vision tasks.

πŸ› οΈ Prerequisites

  1. Run the demo locally (requires a GPU and nvidia-docker2; see the Installation Guide).

  2. Optionally, we provide instructions for using Docker in multiple ways (docker compose is recommended; see the Installation Guide).

  3. The code requires python>=3.8, as well as pytorch>=1.7 and torchvision>=0.8. We do not provide instructions for installing PyTorch and TorchVision; please use nvidia-docker2 😁. Installing both PyTorch and TorchVision with CUDA support is strongly recommended. (A quick environment check is sketched after this list.)

  4. This code was tested on:

  • Ubuntu 22.04 LTS, Python 3.10.12, CUDA 11.7, GeForce RTX 3090 (pip)
  • Ubuntu 22.04 LTS, Python 3.8.6, CUDA 12.0, RTX A6000 (pip)
  • Ubuntu 20.04 LTS, Python 3.10.12, CUDA 12.1, GeForce RTX 3080ti (pip)
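
To sanity-check your environment before proceeding, a quick version probe such as the one below can help. This is a minimal sketch; it only assumes that PyTorch and TorchVision are importable (e.g., inside the container).

    # print the Python, PyTorch, and TorchVision versions, plus CUDA availability
    python3 --version
    python3 -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"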

πŸš€ Table of Contents

πŸ› οΈ Setup

  1. πŸ“¦ Prepare Repository & Checkpoints
  2. ⬇ Prepare Dataset
  3. πŸ‹ Prepare Docker Image and Run the Docker Container

πŸš€ Training or Testing for TRIDENT

  1. πŸš€ Training for TRIDENT on Joint-ID Dataset
  2. πŸš€ Testing for TRIDENT on Joint-ID Dataset
  3. πŸš€ Testing for TRIDENT on Standard or Custom Dataset

✏️ ETC

  1. βš™οΈ Inference settings

  2. πŸŽ“ Citation

  3. βœ‰οΈ Contact


πŸ› οΈ Setup

πŸ“¦ Prepare Repository & Checkpoints

  1. Clone the repository (requires git):

    git clone https://github.com/sparolab/TRIDENT.git
    cd TRIDENT
  2. Let's call the path where TRIDENT's repository is located ${TRIDENT_root}.

  3. You don't need to download any checkpoint files; they are already included in ${TRIDENT_root}/TRIDENT/ckpt. Because the model is so lightweight, the checkpoints are small enough to live in the git repository.
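
To confirm the checkpoints are in place, you can simply list the directory (the path comes from the steps above; the exact filenames may differ):

    # the bundled checkpoint files should be listed here
    ls ${TRIDENT_root}/TRIDENT/ckpt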


(back to table)

⬇ Prepare Dataset


  1. TRIDENT uses the dataset from the Joint-ID paper. Please download Joint_ID_Dataset.zip.

  2. Next, unzip Joint_ID_Dataset.zip, taking the path you downloaded it to as ${dataset_root_path}.

    sudo unzip ${dataset_root_path}/Joint_ID_Dataset.zip   # ${dataset_root_path} requires at least 2.3 GB of free space.
    # ${dataset_root_path} must be an absolute path, not a relative one.
  3. After unzipping, you should see the following file structure in the Joint_ID_Dataset folder:

    πŸ“¦ Joint_ID_Dataset
    ┣ πŸ“‚ train
    ┃ ┣ πŸ“‚ LR                  # GT for training dataset
    ┃ ┃ ┣ πŸ“‚ 01_Warehouse  
    ┃ ┃ ┃ ┣ πŸ“‚ color           # enhanced image
    ┃ ┃ ┃ ┃ ┣ πŸ“œ in_00_160126_155728_c.png
    ┃ ┃ ┃ ┃       ...
    ┃ ┃ ┃ ┃
    ┃ ┃ ┃ β”— πŸ“‚ depth_filled    # depth image
    ┃ ┃ ┃   ┣ πŸ“œ in_00_160126_155728_depth_filled.png
    ┃ ┃ ┃         ...
    ┃ ┃ ...
    ┃ β”— πŸ“‚ synthetic           # synthetic distorted dataset
    ┃   ┣ πŸ“œ LR@[email protected]
    ┃   ┣      ...
    ┃  
    β”— πŸ“‚ test                  # the 'test' folder has the same structure as the 'train' folder
          ...
    

  4. For additional dataset details, see the project page.
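
To verify that the unzipped layout matches the tree above, a couple of quick listings suffice (a sketch; the paths follow from the steps in this list):

    # the top level should contain 'train' and 'test'
    ls ${dataset_root_path}/Joint_ID_Dataset
    # 'train' should contain 'LR' and 'synthetic'
    ls ${dataset_root_path}/Joint_ID_Dataset/train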


(back to table)

πŸ‹ Prepare Docker Image and Run the Docker Container

To run a docker container, we first need a docker image. There are two ways to create the image and run the container.

  1. Use docker pull and run the container manually:

    # download the docker image
    docker pull ygm7422/official_trident:latest    
    
    # run the docker container
    nvidia-docker run \
    --privileged \
    --rm \
    --gpus all -it \
    --name trident \
    --ipc=host \
    --shm-size=256M \
    --net host \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    -e DISPLAY=unix$DISPLAY \
    -v /root/.Xauthority:/root/.Xauthority \
    --env="QT_X11_NO_MITSHM=1" \
    -v ${dataset_root_path}/Joint_ID_Dataset:/root/workspace/dataset_root \
    -v ${TRIDENT_root}/TRIDENT:/root/workspace \
    ygm7422/official_trident:latest 
  2. Use docker compose (this builds the docker image and runs the container simultaneously):

    cd ${TRIDENT_root}/TRIDENT
    
    # build docker image and run container simultaneously
    bash run_docker.sh up gpu ${dataset_root_path}/Joint_ID_Dataset
    
    # Inside the container
    docker exec -it TRIDENT bash

Regardless of whether you use method 1 or 2, you should now have the TRIDENT docker container running (note that method 1 names the container trident, while method 2 names it TRIDENT).
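
You can confirm the container is up with a standard docker command:

    # the container ('trident' for method 1, 'TRIDENT' for method 2) should be listed
    docker ps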


(back to table)

πŸš€ Training or Testing for TRIDENT

πŸš€ Two-Task Training for TRIDENT on Joint-ID Dataset

  1. First, move to the /root/workspace folder inside the docker container. Then, run the following command to start the training.
    # move to workspace
    cd /root/workspace
    
    # start two task (image enhancement & depth estimation) training on Joint-ID Dataset
    python run.py local_configs/arg_joint_train_trident.txt
  2. The model's checkpoints and log files are saved in the /root/workspace/save folder.
  3. If you want to change the default variable settings for training, see the Inference settings section below.
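
During or after training, you can check that checkpoints and logs are actually being written (the save path comes from step 2 above):

    # inside the container: each training run writes its checkpoints and logs here
    ls /root/workspace/save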

πŸš€ Three-Task Training for TRIDENT on Joint-ID Dataset

  1. If you want to use the uncertainty module, run the following command to start the training (you should already have completed the two-task training above):
    # move to workspace
    cd /root/workspace
    
    # start uncertainty module training on Joint-ID Dataset
    python run.py local_configs/arg_triple_train_trident.txt

(back to table)

πŸš€ Testing for TRIDENT on Joint-ID Dataset

  1. First, move to the /root/workspace folder inside the docker container. Then, run the following command to start the testing.

    # move to workspace
    cd /root/workspace
    
    # start to test on Joint-ID Dataset
    python run.py local_configs/arg_joint_test_trident.txt
    
    or
    
    # start to test on Joint-ID Dataset
    python run.py local_configs/arg_triple_test_trident.txt
  2. The test images and results are saved in the result_joint.TRIDENT_two_task or result_joint.TRIDENT_three_task folder (a quick way to list them is sketched after this list).

  3. If you want to change the default variable settings for testing, see the Inference settings section below.
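
A quick way to list the outputs (a sketch; it assumes the result folders are created under the workspace root, as step 2 suggests):

    # inside the container, from /root/workspace
    ls result_joint.TRIDENT_two_task
    ls result_joint.TRIDENT_three_task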


(back to table)

πŸš€ Testing for TRIDENT on Standard or Custom Dataset

  1. Set the dataset-related variables in the local_configs/cfg/TRIDENT_two_task.py file. In the snippet below, set the sample_test_data_path variable to your input image path.

    ...
    
    # If you want to adjust the image size, adjust the `image_size` below.
    image_size = dict(input_height=288,
                      input_width=512)
    ...
    
    # Dataset
    dataset = dict(
               train_data_path='dataset_root/train/synthetic',
               ...
               # sample_test_data_path='${your standard or custom dataset path}',
               sample_test_data_path='demo',
               video_txt_file=''
               )
    ...
  2. First, move to the /root/workspace folder inside the docker container. Then, run the following command to start the testing.

    # move to workspace
    cd /root/workspace
    
    # start to test on standard datasets
    python run.py local_configs/arg_joint_samples_test.txt
    
    or
    
    # start to test on standard datasets
    python run.py local_configs/arg_triple_samples_test.txt
  3. The test images and results are saved in the sample_eval_result_joint.TRIDENT_two_task or sample_eval_result_joint.TRIDENT_three_task folder.


(back to table)

βš™οΈ Inference settings

We set the hyperparameters in local_configs/cfg/joint.diml.joint_id.py:

depth_range: the range of depth we want to estimate.

image_size: the size of the input image data. If you set this variable, make sure to set auto_crop to False in train_dataloader_cfg, eval_dataloader_cfg, test_dataloader_cfg, or sample_test_cfg below. If you do not want to set image_size, set auto_crop to True; with auto_crop enabled, images are fed to the model at their original size.

train_parm: hyperparameters used when training.

test_parm: hyperparameters used when testing.
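
To locate these settings quickly, a plain grep over the config files works (a sketch; the directory follows from the config paths mentioned in this README):

    # find where auto_crop and image_size are set
    grep -rnE "auto_crop|image_size" local_configs/cfg/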


(back to table)

πŸŽ“ Citation

Please cite our paper:

@article{yang2024trident,
  title={TRIDENT: Efficient Triple-Task Learning of Dehazing, Depth and Uncertainty Estimation for Underwater 3D Robot Visual Perception},
  author={Yang, Geonmo and Cho, Younggun},
  journal={IEEE Sensors Journal},
  year={2024},
  publisher={IEEE}
}

(back to table)

βœ‰οΈ Contact

Geonmo Yang: [email protected]

Project Link: https://sites.google.com/view/underwater-trident/home


(back to table)

🎫 License

For academic usage, the code is released under the GPL License, Version 3.0 (as defined in the LICENSE). For any commercial purpose, please contact the authors.

