Download the preprocessed dataset
Check the data repository on Hugging Face: https://huggingface.co/datasets/robin-courant/et-data
First, install git-lfs by following the official installation instructions.
To get the data, run:
cd /PATH/TO/THE/DATASET
git clone https://huggingface.co/datasets/robin-courant/et-data
Prepare the dataset (untar archives):
cd et-data
sh untar_and_move.sh
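If the clone finished without git-lfs installed, the large files will be small pointer stubs instead of real archives. The helper below is a hypothetical check (not shipped with the repo): Git LFS pointer files always start with the header shown, so any file matching it still needs a git lfs pull.

# check_lfs.py -- hypothetical helper, not part of the repo.
from pathlib import Path

POINTER_HEADER = b"version https://git-lfs.github.com/spec/v1"

def find_pointer_files(root: str) -> list[Path]:
    """Return files that are still LFS pointer stubs instead of real data."""
    stubs = []
    for path in Path(root).rglob("*"):
        if ".git" in path.parts or not path.is_file():
            continue
        # Pointer files are tiny text files; skip anything larger.
        if path.stat().st_size < 200:
            with open(path, "rb") as f:
                if f.read(len(POINTER_HEADER)) == POINTER_HEADER:
                    stubs.append(path)
    return stubs

if __name__ == "__main__":
    stubs = find_pointer_files("et-data")
    if stubs:
        print(f"{len(stubs)} file(s) not fetched; run `git lfs pull` in et-data:")
        for path in stubs:
            print(" -", path)
    else:
        print("All files fetched.")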
Create environment
Create conda environment:
conda create --name et python=3.10 -y
conda activate et
Install dependencies and SLAHMR (torch==1.13.1 and CUDA==11.7):
sh ./setup.sh
Install pytorch3d (installation can be tricky; follow the official guidelines if you encounter any issues):
conda install pytorch3d -c pytorch3d
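A quick sanity check (hypothetical, not part of the repo) to confirm the pinned versions from inside the et environment:

# verify_env.py -- hypothetical sanity check for the versions pinned above.
import torch
import pytorch3d

print("torch:", torch.__version__)           # expected: 1.13.1
print("CUDA runtime:", torch.version.cuda)   # expected: 11.7
print("CUDA available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)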
Set up the dataset
The E.T. dataset is built upon the CondensedMovies dataset. Follow the instructions in the CondensedMovies repository to download it.
Finally, add a symlink in this repository to the CondensedMovies repository:
ln -s PATH/TO/CondensedMovies ./data
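A minimal check (hypothetical, assuming it is run from the repository root) that the symlink resolves where you expect:

# Hypothetical one-off check, run from the repository root.
from pathlib import Path

data = Path("data")
assert data.is_symlink(), "./data is not a symlink"
print("./data ->", data.resolve())  # should point to your CondensedMovies copy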
E.T. frames extraction
Here are the instructions to extract frames from the E.T. dataset.
First, you need a copy of both the CondensedMovies and E.T. datasets.
Then, run the following script:
python scripts/misc/extract_frames.py /PATH/TO/CondensedMovies /PATH/TO/et-data
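For reference, frame extraction amounts to decoding each clip and writing its frames to disk. The sketch below is illustrative only (its function name, paths, and output layout are assumptions); scripts/misc/extract_frames.py remains the reference implementation.

# Illustrative sketch of per-clip frame extraction with OpenCV; the actual
# script may sample, resize, or name frames differently.
import cv2
from pathlib import Path

def extract_frames(video_path: str, out_dir: str, stride: int = 1) -> int:
    """Write every `stride`-th frame of a clip to out_dir as JPEGs."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    capture = cv2.VideoCapture(video_path)
    index = written = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of stream
            break
        if index % stride == 0:
            cv2.imwrite(str(out / f"{index:06d}.jpg"), frame)
            written += 1
        index += 1
    capture.release()
    return written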
E.T. dataset generation
Here are the instructions for running the data extraction scripts to reproduce the E.T. dataset.
- Perform SLAHMR extraction:
python scripts/slahmr_extraction.py data.metadata_filename='data/metadata/clips.csv'
- Gather all extracted chunks into one file:
python scripts/extraction/slahmr_processing.py /PATH/TO/slahmr_out --dirname smooth_fit
- Align extracted chunks:
python scripts/processing/trajectory_alignment.py /PATH/TO/slahmr_out /PATH/TO/et-data
- Clean and smooth extracted trajectories:
python scripts/processing/trajectory_cleaning.py /PATH/TO/et-data
- Create the dataset and split it into samples:
python scripts/processing/dataset_processing.py /PATH/TO/et-data -c -v /PATH/TO/CondensedMovies
- Caption all samples (tagging + caption generation; check scripts/processing/configs/captioning for config details):
python scripts/processing/dataset_captioning.py
- Shift all samples according to the origin of the character:
python scripts/misc/dataset_processing.py /PATH/TO/et-data
- Extract caption CLIP features:
python scripts/misc/clip_extraction.py /PATH/TO/et-data -sq -t -cv ViT-B/32
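For context, the CLIP feature step encodes each caption with the ViT-B/32 text encoder named above. The sketch below shows the general idea using the openai CLIP package (the caption string is dummy data); the repo's clip_extraction.py and its flags remain the reference.

# Illustrative CLIP text-feature extraction with the ViT-B/32 backbone.
import clip
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

captions = ["The camera trucks right while the character walks forward."]  # dummy
tokens = clip.tokenize(captions).to(device)

with torch.no_grad():
    features = model.encode_text(tokens)                        # (N, 512)
    features = features / features.norm(dim=-1, keepdim=True)   # unit-norm

print(features.shape)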
Visualization
There are two ways of visualizing samples: with Blender or with Rerun.
Note: you will need meshes, which are not yet released with the dataset.
Blender visualization
First, install blender:
- Follow the official instructions.
- Locate the python installation used by conda (this gives /PATH/TO/CONDA/ENV/) with the following line:
conda env list | grep '*'
- Locate the python installation used by blender (this gives /PATH/TO/BLENDER/python) with the following line:
blender --background --python-expr "import sys; import os; print('\nThe path to the installation of python of blender can be:'); print('\n'.join(['- '+x.replace('/lib/python', '/bin/python') for x in sys.path if 'python' in (file:=os.path.split(x)[-1]) and not file.endswith('.zip')]))"
- Link conda env to blender python with the following line:
ln -s /PATH/TO/CONDA/ENV/ /PATH/TO/BLENDER/python
To launch Blender through the command line, run:
blender PATH/TO/BLENDER_FILENAME
Then, in Blender, go to the Scripting tab and open visualization/blender_viz.py.
Next, go to the Modifiers tab (wrench tool icon), enter your desired parameters, and generate your scene.
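For orientation, blender_viz.py drives Blender's Python API (bpy). The snippet below is a minimal illustration of the underlying pattern, keyframing a trajectory onto a camera (the trajectory values are dummy data), not the repo's visualizer:

# Minimal bpy sketch -- run inside Blender's Scripting tab, not standalone.
import bpy

trajectory = [(0.0, 0.0, 1.7), (0.1, 0.0, 1.7), (0.2, 0.05, 1.7)]  # dummy data

bpy.ops.object.camera_add()          # add a camera to the scene
camera = bpy.context.object
for frame, position in enumerate(trajectory, start=1):
    camera.location = position       # move the camera...
    camera.keyframe_insert(data_path="location", frame=frame)  # ...and keyframe it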
Rerun visualization
To launch the Rerun visualization script, run:
python visualization/rerun_viz.py /PATH/TO/et-data
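For context, here is a minimal illustration of the Rerun logging pattern such a script relies on (dummy trajectory, hypothetical entity path; rerun_viz.py is the actual script):

# Minimal Rerun sketch: log a dummy 3D trajectory, one point per frame.
import numpy as np
import rerun as rr

rr.init("et_trajectory", spawn=True)  # spawn the Rerun viewer

positions = np.stack([np.linspace(0.0, 2.0, 50),    # x: translate forward
                      np.zeros(50),                 # y: fixed
                      np.full(50, 1.7)], axis=1)    # z: eye height

for t, position in enumerate(positions):
    rr.set_time_sequence("frame", t)
    rr.log("camera/position", rr.Points3D(position[None]))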