Voxelizing semantically segmented images is a great way to add a 3D component to 2D footage. It makes the object of interest stand out without unnatural edge contours or odd coloration, and it was fun to implement.
The results for the single-object semantic segmentation were taken from the 2016 DAVIS challenge: https://davischallenge.org/davis2016/code.html. Out of the 16 available techniques (semi-supervised and unsupervised), I used the BVS (Bilateral Space Video Segmentation) method. Although the error comparison suggests that other methods yield better segmentation, I found this one to be the most consistent across frames: some techniques start off stronger but degrade after several frames. More about BVS can be found here: https://graphics.ethz.ch/~perazzif/bvs/index.html.
This project uses semantic_segmentation.py to retrieve the masks for the DAVIS dataset and re-color the object of interest in magenta, a color that does not appear anywhere in the video collection. The recolored images are then processed in Houdini 16.5.350 via the provided .hip file, where they are voxelized. Houdini is procedural content generation software that, among other things, allows for complex mesh manipulation. Once the images have been voxelized, create_video_from_sequences_of_images.py compiles them into short videos. I've included short clips comparing the original unprocessed footage with its voxelized equivalent, and framed the voxelized versions to enhance the contrast between the original and processed clips. Sketches of the two Python steps are shown below.
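For reference, here is a minimal sketch of what the magenta re-coloring step could look like. The directory layout, file names, and OpenCV-based approach are assumptions for illustration, not necessarily what semantic_segmentation.py actually does.

```python
# Sketch of the mask re-coloring step (assumed paths and layout; the real
# semantic_segmentation.py may read and write different locations).
import glob
import os

import cv2

FRAME_DIR = "DAVIS/JPEGImages/480p/blackswan"   # assumed DAVIS frame folder
MASK_DIR = "results/bvs/blackswan"              # assumed BVS mask folder
OUT_DIR = "recolored/blackswan"

MAGENTA_BGR = (255, 0, 255)  # a color that does not occur in the footage

os.makedirs(OUT_DIR, exist_ok=True)
for frame_path in sorted(glob.glob(os.path.join(FRAME_DIR, "*.jpg"))):
    name = os.path.splitext(os.path.basename(frame_path))[0]
    frame = cv2.imread(frame_path)
    mask = cv2.imread(os.path.join(MASK_DIR, name + ".png"), cv2.IMREAD_GRAYSCALE)

    # Paint the segmented object magenta so Houdini can isolate it by color.
    frame[mask > 127] = MAGENTA_BGR
    cv2.imwrite(os.path.join(OUT_DIR, name + ".png"), frame)
```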
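And a rough sketch of compiling an image sequence into a clip; the codec, frame rate, and paths below are placeholders, and the real create_video_from_sequences_of_images.py may differ.

```python
# Sketch of the image-sequence-to-video step using OpenCV's VideoWriter
# (codec, fps, and paths are illustrative assumptions).
import glob

import cv2


def images_to_video(image_dir, out_path, fps=24):
    frames = sorted(glob.glob(f"{image_dir}/*.png"))
    height, width = cv2.imread(frames[0]).shape[:2]

    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
    for path in frames:
        writer.write(cv2.imread(path))
    writer.release()


images_to_video("voxelized/blackswan", "voxelized_blackswan.mp4")
```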