vancyland/vancyland.github.io

Image

The following video was generated by AnimateDiff v3 using the image above as the reference. The temporal consistency of the generated video is poor: for example, inside the red box in the image, the boat's sail breaks into several segments.

To address this, we apply a popular method for improving video generation consistency: the cross-frame attention mechanism.

In particular, we add one of three sources of cross-frame information to the self-attention (SA) of all upblocks at all timesteps: SparseCausal Attention, the first frame, or the averaged frame. This produces the following three videos.

Consistency does improve, but the quality of the generated videos drops, and motion information may be lost.
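The three injection variants described above can be sketched in NumPy. This is a minimal illustration, not the code used to produce the videos: the function name `cross_frame_attention`, the `mode` strings, and the choice to replace each frame's keys and values with the reference (rather than blend them) are all assumptions. The `sparse_causal` branch follows the common formulation in which each frame attends to the first frame and its previous frame.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_frame_attention(q, k, v, mode="first"):
    """Attention over frames; q, k, v have shape (frames, tokens, dim).

    mode (hypothetical names) selects which frames supply keys and values:
      "none"          - each frame attends to itself (plain SA)
      "first"         - every frame attends to the first frame
      "mean"          - every frame attends to the frame-averaged keys/values
      "sparse_causal" - every frame attends to the first and previous frame
    """
    f, n, d = k.shape
    if mode == "none":
        k_ref, v_ref = k, v
    elif mode == "first":
        k_ref = np.broadcast_to(k[:1], k.shape)
        v_ref = np.broadcast_to(v[:1], v.shape)
    elif mode == "mean":
        k_ref = np.broadcast_to(k.mean(axis=0, keepdims=True), k.shape)
        v_ref = np.broadcast_to(v.mean(axis=0, keepdims=True), v.shape)
    elif mode == "sparse_causal":
        prev = np.maximum(np.arange(f) - 1, 0)  # frame 0 uses itself as "previous"
        k_ref = np.concatenate([np.broadcast_to(k[:1], k.shape), k[prev]], axis=1)
        v_ref = np.concatenate([np.broadcast_to(v[:1], v.shape), v[prev]], axis=1)
    else:
        raise ValueError(f"unknown mode: {mode}")
    scores = q @ k_ref.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores, axis=-1) @ v_ref
```

Because every frame reads its keys and values from the same reference in the `first` and `mean` modes, appearance is shared across frames, which is consistent with the improved consistency and the reduced motion noted above.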

To analyze this further, we conduct an ablation study over timesteps and upblocks.

Ablation of timesteps [1-25]: we inject the first-frame and averaged-frame information into the SA only at timesteps greater than a threshold, and test thresholds of 0, 4, and 5. The results for each threshold are shown below.

0

4

5
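The timestep gating used in this ablation can be sketched as a simple schedule. This is an illustrative assumption rather than the original code: the helper name `injection_schedule` is hypothetical, and steps are assumed to be indexed 1 to 25 in sampling order, which the text does not state explicitly.

```python
def injection_schedule(num_steps=25, threshold=0):
    """Map each sampling step (1..num_steps) to whether injection is active.

    Injection is enabled only for steps strictly greater than `threshold`.
    """
    return {t: t > threshold for t in range(1, num_steps + 1)}

# The three thresholds tested in the ablation.
for threshold in (0, 4, 5):
    active = sum(injection_schedule(threshold=threshold).values())
    print(f"threshold {threshold}: injection active on {active} of 25 steps")
```

With a threshold of 0 the injection covers every step, so that setting matches the all-timesteps configuration used earlier.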

Ablation of upblocks 1, 2, and 3: we inject the first-frame and averaged-frame information into the SA of upblock 1, 2, or 3 individually, and also into all three at once (upblock.123). The results are as follows.

upblock.123

upblock.1

upblock.2

upblock.3

None
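One way to restrict injection to particular upblocks is to filter self-attention layers by module name. The sketch below assumes diffusers-style names, in which `attn1` denotes the self-attention layer of a transformer block under `up_blocks.<index>.attentions`. The helper `select_self_attention` and the sample names are hypothetical, and the actual module names depend on the model implementation.

```python
import re

# Assumed naming pattern for spatial self-attention layers in the upblocks.
_SA_PATTERN = re.compile(
    r"^up_blocks\.(\d+)\.attentions\.\d+\.transformer_blocks\.\d+\.attn1$"
)

def select_self_attention(names, upblocks):
    """Return the self-attention module names that belong to the given upblocks."""
    selected = []
    for name in names:
        match = _SA_PATTERN.match(name)
        if match and int(match.group(1)) in upblocks:
            selected.append(name)
    return selected

# Hypothetical module names for illustration.
example_names = [
    "down_blocks.1.attentions.0.transformer_blocks.0.attn1",
    "up_blocks.1.attentions.0.transformer_blocks.0.attn1",
    "up_blocks.1.attentions.0.transformer_blocks.0.attn2",
    "up_blocks.2.attentions.1.transformer_blocks.0.attn1",
    "up_blocks.3.attentions.2.transformer_blocks.0.attn1",
]

print(select_self_attention(example_names, {2}))
print(select_self_attention(example_names, {1, 2, 3}))
```

Passing `{1, 2, 3}` corresponds to the upblock.123 setting, and a single-element set corresponds to the per-upblock runs.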
