
Inference with Skiparse Attn #539

Open
chen-yy20 opened this issue Nov 21, 2024 · 1 comment

chen-yy20 commented Nov 21, 2024

Hi OpenSora-Plan Team,

Thank you for your excellent work!

I noticed that a new attention pattern was used when training the v1.3 model, as described below:

Considering both computational load and AVG Attention Distance, we select Skiparse with k=4, replacing the first and last two blocks with Full 3D Attention to enhance performance.
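
If I understand correctly, that corresponds to a per-block schedule roughly like this (a sketch only; the block count of 32 is my assumption, the real number is whatever the v1.3 config uses):

```python
# Hypothetical per-block attention schedule: Full 3D Attention on the first
# two and last two blocks, Skiparse Attention (k = 4) everywhere else.
# num_blocks = 32 is an illustrative assumption, not the actual v1.3 value.
num_blocks = 32
schedule = [
    "full_3d" if i < 2 or i >= num_blocks - 2 else "skiparse_k4"
    for i in range(num_blocks)
]
```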

I have two questions:

  • Do I need to use the same Skiparse Attention arguments when doing inference with the provided model weights?
  • Can I adjust these parameters to match my computational resources?

Thanks!

yunyangge (Collaborator) commented:

  1. Yes, you need to use the same Skiparse Attention arguments at inference as were used during training. Although Skiparse Attention only modifies the model's forward pass and adds no parameters, the weights are tied to the sparse ratio they were trained with: weights trained at one sparse ratio cannot perform zero-shot inference effectively at another.

  2. Following from answer 1, zero-shot inference with a different sparse ratio is not feasible, but you can fine-tune the model with a custom sparse ratio. In our experience, such fine-tuning is straightforward and requires only a few tens of thousands of clips.
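
To make point 1 concrete, here is a rough sketch of the kind of stride-k bucketing Skiparse Attention performs (illustrative only, not our exact implementation; it assumes einops and a sequence length divisible by k):

```python
import torch
from einops import rearrange

def single_skip_buckets(x: torch.Tensor, k: int) -> torch.Tensor:
    # x: (batch, seq_len, dim); bucket j collects tokens j, j+k, j+2k, ...,
    # so each of the k buckets is a stride-k subsequence that spans the
    # whole sequence at 1/k of the attention cost.
    b, s, d = x.shape
    assert s % k == 0, "seq_len must be divisible by the sparse ratio k"
    return rearrange(x, "b (n k) d -> (b k) n d", k=k)

def merge_buckets(x: torch.Tensor, k: int) -> torch.Tensor:
    # Inverse rearrangement: restore the original token order.
    return rearrange(x, "(b k) n d -> b (n k) d", k=k)

x = torch.randn(2, 16, 8)            # 16 tokens, sparse ratio k = 4
buckets = single_skip_buckets(x, 4)  # (8, 4, 8): 2 * 4 buckets of 4 tokens
# ... attention would run independently within each bucket here ...
assert torch.equal(merge_buckets(buckets, 4), x)
```

Because the token mixture each attention layer sees depends on k, weights trained at one sparse ratio do not transfer zero-shot to another.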
