+
-
Method
-
+
Method
+
We present Show-o, a novel unified model
that handles both multimodal understanding and generation tasks by mixing auto-regressive and diffusion modeling.
@@ -230,27 +226,27 @@
Method
-
+
-
Text-to-Image Results
-
![](./assets/images/github_t2i.png)
+
Text-to-Image Results
+
![](./assets/images/github_t2i.png)
-
Multimodal Understanding Results
-
![](./assets/images/github_mmu.png)
+
Multimodal Understanding Results
+
![](./assets/images/github_mmu.png)
-
Inpainting Results
-
![](./assets/images/github_inpainting.png)
-
Extrapolation Results
+
Inpainting Results
+
![](./assets/images/github_inpainting.png)
+
Extrapolation Results
-
![](./assets/images/github_extrapolation.png)
+
-
+
-
Experiments
+
Experiments
![](./assets/images/geneval_result.png)
@@ -260,9 +256,9 @@
Experiments
-
+
-
Comparison
+
Comparison