First, which exact model are you running? DeTikZify is primarily trained on scientific figures, so logos like your input are likely very challenging for the model. But there are some tricks you could try to still get outputs closer to what you want.
This is the output I get with DeTikZifyv2 after 5 minutes of MCTS. It is far from perfect, but at least it aligns the text and positions the logo correctly:
What often helps is reducing the complexity of the input. So here I removed the text and tried to only generate the logo part. The model does much better now after only a few tries:
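To make the "remove the text" step concrete, here is a minimal sketch of how one might compute a crop box that keeps only the emblem part of a logo before passing the image to the model. The helper `crop_box` and the fractional bounds are hypothetical names and values chosen for illustration; they are not part of DeTikZify. The resulting tuple uses the `(left, upper, right, lower)` convention that Pillow's `Image.crop` expects, assuming Pillow is what you use to edit the image.

```python
def crop_box(width, height, left=0.0, top=0.0, right=1.0, bottom=1.0):
    """Convert fractional bounds (0..1) into a pixel box (left, upper, right, lower).

    Hypothetical helper for illustration only; not part of DeTikZify.
    """
    if not (0.0 <= left < right <= 1.0 and 0.0 <= top < bottom <= 1.0):
        raise ValueError("bounds must satisfy 0 <= left < right <= 1 and 0 <= top < bottom <= 1")
    return (
        round(left * width),
        round(top * height),
        round(right * width),
        round(bottom * height),
    )


# Example: keep the left 40% of a 1000x400 image (assumed to contain the emblem).
box = crop_box(1000, 400, right=0.4)

# With Pillow installed, the box could then be applied like this (assumption):
#   from PIL import Image
#   Image.open("logo.png").crop(box).save("logo_only.png")
```

The fractions have to be picked by eye for each input, so this is only a starting point; the cropped file is what you would then give to the model instead of the full logo.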
Next, I would manually combine the first and second solutions and then clean up the result. We try to document some of these workflows and usage tips here.
Lastly, if you are using DeTikZifyv2, you might also want to consider trying one of the v1 models. In the v1 models, we keep the vision encoder frozen, which might help with generalization to out-of-distribution inputs like your logo; however, this is an untested hypothesis.
Any tips to get this to run? I haven't been able to generate anything successfully.
For example, I tried the https://jump.dev/ logo, which should be really simple. With default settings this is the result: