Tips? #8

langestefan · 2025-01-30T17:48:34Z

Any tips to get this to run? I haven't been able to generate anything successfully.

For example, I tried the https://jump.dev/ logo, which should be really simple. With default settings this is the result:

potamides · 2025-02-03T09:01:10Z

First, which exact model are you running? DeTikZify is primarily trained on scientific figures, so logos like your input are likely very challenging for the model. But there are some tricks you could try to still get outputs closer to what you want.

With DeTikZify_v2, the output I get after 5 minutes of Monte Carlo tree search (MCTS) is far from perfect, but at least it correctly aligns the text and the logo position.
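As a rough illustration of running a search under a fixed time budget and keeping the best result, here is a minimal sketch. It is not DeTikZify's API: `best_within` is a hypothetical helper, and it assumes only that the search can be consumed as an iterable of `(score, candidate)` pairs where a higher score is better.

```python
import time
from typing import Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def best_within(
    candidates: Iterable[Tuple[float, T]], budget_s: float
) -> Optional[Tuple[float, T]]:
    """Consume (score, candidate) pairs until the time budget is spent.

    Hypothetical helper: assumes higher scores are better. Returns the
    best pair seen so far, or None if the iterable produced nothing.
    """
    deadline = time.monotonic() + budget_s
    best: Optional[Tuple[float, T]] = None
    for score, candidate in candidates:
        # Keep the highest-scoring candidate seen so far.
        if best is None or score > best[0]:
            best = (score, candidate)
        # Stop pulling new candidates once the budget is used up.
        if time.monotonic() >= deadline:
            break
    return best
```

With a 5-minute budget this would be called as `best_within(search_results, 300)`, where `search_results` stands for whatever iterable your setup produces. Since the budget is only checked between candidates, one slow generation step can overrun it.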

What often helps is reducing the complexity of the input. Here I removed the text and tried to generate only the logo part, and the model does much better after only a few tries.
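One way to simplify the input is to crop the image down to the region you care about before passing it to the model. The sketch below is an illustration only: `crop_box` and `crop_image` are hypothetical helper names, the fractional bounds are something you would pick by eye, and the file step assumes Pillow is installed.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]


def crop_box(width: int, height: int, keep: Tuple[float, float, float, float]) -> Box:
    """Convert fractional (left, top, right, bottom) bounds into a pixel box."""
    left, top, right, bottom = keep
    if not (0.0 <= left < right <= 1.0 and 0.0 <= top < bottom <= 1.0):
        raise ValueError("keep must satisfy 0 <= left < right <= 1 and 0 <= top < bottom <= 1")
    return (
        round(left * width),
        round(top * height),
        round(right * width),
        round(bottom * height),
    )


def crop_image(src: str, dst: str, keep: Tuple[float, float, float, float]) -> None:
    """Save the `keep` region of `src` to `dst` (assumes Pillow is available)."""
    from PIL import Image  # imported lazily so crop_box works without Pillow

    with Image.open(src) as img:
        img.crop(crop_box(img.width, img.height, keep)).save(dst)
```

For a logo with the graphic on the left and the text on the right, something like `crop_image("logo.png", "logo_only.png", (0.0, 0.0, 0.4, 1.0))` would keep the left 40% of the image. Both file names and the bounds are placeholders.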

Next, I would manually assemble the first and second solutions and continue with cleanup. We try to document some of these workflows and usage tips here.

Lastly, if you are using DeTikZify_v2, you might also want to consider trying one of the v1 models. In the v1 models, we keep the vision encoder frozen, which might help with generalization to out-of-distribution inputs like your logo; however, this is an untested hypothesis.
