creating dataset #38
Comments
Hi Kotresh, For the Bootstrap version you could write a script that takes screenshots of existing Bootstrap website templates and builds a DSL vocabulary based on them. It should be pretty straightforward, with the structure following the pix2code datasets and DSL. So, for example, a website that looks like this: https://imgur.com/a/IF3NxTV would have a .gui that looks something like below:
For the HTML version, quoting the issue from Emil: “As mentioned in the article, the HTML version does not generalize on new images. The Bootstrap version generalizes on new images but with a capped vocabulary. The evaluation images for the Bootstrap version are under /data/eval/. You can test it here: floydhub/Bootstrap/test_model_accuracy.ipynb. If you want to train it to generalize on a more advanced vocabulary, I'd recommend customizing it to work on the HTML set provided here: https://github.com/harvardnlp/im2markup (on floydhub: --data emilwallner/datasets/100k-html:data). After that, I'd recommend creating a new dataset. Create a script that generates random websites, say starting with newsletters or blog layouts. Then you can add optical character recognition, fonts, colors and div sizes as you go. If you build a version for the harvardnlp dataset or a script that generates websites, please make a pull request.”
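To make the "script that generates random websites" idea a bit more concrete, here is a minimal sketch that only produces random .gui-style token files. The vocabulary (`header`, `row`, `single`, `double`, `quadruple`, the `btn-*` tokens) and the `random_gui` helper are assumptions loosely modelled on pix2code-style DSLs, not the project's confirmed vocabulary. Rendering each layout to HTML and taking the matching screenshot is left out.

```python
import random

# Hypothetical vocabulary, loosely modelled on pix2code-style DSL tokens.
BUTTONS = ["btn-green", "btn-orange", "btn-red"]
# Layout token -> number of columns it places in one row.
ROW_LAYOUTS = {"single": 1, "double": 2, "quadruple": 4}


def random_gui(rng, max_rows=3):
    """Return the text of one random .gui-style layout description."""
    lines = ["header {", "btn-active, btn-inactive", "}"]
    for _ in range(rng.randint(1, max_rows)):
        layout = rng.choice(sorted(ROW_LAYOUTS))
        lines.append("row {")
        # Each column gets a title, a text block and one random button.
        for _ in range(ROW_LAYOUTS[layout]):
            lines.append(f"{layout} {{")
            lines.append(f"small-title, text, {rng.choice(BUTTONS)}")
            lines.append("}")
        lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A seeded generator makes the synthetic dataset reproducible.
    print(random_gui(random.Random(0)))
```

Passing in a seeded `random.Random` keeps runs repeatable, which helps when the same layouts later need to be re-rendered with different fonts or colors.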
Hi, I'm pretty late to this, but I was just wondering: what is a .gui file and how do you open it?
@yuvarajvc: @salmanahmad10: The .gui extension is a naming convention from the original paper (pix2code) and has no special meaning. The project uses the .gui file to map a token sequence to its image pair, which has the same name, i.e. image1.png (or .npz when compressed) should have a corresponding .gui file called image1.gui, which contains the textual tokens describing the image. PS: I'm pushing my dev toolkit here, which includes 100 samples, and will be happy to sell my whole dataset. Email me at [email protected]
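Since a .gui file is plain text, any text editor will open it. As a rough sketch of the same-name pairing described above, something like the following could collect the image/token pairs in a folder. The helper names (`read_tokens`, `pair_samples`) and the comma-splitting tokenizer are assumptions for illustration, not the project's actual loader.

```python
from pathlib import Path

# Image formats mentioned in the thread: raw .png or compressed .npz.
IMAGE_SUFFIXES = (".png", ".npz")


def read_tokens(gui_path):
    """Split a .gui file into tokens, treating commas as separate tokens."""
    text = Path(gui_path).read_text(encoding="utf-8")
    return text.replace(",", " , ").split()


def pair_samples(data_dir):
    """Match every name.gui with name.png or name.npz in the same folder.

    Returns (pairs, missing): pairs is a list of (image_path, tokens),
    missing lists the .gui files that have no image next to them.
    """
    pairs, missing = [], []
    for gui_path in sorted(Path(data_dir).glob("*.gui")):
        candidates = [gui_path.with_suffix(s) for s in IMAGE_SUFFIXES]
        image_path = next((p for p in candidates if p.exists()), None)
        if image_path is None:
            missing.append(gui_path)
        else:
            pairs.append((image_path, read_tokens(gui_path)))
    return pairs, missing
```

Reporting the unmatched .gui files separately makes it easy to spot screenshots that failed to render when building a new dataset.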
Hi, many thanks for sharing the data and code. How can we take it forward? How can we generate more data apart from the synthesised data? Can we create the same kind of dataset for real HTML pages? If so, how can we generate .gui files for them? If you have any resources or thoughts, please share them with us.