update fig
hyungkwonko committed Dec 14, 2023
1 parent 12060a6 commit f87ea43
Showing 2 changed files with 13 additions and 13 deletions.
26 changes: 13 additions & 13 deletions docs/index.html
@@ -131,19 +131,19 @@ <h2 class="subtitle has-text-centered">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>
- We introduce a Large Language Model (LLM) framework that generates rich and diverse NL
- datasets using only Vega-Lite specifications as input, thereby streamlining the development
- of Natural Language Interfaces (NLIs) for data visualization. We propose two techniques to
- synthesize relevant chart semantics accurately and enhance syntactic diversity in each NL
- dataset, respectively: 1) a guided discovery incorporated into prompting so that LLMs can
- steer themselves to create varying NL datasets in a self-directed manner; 2) a score-based
- paraphrasing to augment NL syntax along with four well-defined language axes. We also
- present a new chart collection of 1,981 real-world Vega-Lite specifications that have
- increased diversity and complexity compared to benchmarks, to demonstrate the
- generalizability of our framework. The experimental results show that our framework
- accurately extracts chart semantics and generates L1/L2 captions with 89.4% and 76.0%
- accuracy, respectively, while generating and paraphrasing utterances and questions with
- greater diversity than benchmarks.
+ We introduce VL2NL, a Large Language Model (LLM) framework that generates rich and diverse
+ NL datasets using only Vega-Lite specifications as input, thereby streamlining the
+ development of Natural Language Interfaces (NLIs) for data visualization. To synthesize
+ relevant chart semantics accurately and enhance syntactic diversity in each NL dataset, we
+ leverage 1) a guided discovery incorporated into prompting so that LLMs can steer themselves
+ to create faithful NL datasets in a self-directed manner; 2) a score-based paraphrasing to
+ augment NL syntax along with four language axes. We also present a new collection of 1,981
+ real-world Vega-Lite specifications that have increased diversity and complexity than
+ existing chart collections. When tested on our chart collection, VL2NL extracted chart
+ semantics and generated L1/L2 captions with 89.4% and 76.0% accuracy, respectively. It also
+ demonstrated generating and paraphrasing utterances and questions with greater diversity
+ compared to the benchmarks. Last, we discuss how our NL datasets and framework can be
+ utilized in real-world scenarios.
</p>
</div>
</div>
Binary file modified docs/static/img/fig3.png
