diff --git a/index.html b/index.html index fe83db1..06df43c 100644 --- a/index.html +++ b/index.html @@ -402,7 +402,7 @@
We use LeanDojo Benchmark to train and evaluate ReProver. During testing, the tactic generator is combined with best-first search to search for complete proofs. The table below shows the percentage of theorem proved within 10 minutes. The \(\texttt{novel_premises}\) spilt is much more challenging. On both data splits, ReProver outperforms Lean's built-in proof automation tactic (tidy), a baseline that generates tactics directly without retrieval, and another baseline using GPT-4 to generate tactics in a zero-shot manner.
+We use LeanDojo Benchmark to train and evaluate ReProver. During testing, the tactic generator is combined with best-first search to search for complete proofs. The table below shows the percentage of theorem proved within 10 minutes. The \(\texttt{novel_premises}\) spilt is much more challenging than the \(\texttt{random}\) split. On both data splits, ReProver outperforms Lean's built-in proof automation tactic (tidy), a baseline that generates tactics directly without retrieval, and another baseline using GPT-4 to generate tactics in a zero-shot manner.