From 3a7386e9698eaafd4f2efb910d4db7fba094b981 Mon Sep 17 00:00:00 2001
From: ZubinGou
Date: Thu, 22 Feb 2024 17:02:03 +0800
Subject: [PATCH] =?UTF-8?q?=F0=9F=8E=89?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/index.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/index.html b/docs/index.html
index dd3fae2..d559a12 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -57,7 +57,7 @@
-            src="..."> CriticBench: <br> Benchmarking LLMs for <br> Critique-Correct Reasoning
+            src="static/images/criticbench_logo.png" alt="CriticBench Logo"> CriticBench: <br> Benchmarking LLMs for <br> Critique-Correct Reasoning
@@ -166,7 +166,7 @@
           The ability of Large Language Models (LLMs) to critique and refine their reasoning is crucial for their application in evaluation, feedback provision, and self-improvement. This paper introduces CriticBench, a comprehensive benchmark designed to assess LLMs' abilities to critique and rectify their reasoning across a variety of tasks.
-          CriticBench encompasses five reasoning domains: mathematical, commonsense, symbolic, coding, and algorithmic.
+          CriticBench encompasses five reasoning domains: mathematical, commonsense, symbolic, coding, and algorithmic.
           It compiles 15 datasets and incorporates responses from three LLM families. Utilizing CriticBench, we evaluate and dissect the performance of 17 LLMs in generation, critique, and correction reasoning, i.e., GQC reasoning, and analyze the key factors affecting LLM critical reasoning.