Update results
capjamesg committed Feb 20, 2025
1 parent 0d4a37e commit 79beccc
Showing 2 changed files with 111 additions and 5 deletions.
10 changes: 5 additions & 5 deletions index.html
@@ -40,7 +40,7 @@ <h1>How's GPT O1 Doing?</h1>
<p>You can contribute your own tests, too! See the <a href="https://github.com/roboflow/gpt-checkup?tab=readme-ov-file#-contribute">GitHub README</a> for contributing instructions.</p>
</div>
<div class="header_subtitle">
-<p>Tests are run every day at 1am PT. Last updated February 19, 2025.</p>
+<p>Tests are run every day at 1am PT. Last updated February 20, 2025.</p>
<p>Made with ❤️ by the team at <a href="https://roboflow.com">Roboflow</a>.</p>
</div>
<div class="header_cta">
@@ -58,12 +58,12 @@ <h1>How's GPT O1 Doing?</h1>
<div class="feature_header" style="min-height: auto">
<div class="feature_header_text" style="gap: var(--spacing-sizing-4)">
<h2>Response Time</h2>
-<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.82 seconds</b> per request.</p>
+<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.84 seconds</b> per request.</p>
<p class="subtitle">This number only accounts for requests made by this application.</p>
</div>
<div class="chart">
<div class="chart_box chart_box_green">
-<p>3.82 s</p>
+<p>3.84 s</p>
</div>
</div>
</div>
@@ -176,7 +176,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/swift.png" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre>I was</pre>
+<pre>I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the Midnights album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -765,7 +765,7 @@ <h2>Zero Shot Classification</h2>
</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>100%</b> of the time.</p>
-<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.008</p>
+<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.01</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
106 changes: 106 additions & 0 deletions results/2025-02-20.json
@@ -0,0 +1,106 @@
{
"zero_shot_classification": {
"score": 1,
"success": true,
"price": 0.01012,
"pass_fail": "Pass",
"response_time": 7.093893051147461,
"result": "Toyota Camry"
},
"count_fruit": {
"score": 0,
"success": false,
"price": 0.01545,
"pass_fail": "Fail",
"response_time": 9.379559993743896,
"result": ""
},
"document_ocr": {
"score": 0,
"success": false,
"price": 0.013989999999999999,
"pass_fail": "Fail",
"response_time": 12.462197303771973,
"result": "I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the Midnights album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album"
},
"handwriting_ocr": {
"score": 0,
"success": false,
"price": 0.015529999999999999,
"pass_fail": "Fail",
"response_time": 11.129419326782227,
"result": ""
},
"extraction_ocr": {
"score": 0,
"success": false,
"price": 0.013649999999999999,
"pass_fail": "Fail",
"response_time": 11.524913311004639,
"result": "Failed to produce a valid JSON output: "
},
"math_ocr": {
"score": 0,
"success": false,
"price": 0.02113,
"pass_fail": "Fail",
"response_time": 10.938113451004028,
"result": "Failed to produce a valid JSON output: "
},
"object_detection": {
"score": 0,
"success": false,
"price": 0.01584,
"pass_fail": "Fail",
"response_time": 17.056734323501587,
"result": "Failed to produce a valid JSON output: "
},
"graph_understanding": {
"score": 0,
"success": false,
"price": 0.0157,
"pass_fail": "Fail",
"response_time": 8.02358365058899,
"result": "Failed to produce a valid JSON output: "
},
"color_recognition": {
"score": 0,
"success": false,
"price": 0.0157,
"pass_fail": "Fail",
"response_time": 7.4073710441589355,
"result": "Failed to produce a valid JSON output: "
},
"annotation_qa": {
"score": 0,
"success": false,
"price": 0.02135,
"pass_fail": "Fail",
"response_time": 31.52671527862549,
"result": "Failed to produce a valid JSON output: "
},
"measurement": {
"score": 0,
"success": false,
"price": 0.01566,
"pass_fail": "Fail",
"response_time": 10.37532639503479,
"result": "Failed to produce a valid JSON output: "
},
"easy_captcha": {
"score": 0,
"success": false,
"price": 0.01281,
"pass_fail": "Fail",
"response_time": 7.882266521453857,
"result": ""
},
"easy_captcha_persuade": {
"score": 0,
"success": false,
"price": 0.013309999999999999,
"pass_fail": "Fail",
"response_time": 5.799127817153931,
"result": ""
}
}
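
For readers who want to inspect a results file like the one added above, here is a minimal Python sketch that totals one day's records. The field names (`success`, `price`, `response_time`) come from the JSON in this commit; the function name `summarize_results` and the choice of aggregates are assumptions for illustration, not the repository's actual code. In particular, this is not necessarily how the "average response time" shown in index.html is computed.

```python
import json
from statistics import mean


def summarize_results(results: dict) -> dict:
    """Aggregate one day's results (test name -> record) into a few totals.

    Hypothetical helper: the aggregation choices here are illustrative only.
    """
    if not results:
        raise ValueError("results file contains no tests")
    records = list(results.values())
    # Names of the tests whose record is marked as successful.
    passed = [name for name, record in results.items() if record["success"]]
    return {
        "tests": len(records),
        "passed": passed,
        "pass_rate": len(passed) / len(records),
        "total_price": sum(record["price"] for record in records),
        "mean_response_time": mean(record["response_time"] for record in records),
    }


# Two records taken from results/2025-02-20.json, with response times shortened.
sample = json.loads("""
{
  "zero_shot_classification": {"score": 1, "success": true, "price": 0.01012,
                               "pass_fail": "Pass", "response_time": 7.09,
                               "result": "Toyota Camry"},
  "count_fruit": {"score": 0, "success": false, "price": 0.01545,
                  "pass_fail": "Fail", "response_time": 9.38, "result": ""}
}
""")

summary = summarize_results(sample)
print(summary)
```

To run it against the full file, replace the inline sample with `json.load(open("results/2025-02-20.json"))`, assuming the script is run from the repository root.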
