Commit

Update results
capjamesg committed Feb 19, 2025
1 parent bcb1d2f commit 0d4a37e
Showing 2 changed files with 111 additions and 5 deletions.
10 changes: 5 additions & 5 deletions index.html
@@ -40,7 +40,7 @@ <h1>How's GPT O1 Doing?</h1>
<p>You can contribute your own tests, too! See the <a href="https://github.com/roboflow/gpt-checkup?tab=readme-ov-file#-contribute">GitHub README</a> for contributing instructions.</p>
</div>
<div class="header_subtitle">
-<p>Tests are run every day at 1am PT. Last updated February 18, 2025.</p>
+<p>Tests are run every day at 1am PT. Last updated February 19, 2025.</p>
<p>Made with ❤️ by the team at <a href="https://roboflow.com">Roboflow</a>.</p>
</div>
<div class="header_cta">
@@ -58,12 +58,12 @@ <h1>How's GPT O1 Doing?</h1>
<div class="feature_header" style="min-height: auto">
<div class="feature_header_text" style="gap: var(--spacing-sizing-4)">
<h2>Response Time</h2>
-<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.8 seconds</b> per request.</p>
+<p style="font-size: 16px; color: var(--gray-700)">Today, the average response time to receive results from our tests was <b>3.82 seconds</b> per request.</p>
<p class="subtitle">This number only accounts for requests made by this application.</p>
</div>
<div class="chart">
<div class="chart_box chart_box_green">
-<p>3.8 s</p>
+<p>3.82 s</p>
</div>
</div>
</div>
@@ -176,7 +176,7 @@ <h3><span class="explainer_icon far fa-comment-dots"></span>Prompt</h3>
<h3><span class="explainer_icon far fa-image"></span>Image</h3>
<img class="test_image" src="images/swift.png" alt="Image of the input into GPT-4" />
<h3><span class="explainer_icon far fa-sparkles"></span>Result</h3>
-<pre></pre>
+<pre>I was</pre>
<p class="subtitle" style="margin-top: 16px; text-align: center">Test submitted by <a href="https://roboflow.com" target="_blank">Roboflow</a></p>
</div>
</div>
@@ -765,7 +765,7 @@ <h2>Zero Shot Classification</h2>
</div>
</div>
<p class="result_text">Of the last 7 tests, conducted daily, this test has passed <b>100%</b> of the time.</p>
-<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.01</p>
+<p class="request_price"><i class="far fa-coins"></i>Today's request cost $0.008</p>
</div>
<div class="explainer_dropdown">
<button type="button" class="dropdown dropdown_learn active">Learn about this test</button>
106 changes: 106 additions & 0 deletions results/2025-02-19.json
@@ -0,0 +1,106 @@
{
"zero_shot_classification": {
"score": 1,
"success": true,
"price": 0.008199999999999999,
"pass_fail": "Pass",
"response_time": 8.38942813873291,
"result": "Toyota Camry"
},
"count_fruit": {
"score": 0,
"success": false,
"price": 0.01545,
"pass_fail": "Fail",
"response_time": 7.712848663330078,
"result": ""
},
"document_ocr": {
"score": 0,
"success": false,
"price": 0.013989999999999999,
"pass_fail": "Fail",
"response_time": 8.065871477127075,
"result": "I was"
},
"handwriting_ocr": {
"score": 0,
"success": false,
"price": 0.015529999999999999,
"pass_fail": "Fail",
"response_time": 14.523216962814331,
"result": ""
},
"extraction_ocr": {
"score": 0,
"success": false,
"price": 0.013649999999999999,
"pass_fail": "Fail",
"response_time": 6.437875032424927,
"result": "Failed to produce a valid JSON output: "
},
"math_ocr": {
"score": 0,
"success": false,
"price": 0.02113,
"pass_fail": "Fail",
"response_time": 8.603970766067505,
"result": "Failed to produce a valid JSON output: "
},
"object_detection": {
"score": 0,
"success": false,
"price": 0.01584,
"pass_fail": "Fail",
"response_time": 7.256832599639893,
"result": "Failed to produce a valid JSON output: "
},
"graph_understanding": {
"score": 0,
"success": false,
"price": 0.0157,
"pass_fail": "Fail",
"response_time": 9.157629489898682,
"result": "Failed to produce a valid JSON output: "
},
"color_recognition": {
"score": 0,
"success": false,
"price": 0.0157,
"pass_fail": "Fail",
"response_time": 7.645347833633423,
"result": "Failed to produce a valid JSON output: "
},
"annotation_qa": {
"score": 0,
"success": false,
"price": 0.02135,
"pass_fail": "Fail",
"response_time": 9.838661670684814,
"result": "Failed to produce a valid JSON output: "
},
"measurement": {
"score": 0,
"success": false,
"price": 0.01566,
"pass_fail": "Fail",
"response_time": 10.821077108383179,
"result": "Failed to produce a valid JSON output: "
},
"easy_captcha": {
"score": 0,
"success": false,
"price": 0.01281,
"pass_fail": "Fail",
"response_time": 6.027362823486328,
"result": ""
},
"easy_captcha_persuade": {
"score": 0,
"success": false,
"price": 0.013309999999999999,
"pass_fail": "Fail",
"response_time": 9.13540267944336,
"result": ""
}
}
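
A results file in this shape can be summarised in a few lines. The following is a minimal sketch, not the repository's own code: the `summarize` function and the `SAMPLE` string are hypothetical, and `SAMPLE` holds only the first three entries of the file above so that the block stays self-contained. It counts passing tests, totals the request price, averages the recorded `response_time` values, and lists tests whose `result` is empty. It does not try to reproduce the 3.82 second figure shown in index.html, whose calculation is not part of this commit.

```python
import json

# Hypothetical excerpt: the first three entries of results/2025-02-19.json.
SAMPLE = """
{
  "zero_shot_classification": {
    "score": 1, "success": true, "price": 0.008199999999999999,
    "pass_fail": "Pass", "response_time": 8.38942813873291,
    "result": "Toyota Camry"
  },
  "count_fruit": {
    "score": 0, "success": false, "price": 0.01545,
    "pass_fail": "Fail", "response_time": 7.712848663330078,
    "result": ""
  },
  "document_ocr": {
    "score": 0, "success": false, "price": 0.013989999999999999,
    "pass_fail": "Fail", "response_time": 8.065871477127075,
    "result": "I was"
  }
}
"""


def summarize(results):
    """Aggregate a mapping of test name -> result record (illustrative helper)."""
    records = list(results.values())
    total = len(records)
    return {
        "total": total,
        # A test counts as passed when its "success" flag is true.
        "passed": sum(1 for r in records if r["success"]),
        "total_price": sum(r["price"] for r in records),
        # Mean of the per-test response_time values in this file only.
        "mean_response_time": (
            sum(r["response_time"] for r in records) / total if total else 0.0
        ),
        # Tests that came back with an empty result string.
        "empty_results": [name for name, r in results.items() if r["result"] == ""],
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```

To run it against a full file, replace `json.loads(SAMPLE)` with `json.load(open(path))` for a path of your choosing.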
