Questions about GPT-4O score on MathVista? #27

Open
Luodian opened this issue May 21, 2024 · 1 comment


Luodian commented May 21, 2024

Hi, I was wondering about the GPT-4O score: it is listed as 63.8 on testmini, but I could only get around 55 on my side. I also got a slightly lower score for GPT-4V than 58.1, around 51-52.

I tried different prompt strategies, but those are the best scores I got.

Would you mind sharing some prompt strategies for reproducing the GPT-4 models' performance? Thanks a lot!


Luodian commented May 21, 2024

This is the input/output for GPT-4O on the first question in testmini.

I hope it clarifies my prompting strategy. Does it look correct and optimal for GPT-4V or GPT-4O? Thanks!

{
    "doc_id": 0,
    "target": "1.2",
    "query": "Hint: Please answer the question requiring a floating-point number with one decimal place and provide the final value, e.g., 1.2, 1.3, 1.4, at the end.\nQuestion: When a spring does work on an object, we cannot find the work by simply multiplying the spring force by the object's displacement. The reason is that there is no one value for the force-it changes. However, we can split the displacement up into an infinite number of tiny parts and then approximate the force in each as being constant. Integration sums the work done in all those parts. Here we use the generic result of the integration.\r\n\r\nIn Figure, a cumin canister of mass $m=0.40 \\mathrm{~kg}$ slides across a horizontal frictionless counter with speed $v=0.50 \\mathrm{~m} / \\mathrm{s}$. It then runs into and compresses a spring of spring constant $k=750 \\mathrm{~N} / \\mathrm{m}$. When the canister is momentarily stopped by the spring, by what distance $d$ is the spring compressed?",
    "resps": [
        [
            "To find the distance \\( d \\) by which the spring is compressed, we can use the principle of conservation of energy. The kinetic energy of the canister is converted into the potential energy stored in the spring.\n\nThe kinetic energy (KE) of the canister is given by:\n\\[ KE = \\frac{1}{2}mv^2 \\]\n\nThe potential energy (PE) stored in the spring when it is compressed by a distance \\( d \\) is given by:\n\\[ PE = \\frac{1}{2}kd^2 \\]\n\nAt the point where the canister is momentarily stopped, all the kinetic energy will have been converted into potential energy:\n\\[ \\frac{1}{2}mv^2 = \\frac{1}{2}kd^2 \\]\n\nWe can cancel the \\(\\frac{1}{2}\\) on both sides and solve for \\( d \\):\n\\[ mv^2 = kd^2 \\]\n\\[ d^2 = \\frac{mv^2}{k} \\]\n\\[ d = \\sqrt{\\frac{mv^2}{k}} \\]\n\nGiven:\n- \\( m = 0.40 \\, \\text{kg} \\)\n- \\( v = 0.50 \\, \\text{m/s} \\)\n- \\( k = 750 \\, \\text{N/m} \\)\n\nSubstitute these values into the equation:\n\\[ d = \\sqrt{\\frac{(0.40 \\, \\text{kg})(0.50 \\, \\text{m/s})^2}{750 \\, \\text{N/m}}} \\]\n\\[ d = \\sqrt{\\frac{(0.40)(0.25)}{750}} \\]\n\\[ d = \\sqrt{\\frac{0.10}{750}} \\]\n\\[ d = \\sqrt{0.0001333} \\]\n\\[ d \\approx 0.0115 \\, \\text{m} \\]\n\nTo one decimal place, the distance \\( d \\) is:\n\\[ d \\approx 0.0 \\, \\text{m} \\]\n\nHowever, this result seems too small. Let's recheck the calculation:\n\n\\[ d = \\sqrt{\\frac{0.10}{750}} \\]\n\\[ d = \\sqrt{0.0001333} \\]\n\\[ d \\approx 0.0115 \\, \\text{m} \\]\n\nThe correct answer is indeed \\( 0.0 \\, \\text{m} \\) when rounded to one decimal place."
        ]
    ],
    "filtered_resps": [
        "To find the distance \\( d \\) by which the spring is compressed, we can use the principle of conservation of energy. The kinetic energy of the canister is converted into the potential energy stored in the spring.\n\nThe kinetic energy (KE) of the canister is given by:\n\\[ KE = \\frac{1}{2}mv^2 \\]\n\nThe potential energy (PE) stored in the spring when it is compressed by a distance \\( d \\) is given by:\n\\[ PE = \\frac{1}{2}kd^2 \\]\n\nAt the point where the canister is momentarily stopped, all the kinetic energy will have been converted into potential energy:\n\\[ \\frac{1}{2}mv^2 = \\frac{1}{2}kd^2 \\]\n\nWe can cancel the \\(\\frac{1}{2}\\) on both sides and solve for \\( d \\):\n\\[ mv^2 = kd^2 \\]\n\\[ d^2 = \\frac{mv^2}{k} \\]\n\\[ d = \\sqrt{\\frac{mv^2}{k}} \\]\n\nGiven:\n- \\( m = 0.40 \\, \\text{kg} \\)\n- \\( v = 0.50 \\, \\text{m/s} \\)\n- \\( k = 750 \\, \\text{N/m} \\)\n\nSubstitute these values into the equation:\n\\[ d = \\sqrt{\\frac{(0.40 \\, \\text{kg})(0.50 \\, \\text{m/s})^2}{750 \\, \\text{N/m}}} \\]\n\\[ d = \\sqrt{\\frac{(0.40)(0.25)}{750}} \\]\n\\[ d = \\sqrt{\\frac{0.10}{750}} \\]\n\\[ d = \\sqrt{0.0001333} \\]\n\\[ d \\approx 0.0115 \\, \\text{m} \\]\n\nTo one decimal place, the distance \\( d \\) is:\n\\[ d \\approx 0.0 \\, \\text{m} \\]\n\nHowever, this result seems too small. Let's recheck the calculation:\n\n\\[ d = \\sqrt{\\frac{0.10}{750}} \\]\n\\[ d = \\sqrt{0.0001333} \\]\n\\[ d \\approx 0.0115 \\, \\text{m} \\]\n\nThe correct answer is indeed \\( 0.0 \\, \\text{m} \\) when rounded to one decimal place."
    ],
    "gpt_eval_score": {
        "question_id": "1",
        "query": "Hint: Please answer the question requiring a floating-point number with one decimal place and provide the final value, e.g., 1.2, 1.3, 1.4, at the end.\nQuestion: When a spring does work on an object, we cannot find the work by simply multiplying the spring force by the object's displacement. The reason is that there is no one value for the force-it changes. However, we can split the displacement up into an infinite number of tiny parts and then approximate the force in each as being constant. Integration sums the work done in all those parts. Here we use the generic result of the integration.\r\n\r\nIn Figure, a cumin canister of mass $m=0.40 \\mathrm{~kg}$ slides across a horizontal frictionless counter with speed $v=0.50 \\mathrm{~m} / \\mathrm{s}$. It then runs into and compresses a spring of spring constant $k=750 \\mathrm{~N} / \\mathrm{m}$. When the canister is momentarily stopped by the spring, by what distance $d$ is the spring compressed?",
        "choices": null,
        "answer": "1.2",
        "extraction": "0.0",
        "prediction": null,
        "true_false": false,
        "question_type": "free_form",
        "answer_type": "float",
        "precision": 1.0,
        "metadata": {
            "category": "math-targeted-vqa",
            "context": "scientific figure",
            "grade": "college",
            "img_height": 720,
            "img_width": 1514,
            "language": "english",
            "skills": [
                "scientific reasoning"
            ],
            "source": "SciBench",
            "split": "testmini",
            "task": "textbook question answering"
        }
    }
},
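
A quick sanity check on this sample may help narrow down where the mismatch comes from. The sketch below redoes the energy-conservation arithmetic from the response using the values in the question text. It suggests that the target "1.2" is consistent with the compression expressed in centimeters, while the response rounds the value in meters to 0.0. The centimeter interpretation is an assumption, since the record itself states no unit for the answer; the function name `spring_compression_m` is made up for illustration.

```python
import math

# Values taken from the question text in the record above.
m = 0.40   # mass of the canister, kg
v = 0.50   # speed, m/s
k = 750.0  # spring constant, N/m


def spring_compression_m(m: float, v: float, k: float) -> float:
    """Compression d in meters from (1/2) m v^2 = (1/2) k d^2."""
    return math.sqrt(m * v**2 / k)


d_m = spring_compression_m(m, v, k)
d_cm = d_m * 100.0  # assumption: the target is expressed in centimeters

print(f"d = {d_m:.4f} m = {d_cm:.1f} cm")
print("rounded in meters:     ", round(d_m, 1))   # matches the extraction "0.0"
print("rounded in centimeters:", round(d_cm, 1))  # matches the target "1.2"
```

If this reading is right, the physics in the response is correct and the sample is marked wrong only because the prompt never tells the model which unit the one-decimal answer should use.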
