Hi! I enjoyed reading your paper. I also appreciate that you provided all your code. I suspect that GPT-4 would do a lot better at some of the questions (for example the accessibility questions) if you gave it a few-shot prompt (e.g. a five shot prompt). Did you try this out at all? If so, how well did models do?
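For concreteness, a five-shot prompt along the lines suggested above could be assembled something like this. The example question/answer pairs below are invented placeholders, not items from the paper's benchmark, and the plain-text `Q:`/`A:` format is just one common convention:

```python
# Placeholder few-shot examples in a Sally-Anne style; these are NOT drawn
# from the paper's dataset, only illustrative stand-ins.
FEW_SHOT_EXAMPLES = [
    ("Anne moves the ball while Sally is outside. Does Sally know where the ball is?", "no"),
    ("Tom watches Mia hide the key. Does Tom know where the key is?", "yes"),
    ("The letter is locked in a drawer Bob cannot open. Can Bob read the letter?", "no"),
    ("Lily tells Sam the code out loud. Does Sam know the code?", "yes"),
    ("The sign is behind Ava, facing away from her. Can Ava see the sign?", "no"),
]

def build_five_shot_prompt(question, examples=FEW_SHOT_EXAMPLES):
    """Concatenate five solved examples before the target question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples[:5]]
    parts.append(f"Q: {question}\nA:")  # model completes the final answer
    return "\n\n".join(parts)
```

The resulting string would then be sent to the model as a single prompt; whether this helps or merely encourages pattern matching is exactly the point discussed in the reply below.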
Hello, thanks for your interest in our paper, and great question!
Indeed, GPT-4 would do a lot better if you gave it few-shot examples. However, providing few-shot examples for theory-of-mind questions encourages the model to rely on lower-level processes (e.g., shortcut pattern matching) rather than reasoning about mental states. This violates the "mentalizing" criterion for ToM validation that we discuss in our paper. Hope this helps!