This is how I implemented the benchmark solutions:
1. Extract the solutions from the Python version.
2. Rewrite the Python code in Julia.
3. Rewrite the test cases based on my own understanding of each problem, where possible.
Once finished, I evaluated all of my first attempts at once. The average pass rate was slightly above 0.5, so I would say many LLMs already do better than I did.
You might be surprised that my pass rate is so low. Honestly, I was quite surprised too, especially since I've been programming in Julia for years and had peeked at the Python solutions. The failures were mainly due to missed corner cases, syntax errors, and misunderstandings of the problems. In fact, the Python solutions were sometimes quite misleading.
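For reference, here is a minimal sketch of how a pass rate like the one above could be computed, assuming each problem is scored as a single pass/fail on the first attempt. The problem IDs and the `pass_rate` helper are made up for illustration; they are not taken from the actual benchmark harness.

```python
def pass_rate(first_attempt_results):
    """Return the fraction of problems whose first attempt passed.

    `first_attempt_results` maps a problem id to True (passed) or False (failed).
    """
    if not first_attempt_results:
        raise ValueError("no results to average")
    # True counts as 1 and False as 0, so the sum is the number of passes.
    return sum(first_attempt_results.values()) / len(first_attempt_results)


# Hypothetical results, for illustration only.
results = {
    "problem_001": True,
    "problem_002": False,
    "problem_003": True,
    "problem_004": False,
}

print(f"pass rate: {pass_rate(results):.2f}")
```

If problems were instead scored per test case, the same idea would apply, with the fraction of passing test cases averaged across problems.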