This repository has been archived by the owner on Oct 5, 2023. It is now read-only.
🚀 Feature Request
This webpage has gameplay transcripts for a huge variety of parser-based Interactive Fiction games.
With some processing, this seems like a good source of training data to me, though some of the edits would probably need to be done manually.
The processing on the game transcripts themselves should include things such as replacing commands like '> x desk' with prose like 'You look at the desk', removing unrecognised commands along with their responses, and removing the out-of-world 'about' sections of the transcripts.
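A rough sketch of that clean-up in Python, to make the idea concrete. Everything here is an assumption rather than a description of the real transcripts: that commands sit on lines starting with `>`, that the verb table `VERB_TEMPLATES` (hypothetical, and far from complete) covers the verbs used, and that the parser error messages listed in `UNRECOGNISED` match what the games actually print. Commands with no known rewrite, including out-of-world ones like `about`, are simply dropped along with their responses, as is any text before the first command.

```python
# Hypothetical verb table: maps a parser verb to a prose template.
VERB_TEMPLATES = {
    "x": "You look at the {obj}.",
    "examine": "You look at the {obj}.",
    "take": "You take the {obj}.",
    "get": "You take the {obj}.",
    "open": "You open the {obj}.",
}

# Hypothetical rewrites for commands that take no object.
NO_OBJECT = {
    "l": "You look around.",
    "look": "You look around.",
    "i": "You check your inventory.",
    "inventory": "You check your inventory.",
}

# Assumed examples of parser error messages; real games vary.
UNRECOGNISED = (
    "That's not a verb I recognise",
    "I beg your pardon",
    "You can't see any such thing",
)


def rewrite_command(command):
    """Turn a parser command into a prose sentence, or None if unknown."""
    words = command.strip().lower().split()
    if not words:
        return None
    verb, obj = words[0], " ".join(words[1:])
    if not obj:
        return NO_OBJECT.get(verb)
    template = VERB_TEMPLATES.get(verb)
    return template.format(obj=obj) if template else None


def clean_transcript(text):
    """Rewrite known commands as prose and drop turns that failed or are unknown."""
    turns = []  # each turn is (command, list of response lines)
    current = None
    for line in text.splitlines():
        if line.startswith(">"):
            current = (line[1:].strip(), [])
            turns.append(current)
        elif current is not None:
            current[1].append(line)

    out = []
    for command, response_lines in turns:
        response = "\n".join(response_lines).strip()
        if any(marker in response for marker in UNRECOGNISED):
            continue  # the game did not understand this command
        sentence = rewrite_command(command)
        if sentence is None:
            continue  # no known rewrite (also drops 'about', 'help', etc.)
        out.append(sentence)
        if response:
            out.append(response)
    return "\n".join(out)
```

Dropping every command without a template is the conservative choice; logging those turns instead would give a list of cases to fix by hand.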
There's also the issue that IRC chat is interspersed with the game transcript on the page. I assume this should be easy to filter out for someone who knows what they're doing, or perhaps the webmaster has separate logs.
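If the chat lines follow a common IRC log layout, filtering them could be as simple as the sketch below. The layout is an assumption: it treats a line as chat when it starts with `<nick>` or `* nick` (an action), optionally preceded by a `[hh:mm]` timestamp. I haven't checked this against the actual pages, so the `IRC_LINE` pattern would need adjusting to whatever the logs really look like.

```python
import re

# Assumed IRC log layout: optional [hh:mm] or [hh:mm:ss] timestamp,
# then either "<nick>" or "* nick" at the start of the line.
IRC_LINE = re.compile(
    r"^(\[\d{1,2}:\d{2}(:\d{2})?\]\s*)?(<[^>\s]+>|\*\s+\S+)(\s|$)"
)


def strip_irc(text):
    """Return the text with lines that look like IRC chat removed."""
    kept = [line for line in text.splitlines() if not IRC_LINE.match(line)]
    return "\n".join(kept)
```

Game prompts start with `>` rather than `<`, so they pass through this filter untouched.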
With this data I would hope the AI could get better at describing objects, people and places, since IF does a lot of that, while the current data seems to be stronger on conversations and actions.
I started writing a script to download the transcripts and filter out the IRC chat, but I've realised my approach isn't a great one. I'm not super proficient with the command line, so I'll leave this to more capable hands, if there are any takers.