Commit

remove too long outputs
sdiazlor committed Jan 18, 2024
1 parent 75c41bf commit 6603ab4
Showing 1 changed file with 2 additions and 47 deletions.
49 changes: 2 additions & 47 deletions docs/tutorials/clean-dpo-preference-dataset.ipynb
@@ -497,23 +497,7 @@
"id": "5yzl72ZpI3qv",
"outputId": "0169110e-9b06-48a4-9c2a-d48a72b9e377"
},
"outputs": [
{
"data": {
"text/plain": [
"{\n",
" \"input\": \"You will be given a definition of a task first, then some input of the task. This task is about using the specified sentence and converting the sentence to Resource Description Framework (RDF) triplets of the form (subject, predicate, object). The RDF triplets generated must be such that the triplets accurately capture the structure and semantics of the input sentence. The input is a sentence and the output is a list of triplets of the form [subject, predicate, object] that capture the relationships present in the sentence. When a sentence has more than 1 RDF triplet possible, the output must contain all of them. AFC Ajax (amateurs)'s ground is Sportpark De Toekomst where Ajax Youth Academy also play. Output:\",\n",
" \"generations\": [ \"[\\n [\\\"AFC Ajax (amateurs)\\\", \\\"has ground\\\", \\\"Sportpark De Toekomst\\\"],\\n [\\\"Ajax Youth Academy\\\", \\\"plays at\\\", \\\"Sportpark De Toekomst\\\"]\\n]\", \" Sure, I'd be happy to help! Here are the RDF triplets for the input sentence:\\n\\n[AFC Ajax (amateurs), hasGround, Sportpark De Toekomst]\\n[Ajax Youth Academy, playsAt, Sportpark De Toekomst]\\n\\nExplanation:\\n\\n* AFC Ajax (amateurs) is the subject of the first triplet, and hasGround is the predicate that describes the relationship between AFC Ajax (amateurs) and Sportpark De Toekomst.\\n* Ajax Youth Academy is the subject of the second triplet, and playsAt is the predicate that describes the relationship between Ajax Youth Academy and Sportpark De Toekomst.\\n\\nNote that there may be other possible RDF triplets that could be derived from the input sentence, but the above triplets capture the main relationships present in the sentence.\" ],\n",
" \"rating\": [ 9, 9 ],\n",
" \"rationale\": [\"Both Assistant 1 and Assistant 2 provided correct RDF triplets for the given sentence. Both assistants used a format that accurately represents the relationships present in the sentence with minor differences in the stylistic representation of the predicate. Assistant 1 used the natural language format for predicates, stating \\\"has ground\\\" and \\\"plays at\\\", which clearly aligns with the typical RDF representation where the predicate tries to be a URI that is more formal. However, since the task prompt doesn't specify a requirement for the predicates to be in URI form, this representation is acceptable, especially considering human readability. Assistant 2 transformed the predicates into a more formal-looking format by using camel case (hasGround, playsAt), which may suggest a transition towards a URI, although they are not provided as such. This is a common practice when designing RDF predicates, intending to align with web standards, although again, the task did not explicitly require this form. Both assistants explained the relationships captured by the triplets, which is helpful for understanding how the RDF structure relates to the original sentence. There are no factual inaccuracies in either output, and both sets of triplets are equivalent in terms of the information they represent. Overall, the level of detail was similar, with each assistant providing a brief explanation following their respective triplets. Neither output contained unnecessary or irrelevant information, and no critical information was missing. Both assistants would have received a score of 10 if the predicates were provided in a format that hinted at being URIs (e.g., prefixed with a namespace or in a full URI format), which is the more standard and formal practice for RDF predicates. Nevertheless, the assistants' performance was high given the context of the question, which did not specify this requirement. Therefore, both receive a score of 9.\"],\n",
"}\n"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"disti_dataset.select_columns([\"input\", \"generations\", \"rating\", \"rationale\"])[0]"
]
@@ -659,36 +643,7 @@
"id": "vXgTCCtpID_J",
"outputId": "9b10d699-c309-4993-92e6-ea6d95fc74b2"
},
"outputs": [
{
"data": {
"text/plain": [
"{'system': '',\n",
" 'input': \"You will be given a definition of a task first, then some input of the task.\\nThis task is about using the specified sentence and converting the sentence to Resource Description Framework (RDF) triplets of the form (subject, predicate object). The RDF triplets generated must be such that the triplets accurately capture the structure and semantics of the input sentence. The input is a sentence and the output is a list of triplets of the form [subject, predicate, object] that capture the relationships present in the sentence. When a sentence has more than 1 RDF triplet possible, the output must contain all of them.\\n\\nAFC Ajax (amateurs)'s ground is Sportpark De Toekomst where Ajax Youth Academy also play.\\nOutput:\",\n",
" 'chosen': '[\\n [\"AFC Ajax (amateurs)\", \"has ground\", \"Sportpark De Toekomst\"],\\n [\"Ajax Youth Academy\", \"plays at\", \"Sportpark De Toekomst\"]\\n]',\n",
" 'rejected': \" Sure, I'd be happy to help! Here are the RDF triplets for the input sentence:\\n\\n[AFC Ajax (amateurs), hasGround, Sportpark De Toekomst]\\n[Ajax Youth Academy, playsAt, Sportpark De Toekomst]\\n\\nExplanation:\\n\\n* AFC Ajax (amateurs) is the subject of the first triplet, and hasGround is the predicate that describes the relationship between AFC Ajax (amateurs) and Sportpark De Toekomst.\\n* Ajax Youth Academy is the subject of the second triplet, and playsAt is the predicate that describes the relationship between Ajax Youth Academy and Sportpark De Toekomst.\\n\\nNote that there may be other possible RDF triplets that could be derived from the input sentence, but the above triplets capture the main relationships present in the sentence.\",\n",
" 'generations': ['[\\n [\"AFC Ajax (amateurs)\", \"has ground\", \"Sportpark De Toekomst\"],\\n [\"Ajax Youth Academy\", \"plays at\", \"Sportpark De Toekomst\"]\\n]',\n",
" \" Sure, I'd be happy to help! Here are the RDF triplets for the input sentence:\\n\\n[AFC Ajax (amateurs), hasGround, Sportpark De Toekomst]\\n[Ajax Youth Academy, playsAt, Sportpark De Toekomst]\\n\\nExplanation:\\n\\n* AFC Ajax (amateurs) is the subject of the first triplet, and hasGround is the predicate that describes the relationship between AFC Ajax (amateurs) and Sportpark De Toekomst.\\n* Ajax Youth Academy is the subject of the second triplet, and playsAt is the predicate that describes the relationship between Ajax Youth Academy and Sportpark De Toekomst.\\n\\nNote that there may be other possible RDF triplets that could be derived from the input sentence, but the above triplets capture the main relationships present in the sentence.\"],\n",
" 'order': ['chosen', 'rejected'],\n",
" 'labelling_model': 'gpt-4-1106-preview',\n",
" 'labelling_prompt': [{'content': 'You are a helpful and precise assistant for checking the quality of the answer.',\n",
" 'role': 'system'},\n",
" {'content': '[Question]\\nYou will be given a definition of a task first, then some input of the task.\\nThis task is about using the specified sentence and converting the sentence to Resource Description Framework (RDF) triplets of the form (subject, predicate object). The RDF triplets generated must be such that the triplets accurately capture the structure and semantics of the input sentence. The input is a sentence and the output is a list of triplets of the form [subject, predicate, object] that capture the relationships present in the sentence. When a sentence has more than 1 RDF triplet possible, the output must contain all of them.\\n\\nAFC Ajax (amateurs)\\'s ground is Sportpark De Toekomst where Ajax Youth Academy also play.\\nOutput:\\n\\n\\n[The Start of Assistant 1\\'s Answer>\\n[\\n [\"AFC Ajax (amateurs)\", \"has ground\", \"Sportpark De Toekomst\"],\\n [\"Ajax Youth Academy\", \"plays at\", \"Sportpark De Toekomst\"]\\n]\\n[The End of Assistant 1\\'s Answer>\\n[The Start of Assistant 2\\'s Answer>\\n Sure, I\\'d be happy to help! 
Here are the RDF triplets for the input sentence:\\n\\n[AFC Ajax (amateurs), hasGround, Sportpark De Toekomst]\\n[Ajax Youth Academy, playsAt, Sportpark De Toekomst]\\n\\nExplanation:\\n\\n* AFC Ajax (amateurs) is the subject of the first triplet, and hasGround is the predicate that describes the relationship between AFC Ajax (amateurs) and Sportpark De Toekomst.\\n* Ajax Youth Academy is the subject of the second triplet, and playsAt is the predicate that describes the relationship between Ajax Youth Academy and Sportpark De Toekomst.\\n\\nNote that there may be other possible RDF triplets that could be derived from the input sentence, but the above triplets capture the main relationships present in the sentence.\\n[The End of Assistant 2\\'s Answer>\\n\\n[System]\\nWe would like to request your feedback on the performance of 2 AI assistants in response to the user question displayed above.\\nPlease rate the helpfulness, relevance, accuracy, level of details of their responses. Each assistant receives an overall score on a scale of 1 to 10, where a higher score indicates better overall performance.\\nPlease first output a single line containing only 2 values indicating the scores for Assistants 1 to 2, respectively. The 2 scores are separated by a space. In the subsequent line, please provide a comprehensive explanation of your evaluation, avoiding any potential bias and ensuring that the order in which the responses were presented does not affect your judgment.',\n",
" 'role': 'user'}],\n",
" 'raw_labelling_response': '8 8\\n\\nBoth Assistant 1 and Assistant 2 have provided the correct RDF triplets extracted from the input sentence. These triplets accurately reflect the relationship between the subjects (\"AFC Ajax (amateurs)\" and \"Ajax Youth Academy\") and the shared object (\"Sportpark De Toekomst\").\\n\\nAssistant 1 uses natural language for predicates (\"has ground\" and \"plays at\"), which may be more readable but slightly less standardized than camelCase or another programming convention. Assistant 2 uses camelCase notation for predicates (hasGround and playsAt), which is a more common convention in RDF and linked data practices, enhancing machine readability and data interchange. However, as the task didn\\'t specify which style to use, either approach can be considered correct.\\n\\nBoth assistants have limited their output to the main relationships expressed in the sentence, which is sufficient for most applications of RDF. Neither assistant provided additional triplets, which may or may not be extracted from the sentence depending on the level of inference required or desired (e.g., creating a triplet that Sportpark De Toekomst \"is a\" ground or \"is used by\" both subjects). However, since the task did not specify that these inferences are necessary, the absence of such triplets does not negatively impact the score.\\n\\nI rated both assistants equally because they both identified the correct entities and relationships despite their different stylistic choices for predicate representation. Neither assistant provided an incorrect triplet, and both offered clear, straightforward answers.',\n",
" 'rating': [8.0, 8.0],\n",
" 'rationale': '\\nBoth Assistant 1 and Assistant 2 have provided the correct RDF triplets extracted from the input sentence. These triplets accurately reflect the relationship between the subjects (\"AFC Ajax (amateurs)\" and \"Ajax Youth Academy\") and the shared object (\"Sportpark De Toekomst\").\\n\\nAssistant 1 uses natural language for predicates (\"has ground\" and \"plays at\"), which may be more readable but slightly less standardized than camelCase or another programming convention. Assistant 2 uses camelCase notation for predicates (hasGround and playsAt), which is a more common convention in RDF and linked data practices, enhancing machine readability and data interchange. However, as the task didn\\'t specify which style to use, either approach can be considered correct.\\n\\nBoth assistants have limited their output to the main relationships expressed in the sentence, which is sufficient for most applications of RDF. Neither assistant provided additional triplets, which may or may not be extracted from the sentence depending on the level of inference required or desired (e.g., creating a triplet that Sportpark De Toekomst \"is a\" ground or \"is used by\" both subjects). However, since the task did not specify that these inferences are necessary, the absence of such triplets does not negatively impact the score.\\n\\nI rated both assistants equally because they both identified the correct entities and relationships despite their different stylistic choices for predicate representation. Neither assistant provided an incorrect triplet, and both offered clear, straightforward answers.',\n",
" 'status': 'tie',\n",
" 'original_chosen': '[\\n [\"AFC Ajax (amateurs)\", \"has ground\", \"Sportpark De Toekomst\"],\\n [\"Ajax Youth Academy\", \"plays at\", \"Sportpark De Toekomst\"]\\n]',\n",
" 'original_rejected': \" Sure, I'd be happy to help! Here are the RDF triplets for the input sentence:\\n\\n[AFC Ajax (amateurs), hasGround, Sportpark De Toekomst]\\n[Ajax Youth Academy, playsAt, Sportpark De Toekomst]\\n\\nExplanation:\\n\\n* AFC Ajax (amateurs) is the subject of the first triplet, and hasGround is the predicate that describes the relationship between AFC Ajax (amateurs) and Sportpark De Toekomst.\\n* Ajax Youth Academy is the subject of the second triplet, and playsAt is the predicate that describes the relationship between Ajax Youth Academy and Sportpark De Toekomst.\\n\\nNote that there may be other possible RDF triplets that could be derived from the input sentence, but the above triplets capture the main relationships present in the sentence.\",\n",
" 'chosen_score': 8.0}"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"updated_disti_dataset[0]"
]
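
Both hunks make the same change: a code cell's stored `outputs` list is replaced with `[]` while its `source` is left untouched. A minimal sketch of doing this programmatically, assuming the usual notebook JSON layout (`cells`, `cell_type`, `outputs`). The helper names `clear_long_outputs` and `clean_notebook_file` and the `max_chars` threshold are hypothetical and are not part of this commit:

```python
import json


def clear_long_outputs(notebook: dict, max_chars: int = 2000) -> int:
    """Empty the outputs of code cells whose serialized outputs exceed max_chars.

    Hypothetical helper: returns the number of cells that were cleared.
    """
    cleared = 0
    for cell in notebook.get("cells", []):
        # Only code cells carry an "outputs" list.
        if cell.get("cell_type") != "code":
            continue
        outputs = cell.get("outputs", [])
        # Measure the outputs as they would be stored in the .ipynb JSON.
        if outputs and len(json.dumps(outputs)) > max_chars:
            cell["outputs"] = []  # same end state as the diff: "outputs": []
            cleared += 1
    return cleared


def clean_notebook_file(path: str, max_chars: int = 2000) -> int:
    """Load a notebook file, clear oversized outputs, and write it back."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    cleared = clear_long_outputs(notebook, max_chars)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")
    return cleared
```

Calling `clean_notebook_file` on a notebook path would rewrite the file in place, so it is worth running on a committed copy and reviewing the resulting diff, as the threshold decides which outputs survive.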
