RougeSimilarityFilter Should Consider Input Field for Filtering #1625
coolbeevip asked this question in Q&A (Unanswered)
Replies: 0 comments
I'm encountering an issue while using the SelfInstructPipeline for data generation. My goal is to keep the instruction field constant while generating diverse input and output pairs. However, it appears that the RougeSimilarityFilter is only being applied to the values within the instruction field. This prevents me from generating new data and results in repetitive outputs.
Here's an example of my data samples:
I understand that disabling the RougeSimilarityFilter would bypass this issue. However, I would prefer for it to also consider the input field during the filtering process. This would allow for more varied data generation while still maintaining control over similarity.
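To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is not the library's RougeSimilarityFilter: the helper names (`lcs_length`, `rouge_l_f1`, `combined_text`, `is_novel`), the whitespace tokenization, and the 0.7 threshold are all assumptions chosen for illustration. The idea is to compute a ROUGE-L style score over the concatenated instruction and input fields rather than over the instruction alone.

```python
from typing import Dict, List, Sequence


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 over lowercase whitespace tokens (a simplification)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def combined_text(record: Dict[str, str], fields: Sequence[str]) -> str:
    """Join the chosen fields of a record into one string to compare."""
    return " ".join(str(record.get(f, "")) for f in fields).strip()


def is_novel(
    candidate: Dict[str, str],
    existing: List[Dict[str, str]],
    threshold: float = 0.7,  # hypothetical cutoff, not a library default
    fields: Sequence[str] = ("instruction", "input"),
) -> bool:
    """Keep the candidate only if it scores below the threshold against every existing record."""
    cand_text = combined_text(candidate, fields)
    return all(
        rouge_l_f1(cand_text, combined_text(rec, fields)) < threshold
        for rec in existing
    )


existing = [
    {"instruction": "Translate the sentence to French.",
     "input": "Good morning, how are you?"},
]
candidate = {
    "instruction": "Translate the sentence to French.",
    "input": "The train leaves at nine tonight.",
}

# Comparing instruction + input keeps the new sample.
keep_with_input = is_novel(candidate, existing)
# Comparing the instruction alone rejects it as a duplicate.
keep_instruction_only = is_novel(candidate, existing, fields=("instruction",))
```

One caveat with this approach: when the constant instruction is much longer than the input, the shared prefix still pushes the score up, so comparing the input field alone (`fields=("input",)`) or raising the threshold may work better in that case.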
Furthermore, I'd like to suggest the addition of a system field to define roles or contexts. For example:
I noticed that Alpaca does not define a system field (https://github.com/tatsu-lab/stanford_alpaca). Could you provide any insights or suggestions on how to address my issue with the RougeSimilarityFilter, and on whether adding a system field is a feasible or recommended approach?