Fine-tuning on conversations (format of conversations) #248

Eichhof · 2023-01-14T11:16:32Z

Hello

I have a dataset consisting of dialogues between two people which I would like to use for fine-tuning GPT-J. Please see below for two example dialogues. The dialogues vary in length and can be longer than the examples.

Is the format of the conversations ok? For fine-tuning, should I just concatenate all conversations into one big file or do I have to use a separator between the conversations (if yes, which separator)?

First Dialogue:

user1:
Hey there. What’s up?

user2:
Not much, just hanging out. What about you?

user1:
Just thinking about what I’m going to do this weekend. You?

user2:
Probably just relaxing. What do you have planned?

user1:
I’m thinking about going to the beach. It’s supposed to be nice this weekend.

user2:
That sounds like a great plan! Have you been to the beach recently?

user1:
Not in a while. It would be nice to get out and enjoy the sun.

user2:
Definitely! I’m sure it’ll be a great time. Do you have any other ideas for the weekend?

Second Dialogue:

user1:
Good morning. What is your profession?

user2:
Good morning. I’m an accountant. What about you?

user1:
I’m a software engineer. How long have you been an accountant?

user2:
I’ve been an accountant for about five years now. What about you? How long have you been a software engineer?

user1:
I’ve been a software engineer for three years. What do you like most about accounting?

user2:
I like how challenging it can be. There’s always something to learn or something new to figure out. What do you like most about software engineering?

user1:
I like how creative it can be. I get to come up with new ideas and new ways of solving problems. It’s a great feeling when you can come up with something that works.

mosmos6 · 2023-01-18T08:03:39Z

@Eichhof
I use GPT-J as a chatbot, but I haven't needed to fine-tune it with dialogue templates.
What I can say is:

Extra spaces and newlines can confuse GPT-J. For example,

user1:Hey there. What’s up?
user2:Not much, just hanging out. What about you?
might work better.

GPT-J can have difficulty distinguishing pronouns from proper nouns; for example, it can think "you" is someone's name, so giving the speakers specific names could be better.
GPT-J has a tendency to try to provoke us (users, humans). It recognizes clichés and tries to jump to an utterly unexpected context.
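
On the separator question, here is a minimal sketch of how the compact "speaker:text" layout above could be produced and how dialogues could be kept apart. It assumes each dialogue is terminated with `<|endoftext|>`, the end-of-text token of the GPT-2 tokenizer that GPT-J uses. The helper names `format_dialogue` and `build_corpus` are hypothetical, and some data-preparation scripts insert this token between documents on their own, in which case writing one dialogue per document may be enough.

```python
# Assumption: "<|endoftext|>" is used as the separator between dialogues.
EOS = "<|endoftext|>"


def format_dialogue(turns):
    """Render one dialogue as compact 'speaker:text' lines, one turn per line."""
    return "\n".join(f"{speaker}:{text.strip()}" for speaker, text in turns)


def build_corpus(dialogues):
    """Concatenate dialogues, ending each with EOS so they do not run together."""
    return "".join(format_dialogue(turns) + "\n" + EOS + "\n" for turns in dialogues)


# Hypothetical sample data taken from the two dialogues in this thread.
dialogues = [
    [
        ("user1", "Hey there. What's up?"),
        ("user2", "Not much, just hanging out. What about you?"),
    ],
    [
        ("user1", "Good morning. What is your profession?"),
        ("user2", "Good morning. I'm an accountant. What about you?"),
    ],
]

corpus = build_corpus(dialogues)
print(corpus)
```

The resulting string can be written to a single text file. Whether the separator ends up as one token depends on the tokenizer actually used for fine-tuning, so that is worth checking before training.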

I wish you the best of luck and, if possible, I hope you can share some of your results, as long as it doesn't clutter this thread.

krisbianprabowo · 2023-01-26T07:01:44Z

@Eichhof I use GPT-J as a chatbot, but I haven't needed to fine-tune it with dialogue templates. What I can say is:

I wish you the best of luck and, if possible, I hope you can share some of your results, as long as it doesn't clutter this thread.

Hello, I'm still relatively new to GPT-J. I tried running the Colab demo to do some inference, specifically for a chatbot use case. I don't know how to stop the model from generating new tokens after the bot has finished answering. In GPT-3 we can easily insert stop sequences, or the model is already good enough to know when to stop. Setting the "gen_len" parameter doesn't seem to work either.

Do you have any ideas on this?
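
One workaround that does not depend on the inference code is to let the model generate and then cut the text at the first stop sequence. The sketch below is plain Python post-processing, not a feature of the Colab demo. The function name `truncate_at_stop` and the example stop strings are hypothetical, and the speaker labels assume a "user1:"/"user2:" prompt format.

```python
def truncate_at_stop(generated, stop_sequences):
    """Return `generated` cut at the earliest occurrence of any stop sequence."""
    cut = len(generated)
    for stop in stop_sequences:
        idx = generated.find(stop)
        if idx != -1:
            cut = min(cut, idx)  # keep the earliest match across all stop strings
    return generated[:cut].rstrip()


# Hypothetical raw continuation: the model answers as user2, then keeps going.
raw = "I'm fine, thanks!\nuser1:What about the weather?\nuser2:Sunny."

# Stop as soon as the model starts the next user1 turn or emits end-of-text.
reply = truncate_at_stop(raw, ["\nuser1:", "<|endoftext|>"])
print(reply)
```

This only trims the output after the fact, so the model still spends time generating the extra tokens. As far as I understand, "gen_len" only sets how many tokens are generated, so keeping it modest and truncating afterwards may be the simplest option.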

I included my example prompt and the result below:

Eichhof changed the title ~~Fine-tuning on conversations~~ Fine-tuning on conversations (format of conversations) Jan 14, 2023
