Issue with Fine-Tuning: Incorrect Answers When Querying Specific Questions #311

Open
dikyridhlo opened this issue Aug 30, 2024 · 2 comments

@dikyridhlo

I recently fine-tuned a Donut model for DocVQA. The fine-tuning process completed successfully, but I ran into a problem during inference: when I ask a question that should map to a specific answer in the ground truth, the model often returns a different answer.
For example, my dataset contains the following entry:
Image: upload.wikimedia.org/wikipedia/commons/f/f5/Florida_Driver_License.png (it's just an example)

json

"gt_parses": [
  {"question": "What is the Driver's License Number?", "answer": "DL1234567890"},
  {"question": "What is the Full Name?", "answer": "John Doe"},
  {"question": "What is the Date of Birth?", "answer": "March 10, 1985"},
  {"question": "What is the Address?", "answer": "123 Elm Street, Springfield, IL 62704, United States"},
  {"question": "What is the Expiration Date?", "answer": "March 10, 2025"}
]

However, when I query "What is the Full Name?", the model incorrectly responds with the “Driver's License Number” instead of the name.
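One way to make this mismatch measurable is to look up the expected answer for each question and compare it with what the model returns. The following is a minimal sketch using the example entry above; the helper names `expected_answer` and `is_correct` are hypothetical and not part of Donut, and the prediction string is simply the wrong answer described in this report.

```python
from typing import Optional

# Ground-truth pairs copied from the example entry in this issue.
gt_parses = [
    {"question": "What is the Driver's License Number?", "answer": "DL1234567890"},
    {"question": "What is the Full Name?", "answer": "John Doe"},
    {"question": "What is the Date of Birth?", "answer": "March 10, 1985"},
    {"question": "What is the Address?", "answer": "123 Elm Street, Springfield, IL 62704, United States"},
    {"question": "What is the Expiration Date?", "answer": "March 10, 2025"},
]


def expected_answer(pairs, question) -> Optional[str]:
    """Return the ground-truth answer for `question`, or None if it is absent."""
    for pair in pairs:
        if pair.get("question") == question:
            return pair.get("answer")
    return None


def is_correct(prediction, pairs, question) -> bool:
    """Exact-match check of a model prediction against the ground truth."""
    expected = expected_answer(pairs, question)
    return expected is not None and prediction.strip() == expected


# The behaviour reported here: the license number comes back for the name question.
reported_prediction = "DL1234567890"
name_question_ok = is_correct(reported_prediction, gt_parses, "What is the Full Name?")
```

Running the same check over every question in the dataset would show whether the wrong answer is always the first pair of each entry or varies by sample.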

I have tried the following:

  • Moving the full name to the first position in the array, but the answer is still the driver's license number.
  • Changing metadata.csv to JSONL, which resulted in an out-of-memory error.
  • Keeping only one question-answer pair in the JSON array, which works. However, it breaks when I fine-tune on another question. For example:
    • 1st fine-tuning: I trained on the driver's license number only; when I asked for the number on the card, it answered correctly.
    • 2nd fine-tuning: I added the full name, but when I ask for it, the model returns the wrong answer. The answer is always the driver's license number.

Here is how I build the sequences:

def _prepare_gt_sequence(self, gt_parses):
    sequences = []
    for parse in gt_parses:
        question = parse.get("question", "")
        answer = parse.get("answer", "")
        sequence = f"<s>{question}</s> {answer}<eos>"
        sequences.append(sequence)
    return sequences[0] if sequences else "<s><eos>"
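To make the behaviour of this function concrete, here is a minimal standalone sketch. Because the function returns `sequences[0]`, only the first question-answer pair of each entry ever becomes a training target, which would be consistent with the model always answering with the driver's license number. The names `build_sequence`, `prepare_first_only`, and `prepare_all` are hypothetical. The second variant returns every pair so that one can be sampled per training step; the reference Donut dataset code appears to do something similar with `random.choice`, and to wrap pairs as `<s_question>…</s_question><s_answer>…</s_answer>`, but both points are assumptions to verify against the repository. The sketch keeps the sequence format used in this post.

```python
import random


def build_sequence(parse):
    """Format one question-answer pair the same way as the function above."""
    question = parse.get("question", "")
    answer = parse.get("answer", "")
    return f"<s>{question}</s> {answer}<eos>"


def prepare_first_only(gt_parses):
    """Same behaviour as the posted function: every pair after the first is dropped."""
    sequences = [build_sequence(p) for p in gt_parses]
    return sequences[0] if sequences else "<s><eos>"


def prepare_all(gt_parses):
    """Hypothetical alternative: keep one target sequence per question."""
    return [build_sequence(p) for p in gt_parses] or ["<s><eos>"]


gt_parses = [
    {"question": "What is the Driver's License Number?", "answer": "DL1234567890"},
    {"question": "What is the Full Name?", "answer": "John Doe"},
    {"question": "What is the Date of Birth?", "answer": "March 10, 1985"},
]

first = prepare_first_only(gt_parses)
print(first)  # <s>What is the Driver's License Number?</s> DL1234567890<eos>

all_sequences = prepare_all(gt_parses)

# Sample one target per training step so every question is eventually seen.
rng = random.Random(0)
target = rng.choice(all_sequences)
```

With the first variant the string "John Doe" never appears in any training target, so the model has no chance to learn the name question from this entry.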

Could you please provide guidance on why this might be happening and how I can resolve it? Any suggestions on improving the model's accuracy for this kind of task would be greatly appreciated.
Thank you for your assistance!

@cccccckt

cccccckt commented Sep 2, 2024

Hi, could I see the specific format of the metadata.jsonl file for your training dataset? Or could you send it to my email address? When I was fine-tuning, I found that the model's input contained the answer part, and I am not sure what the problem is.
https://github.com/clovaai/donut/issues/312#issue-2501078667

@dikyridhlo
Author

Hi @cccccckt, thank you for the reply.
I use something like this in the JSONL file (it's just an example):

{ "file_name": "result-0812602293091518.png", "ground_truth": { "gt_parse": { "license_number": "DL1234567890", "name": "John Doe", "address": "123 Elm Street, Springfield, IL 62704, United States", "date_of_birth": "March 10, 1985", "expiration_date": "March 10, 2025" } } }
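Note that this record stores key-value fields under `gt_parse`, while the entry in the original post uses a `gt_parses` list of question-answer pairs. A minimal sketch of converting one form to the other follows. The `FIELD_TO_QUESTION` mapping and the `to_docvqa_record` helper are hypothetical; the question wording is taken from the original post. Storing `ground_truth` as a JSON-encoded string is an assumption based on how the Donut README appears to describe `metadata.jsonl`, so check it against the repository before relying on it.

```python
import json

# Hypothetical mapping from field names to the question wording used in this issue.
FIELD_TO_QUESTION = {
    "license_number": "What is the Driver's License Number?",
    "name": "What is the Full Name?",
    "address": "What is the Address?",
    "date_of_birth": "What is the Date of Birth?",
    "expiration_date": "What is the Expiration Date?",
}


def to_docvqa_record(record):
    """Turn a key-value `gt_parse` record into a question-answer `gt_parses` record."""
    fields = record["ground_truth"]["gt_parse"]
    gt_parses = [
        {"question": FIELD_TO_QUESTION[key], "answer": value}
        for key, value in fields.items()
        if key in FIELD_TO_QUESTION  # skip fields without a known question
    ]
    return {
        "file_name": record["file_name"],
        # Assumption: ground_truth is stored as a JSON-encoded string.
        "ground_truth": json.dumps({"gt_parses": gt_parses}),
    }


record = {
    "file_name": "result-0812602293091518.png",
    "ground_truth": {
        "gt_parse": {
            "license_number": "DL1234567890",
            "name": "John Doe",
            "address": "123 Elm Street, Springfield, IL 62704, United States",
            "date_of_birth": "March 10, 1985",
            "expiration_date": "March 10, 2025",
        }
    },
}

# One JSONL line: a single JSON object serialized without line breaks.
line = json.dumps(to_docvqa_record(record))
```

Writing one such line per image keeps the file in JSONL form and lets a reader process it line by line instead of loading everything at once.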
