Issue with Fine-Tuning: Incorrect Answers When Querying Specific Questions #311

dikyridhlo · 2024-08-30T12:57:16Z

I recently fine-tuned a Donut model for DocVQA. The fine-tuning process completed successfully, but I encountered an issue during inference. When I ask a question that should correspond to a specific answer in the ground truth, the model often returns a different answer.
For example, my dataset contains the following entry:
image
upload.wikimedia.org/wikipedia/commons/f/f5/Florida_Driver_License.png (it's just an example)

json

"gt_parses": [
  {"question": "What is the Driver's License Number?", "answer": "DL1234567890"},
  {"question": "What is the Full Name?", "answer": "John Doe"},
  {"question": "What is the Date of Birth?", "answer": "March 10, 1985"},
  {"question": "What is the Address?", "answer": "123 Elm Street, Springfield, IL 62704, United States"},
  {"question": "What is the Expiration Date?", "answer": "March 10, 2025"}
]

However, when I query "What is the Full Name?", the model incorrectly responds with the “Driver's License Number” instead of the name.

I have tried the following:

Moving the full name to the first position in the array, but the answer is still the driver's license number.
Changing metadata.csv to JSONL (this ran out of memory).
Keeping only one question-answer pair in the JSON array, which works. But when I fine-tune on another question, it breaks. For example:
1st fine-tuning: I used only driver's license number questions. When I ask for the number on the card, it answers correctly.
2nd fine-tuning: I added the full name, but when I ask for it, the model returns the wrong answer. The answer is always the driver's license number.

Here is how I create the sequences:

def _prepare_gt_sequence(self, gt_parses):
    sequences = []
    for parse in gt_parses:
        question = parse.get("question", "")
        answer = parse.get("answer", "")
        sequence = f"<s>{question}</s> {answer}<eos>"
        sequences.append(sequence)
    return sequences[0] if sequences else "<s><eos>"

Could you please provide guidance on why this might be happening and how I can resolve it? Any suggestions on improving the model's accuracy for this kind of task would be greatly appreciated.
Thank you for your assistance!
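
A note on the `_prepare_gt_sequence` snippet above: it builds a sequence for every question/answer pair but returns only `sequences[0]`, so only one pair per sample would ever reach training. That may be related to the model answering every question with the same field. Below is a minimal sketch of one alternative, assuming the training loop can take one randomly sampled sequence per step. The names `prepare_gt_sequences` and `sample_training_sequence` are hypothetical, and the `<s>...</s> ...<eos>` template is copied from the snippet, not taken from Donut's own DocVQA prompt format.

```python
import random


def prepare_gt_sequences(gt_parses):
    """Return one target sequence per question/answer pair, not just the first."""
    sequences = []
    for parse in gt_parses:
        question = parse.get("question", "")
        answer = parse.get("answer", "")
        # Same template as the original snippet (an assumption, not Donut's format).
        sequences.append(f"<s>{question}</s> {answer}<eos>")
    # Fall back to an empty sequence when a sample has no pairs.
    return sequences if sequences else ["<s><eos>"]


def sample_training_sequence(gt_parses, rng=random):
    """Pick one pair at random so every question can be seen across epochs."""
    return rng.choice(prepare_gt_sequences(gt_parses))
```

With this shape, each question in `gt_parses` has a chance of being used as a training target, instead of only the first entry in the list.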


cccccckt · 2024-09-02T14:21:46Z

Hi, could I see the specific format of the metadata.jsonl file for your training dataset? Or could you send it to my email address? When I was fine-tuning, I found that the model's input contained the answer part, and I'm not sure what the problem is:
https://github.com/clovaai/donut/issues/312#issue-2501078667

dikyridhlo · 2024-09-13T01:45:43Z

Hi @cccccckt, thank you for the answer.
I use something like this in the JSONL file (it's just an example):

{ "file_name": "result-0812602293091518.png", "ground_truth": { "gt_parse": { "license_number": "DL1234567890", "name": "John Doe", "address": "123 Elm Street, Springfield, IL 62704, United States", "date_of_birth": "March 10, 1985", "expiration_date": "March 10, 2025" } } }
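
For comparison, the line above stores a single `gt_parse` dict of fields, while the dataset entry in the first post uses a `gt_parses` list of question/answer pairs. A rough sketch of converting one form into the other is shown here. It assumes that `ground_truth` is stored as a JSON-encoded string, which is how Donut's README examples appear to format it. `FIELD_QUESTIONS` and `to_docvqa_line` are hypothetical names, and the question wording is taken from the first post.

```python
import json

# Hypothetical mapping from field names to the question wording used in training.
FIELD_QUESTIONS = {
    "license_number": "What is the Driver's License Number?",
    "name": "What is the Full Name?",
    "date_of_birth": "What is the Date of Birth?",
    "address": "What is the Address?",
    "expiration_date": "What is the Expiration Date?",
}


def to_docvqa_line(file_name, gt_parse):
    """Build one metadata.jsonl line with a gt_parses question/answer list."""
    gt_parses = [
        {"question": FIELD_QUESTIONS[field], "answer": value}
        for field, value in gt_parse.items()
        if field in FIELD_QUESTIONS  # skip fields with no known question
    ]
    # Assumption: ground_truth is a JSON string nested inside the JSONL record.
    return json.dumps(
        {"file_name": file_name, "ground_truth": json.dumps({"gt_parses": gt_parses})}
    )
```

Each call produces one line of the file, so a dataset can be written by calling it per image and joining the results with newlines.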
