[dattri/examples] add data cleaning example #158

Merged 3 commits on Dec 30, 2024

TheaperDeng
Collaborator

Description

1. Motivation and Context

Add an example for data selection.

2. Summary of the change

Add an example for data selection. The script finds the top-K low-quality data points that "harm" validation performance the most, then evaluates test performance after removing those top-K points.

Namespace(method='explicit', device='cuda', seed=0, train_size=10000, val_size=10000, test_size=5000, remove_number=100)
Test loss: 0.39373043179512024
Accuracy: 88.52
Test loss (after): 0.38540247082710266
Accuracy  (after): 88.94
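
The selection step described in the summary can be sketched roughly as follows. This is a minimal illustration in plain Python, not the script added in this PR and not the dattri API: the function names, the score layout (one row per training point, one column per validation point), and the convention that a larger score means more harm are all assumptions made for the example.

```python
from typing import List, Sequence


def select_harmful_indices(scores: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k training points with the largest total harm.

    scores[i][j] is a hypothetical "harm" score of training point i on
    validation point j. Larger means more harmful (assumed convention).
    """
    # Aggregate each training point's score over all validation points.
    total_harm = [sum(row) for row in scores]
    # Sort training indices by aggregated harm, most harmful first.
    order = sorted(range(len(total_harm)), key=lambda i: total_harm[i], reverse=True)
    return order[:k]


def remaining_indices(n_train: int, removed: Sequence[int]) -> List[int]:
    """Return the indices of the training points kept after removal."""
    removed_set = set(removed)
    return [i for i in range(n_train) if i not in removed_set]


# Toy data: 6 training points scored against 3 validation points.
toy_scores = [
    [0.1, 0.0, 0.2],
    [2.0, 1.5, 1.0],
    [-0.5, -0.2, 0.0],
    [0.3, 0.3, 0.3],
    [1.0, 1.2, 0.9],
    [0.0, 0.0, 0.0],
]
removed = select_harmful_indices(toy_scores, k=2)
kept = remaining_indices(len(toy_scores), removed)
```

In a full run, the model would be retrained on the `kept` subset and its test loss and accuracy compared against the baseline, which is the comparison the log above reports for `remove_number=100`.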

3. What tests have been added/updated for the change?

  • Application test: If you wrote an example for the toolkit, this test should be added.

@TheaperDeng TheaperDeng changed the title [dattri/experiements] add data selection example [dattri/examples] add data selection example Dec 26, 2024
@TheaperDeng TheaperDeng force-pushed the data-selection-example branch from bbde8fd to 9c273b6 Compare December 30, 2024 06:40
@TheaperDeng TheaperDeng changed the title [dattri/examples] add data selection example [dattri/examples] add data cleaning example Dec 30, 2024
Collaborator

@tingwl0122 tingwl0122 left a comment

LGTM

Collaborator

@tingwl0122 tingwl0122 left a comment

LGTM

@TheaperDeng TheaperDeng merged commit dfdf621 into TRAIS-Lab:main Dec 30, 2024
5 checks passed
2 participants