3 changes: 0 additions & 3 deletions examples/legacy/run_language_modeling.py
**Collaborator (Author) commented:**
I want to remove this example script under legacy rather than handle the removed classes ...

I would love to remove the whole legacy directory, as its README already notes:

> # Legacy examples
>
> This folder contains examples which are not actively maintained (mostly contributed by the community).
>
> Using these examples together with a recent version of the library usually requires to make small (sometimes big) adaptations to get the scripts working.

May I? 🙏

**Member commented:**

cc @ArthurZucker here as well, but in my opinion it's fine to delete the folder. Everything there is very old, so the scripts are not useful examples anyway.

```diff
@@ -39,10 +39,7 @@
     DataCollatorForPermutationLanguageModeling,
     DataCollatorForWholeWordMask,
     HfArgumentParser,
-    LineByLineTextDataset,
-    LineByLineWithRefDataset,
     PreTrainedTokenizer,
-    TextDataset,
     Trainer,
     TrainingArguments,
     set_seed,
```
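For readers hitting this removal: below is a minimal sketch of what the removed `LineByLineTextDataset` did (read a text file, drop blank lines, tokenize each remaining line independently). This is an illustration, not the transformers API; `toy_tokenize`, `LineByLineSketch`, and the temp-file usage are hypothetical stand-ins.

```python
# Minimal sketch of the removed LineByLineTextDataset's behavior:
# read a text file, skip blank lines, tokenize each line on its own.
# `toy_tokenize` is a hypothetical stand-in for a real tokenizer.
import tempfile

def toy_tokenize(line, max_length=8):
    # Whitespace "tokenization", truncated to max_length tokens.
    return line.split()[:max_length]

class LineByLineSketch:
    def __init__(self, file_path, max_length=8):
        with open(file_path, encoding="utf-8") as f:
            lines = [ln.strip() for ln in f if ln.strip()]
        self.examples = [toy_tokenize(ln, max_length) for ln in lines]

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, i):
        return self.examples[i]

# Usage with a throwaway file: the blank middle line is skipped.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello world\n\nsecond line here\n")
    path = f.name

ds = LineByLineSketch(path)
print(len(ds), ds[0])  # 2 ['hello', 'world']
```

In current example scripts the same effect is achieved with the 🤗 Datasets library rather than a bespoke dataset class.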
2 changes: 0 additions & 2 deletions examples/pytorch/language-modeling/README.md
```diff
@@ -24,8 +24,6 @@ objectives in our [model summary](https://huggingface.co/transformers/model_summ
 
 There are two sets of scripts provided. The first set leverages the Trainer API. The second set with `no_trainer` in the suffix uses a custom training loop and leverages the 🤗 Accelerate library . Both sets use the 🤗 Datasets library. You can easily customize them to your needs if you need extra processing on your datasets.
 
-**Note:** The old script `run_language_modeling.py` is still available [here](https://github.com/huggingface/transformers/blob/main/examples/legacy/run_language_modeling.py).
-
 The following examples, will run on datasets hosted on our [hub](https://huggingface.co/datasets) or with your own
 text files for training and validation. We give examples of both below.
```
10 changes: 0 additions & 10 deletions src/transformers/__init__.py
```diff
@@ -375,13 +375,8 @@
     _import_structure["data.datasets"] = [
         "GlueDataset",
         "GlueDataTrainingArguments",
-        "LineByLineTextDataset",
-        "LineByLineWithRefDataset",
-        "LineByLineWithSOPTextDataset",
         "SquadDataset",
         "SquadDataTrainingArguments",
-        "TextDataset",
-        "TextDatasetForNextSentencePrediction",
     ]
     _import_structure["generation"].extend(
         [
@@ -527,13 +522,8 @@
 from .data.data_collator import default_data_collator as default_data_collator
 from .data.datasets import GlueDataset as GlueDataset
 from .data.datasets import GlueDataTrainingArguments as GlueDataTrainingArguments
-from .data.datasets import LineByLineTextDataset as LineByLineTextDataset
-from .data.datasets import LineByLineWithRefDataset as LineByLineWithRefDataset
-from .data.datasets import LineByLineWithSOPTextDataset as LineByLineWithSOPTextDataset
 from .data.datasets import SquadDataset as SquadDataset
 from .data.datasets import SquadDataTrainingArguments as SquadDataTrainingArguments
-from .data.datasets import TextDataset as TextDataset
-from .data.datasets import TextDatasetForNextSentencePrediction as TextDatasetForNextSentencePrediction
 from .feature_extraction_sequence_utils import SequenceFeatureExtractor as SequenceFeatureExtractor
 
 # Feature Extractor
```
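The `_import_structure` dict edited above feeds transformers' lazy-import machinery: names are registered up front and the real submodule import only happens on first attribute access. Below is a simplified sketch of that pattern, not the actual `_LazyModule` implementation; the demo maps names to stdlib modules purely for illustration.

```python
# Simplified sketch of the lazy-import pattern behind _import_structure:
# names map to submodules and are imported only on first attribute access.
import importlib
import types

class LazyModuleSketch(types.ModuleType):
    def __init__(self, name, import_structure):
        super().__init__(name)
        # Invert {module: [names]} into {name: module} for fast lookup.
        self._name_to_module = {
            attr: mod
            for mod, attrs in import_structure.items()
            for attr in attrs
        }

    def __getattr__(self, attr):
        # Called only when `attr` is not yet in the instance __dict__.
        if attr not in self._name_to_module:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        module = importlib.import_module(self._name_to_module[attr])
        value = getattr(module, attr)
        setattr(self, attr, value)  # cache so resolution happens once per name
        return value

# Demo with stdlib modules standing in for transformers submodules:
lazy = LazyModuleSketch("demo", {"json": ["dumps"], "math": ["sqrt"]})
print(lazy.sqrt(9.0))  # 3.0
```

Removing entries from `_import_structure` (and the matching type-checking imports) is therefore the complete deletion on the package side: nothing else references the classes by name.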
7 changes: 0 additions & 7 deletions src/transformers/data/datasets/__init__.py
```diff
@@ -13,11 +13,4 @@
 # limitations under the License.
 
 from .glue import GlueDataset, GlueDataTrainingArguments
-from .language_modeling import (
-    LineByLineTextDataset,
-    LineByLineWithRefDataset,
-    LineByLineWithSOPTextDataset,
-    TextDataset,
-    TextDatasetForNextSentencePrediction,
-)
 from .squad import SquadDataset, SquadDataTrainingArguments
```