This Python script prepares novel text as training data for LLMs. It realigns the text and overwrites the original files with the processed content.
To use the application, run the main.py script with the paths to the files you want to process as arguments:
python main.py path_to_jp_file path_to_cn_file
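As a rough illustration of the invocation above, here is a minimal sketch of how a script could accept the two paths and overwrite each file in place. It is not the actual main.py: the names `realign`, `process_file`, and `main` are hypothetical, the realignment step is a placeholder that only trims whitespace and drops blank lines, and reading "jp" and "cn" as Japanese and Chinese is an assumption.

```python
import argparse
from pathlib import Path


def realign(lines):
    """Hypothetical placeholder: trim whitespace and drop blank lines.

    The real realignment logic is not shown in this README.
    """
    return [line.strip() for line in lines if line.strip()]


def process_file(path):
    """Read a file, realign its lines, and overwrite it in place."""
    path = Path(path)
    original = path.read_text(encoding="utf-8").splitlines()
    processed = realign(original)
    # The original content is replaced, so keep a backup if you need it.
    path.write_text("\n".join(processed) + "\n", encoding="utf-8")
    return processed


def main(argv=None):
    parser = argparse.ArgumentParser(description="Realign novel text files in place.")
    # Assumed meaning of the two positional arguments.
    parser.add_argument("jp_file", help="path to the Japanese text file")
    parser.add_argument("cn_file", help="path to the Chinese text file")
    args = parser.parse_args(argv)
    for path in (args.jp_file, args.cn_file):
        process_file(path)
```

Calling `main()` with no arguments reads the paths from the command line, which matches the usage shown above.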
To run the tests for the application, use the pytest command.
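For orientation, a test that pytest would collect might look like the sketch below. It does not reproduce the project's real tests: `overwrite_stripped` is a hypothetical stand-in for the processing step, and the file name and sample text are made up for illustration. Only `tmp_path`, pytest's built-in temporary-directory fixture, is part of pytest itself.

```python
from pathlib import Path


def overwrite_stripped(path: Path) -> None:
    """Hypothetical stand-in for the script's processing step."""
    lines = [ln.strip() for ln in path.read_text(encoding="utf-8").splitlines()]
    path.write_text("\n".join(ln for ln in lines if ln) + "\n", encoding="utf-8")


def test_file_is_overwritten_in_place(tmp_path):
    # tmp_path is a fresh temporary directory supplied by pytest.
    sample = tmp_path / "jp.txt"
    sample.write_text(" 一行目 \n\n二行目\n", encoding="utf-8")
    overwrite_stripped(sample)
    # The same file now holds the processed content.
    assert sample.read_text(encoding="utf-8") == "一行目\n二行目\n"
```

Working on temporary copies keeps the tests from touching real data, which matters because the script overwrites its input files.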