Plan to consolidate all sentence material in sentences.tsv #40

Open
Guanchishan opened this issue Nov 25, 2023 · 1 comment

Guanchishan commented Nov 25, 2023

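Approach: keep the existing directory structure, but also put a copy of every sentence in sentences.tsv and fill in the Path to Original File column with a link to the source file. All annotation of the sentence material is done in sentences.tsv first; the source files are updated only as needed.

Pros:

  • When the 榕典 orthography is updated, there is no need to dig through the whole repo to change characters; everything can be fixed in one place, sentences.tsv. The source files serve only as a reference to the "original".
  • Anyone who needs the text corpus can get it from a single place, sentences.tsv.

Cons:

  • Inelegant (?)
  • To be added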
Guanchishan added a commit that referenced this issue Nov 25, 2023
Guanchishan self-assigned this Nov 25, 2023
Guanchishan commented

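Changed my mind. The sentences file in the root directory should be machine-generated, while day-to-day creation and maintenance happen in the individual files in their own directories.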
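
For illustration, generating the root-level file could look roughly like the sketch below. This is not the repo's actual script. It assumes, hypothetically, that the source files are UTF-8 `.txt` files with one sentence per line, and that the output has a `Sentence` column next to the `Path to Original File` column mentioned above. The function names `collect_sentences` and `write_sentences_tsv` are made up for this example.

```python
import csv
from pathlib import Path

OUTPUT_NAME = "sentences.tsv"
# "Path to Original File" is the column named in this issue;
# the "Sentence" column name is an assumption for this sketch.
COLUMNS = ["Sentence", "Path to Original File"]


def collect_sentences(repo_root: Path) -> list[dict[str, str]]:
    """Gather every non-empty line from the source files under repo_root."""
    rows = []
    # Assumption: source files are UTF-8 .txt files, one sentence per line.
    for source in sorted(repo_root.rglob("*.txt")):
        rel = source.relative_to(repo_root).as_posix()
        for line in source.read_text(encoding="utf-8").splitlines():
            sentence = line.strip()
            if sentence:
                rows.append({"Sentence": sentence, "Path to Original File": rel})
    return rows


def write_sentences_tsv(repo_root: Path) -> Path:
    """Regenerate the root-level sentences.tsv from the per-directory files."""
    out = repo_root / OUTPUT_NAME
    with out.open("w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(collect_sentences(repo_root))
    return out
```

Running `write_sentences_tsv(Path("."))` from the repo root would overwrite sentences.tsv on each run, so any edits would have to be made in the source files rather than in the generated file.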
