Commit

New dict
AG committed Apr 30, 2024
1 parent e29e383 commit c584d15
Showing 5 changed files with 2 additions and 2 deletions.
Binary file modified backup/dictionary.pkl
Binary file not shown.
2 changes: 1 addition & 1 deletion backup/processing_progress.txt
@@ -1 +1 @@
-25571507,23599999999
+26134921,24119999999
Binary file modified dictionary.msgpack
Binary file not shown.
Binary file modified tokens.msgpack
Binary file not shown.
2 changes: 1 addition & 1 deletion train.py
@@ -105,7 +105,7 @@ async def main(retain=False):
tree_store = await save_position('training/processing_progress.txt', i + start_position + 1, word_count, tree_store)
gc.collect()

-if (word_count + 1) % (PRUNE_FREQUENCY * 20) == 0:
+if (word_count + 1) % (PRUNE_FREQUENCY * 10) == 0:
merge_batches()

await create_batch(tree_store, TARGET_DICTIONARY_COUNT)
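
Note on the train.py change: lowering the multiplier from 20 to 10 halves the interval between merge_batches() calls, so merging runs twice as often. The sketch below illustrates this with the same modulo test. The value of PRUNE_FREQUENCY and the helper merge_points are assumptions made for illustration only; the real constant is not shown in this diff.

```python
# Hypothetical value; the actual PRUNE_FREQUENCY is defined elsewhere in train.py.
PRUNE_FREQUENCY = 1000


def merge_points(total_words, multiplier, prune_frequency=PRUNE_FREQUENCY):
    """Return the word_count values at which the merge condition is true.

    Mirrors the check in the diff:
        (word_count + 1) % (PRUNE_FREQUENCY * multiplier) == 0
    """
    interval = prune_frequency * multiplier
    return [wc for wc in range(total_words) if (wc + 1) % interval == 0]


old = merge_points(100_000, 20)  # condition before this commit
new = merge_points(100_000, 10)  # condition after this commit

print(len(old), len(new))  # prints "5 10"
```

Every point that triggered a merge under the old multiplier still triggers one under the new multiplier, with one additional merge halfway between each pair.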
