
# @lenml/tokenizer-gpt3

A GPT-3 tokenizer, based on `@lenml/tokenizers`.

## Usage

```js
import { fromPreTrained } from "@lenml/tokenizer-gpt3";

const tokenizer = fromPreTrained();
console.log(
    "encode()",
    tokenizer.encode("Hello, my dog is cute", null, {
        add_special_tokens: true,
    })
);
console.log(
    "_encode_text",
    tokenizer._encode_text("Hello, my dog is cute")
);
```

## Full Tokenizer API

Complete API parameters and usage can be found in the transformers.js tokenizers documentation.

## License

Apache-2.0