Releases: keras-team/keras-hub
v0.9.2
Summary
- Initial release of CodeGemma.
- Bump to a Gemma 1.1 version without download issues on Kaggle.
What's Changed
- Fix print_fn issue in task test by @SamanehSaadat in #1563
- Update presets for code gemma by @mattdangerw in #1564
- version bump 0.9.2.dev0 by @mattdangerw in #1565
- Version bump 0.9.2 by @mattdangerw in #1566
Full Changelog: v0.9.1...v0.9.2
v0.9.1
Patch fix for bug with stop_token_ids.
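To illustrate what stopping on several end tokens means, here is a minimal pure-Python sketch. The helper truncate_at_stop_tokens is hypothetical and is not KerasNLP code. We assume the library exposes this behavior through the stop_token_ids argument of generate(), with "auto" as the default mentioned in the changes below.

```python
from typing import Iterable, List, Sequence


def truncate_at_stop_tokens(
    token_ids: Sequence[int], stop_token_ids: Iterable[int]
) -> List[int]:
    """Return the tokens that come before the first stop token.

    Hypothetical helper for illustration only; it is not KerasNLP code.
    """
    stops = set(stop_token_ids)
    kept = []
    for token_id in token_ids:
        if token_id in stops:
            break  # Any one of the stop ids ends the sequence.
        kept.append(token_id)
    return kept


# Token ids 2 and 9 both count as end markers in this toy example.
print(truncate_at_stop_tokens([5, 7, 9, 4, 2], stop_token_ids=(2, 9)))
```

With a single stop id the behavior reduces to the usual end-of-sequence check; passing several ids lets any of them end generation.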
What's Changed
- Fix the new stop_token_ids argument by @mattdangerw in #1558
- Fix tests with the "auto" default for stop token ids by @mattdangerw in #1559
- Version bump for 0.9.1 by @mattdangerw in #1560
Full Changelog: v0.9.0...v0.9.1
v0.9.0
The 0.9.0 release adds new models, hub integrations, and general usability improvements.
Summary
- Added the Gemma 1.1 release.
- Added the Llama 2, BLOOM and ELECTRA models.
- Expose new base classes. Allow from_preset() on base classes.
  - keras_nlp.models.Backbone
  - keras_nlp.models.Task
  - keras_nlp.models.Classifier
  - keras_nlp.models.CausalLM
  - keras_nlp.models.Seq2SeqLM
  - keras_nlp.models.MaskedLM
- Some initial features for uploading to model hubs.
  - backbone.save_to_preset, tokenizer.save_to_preset, keras_nlp.upload_preset.
  - from_preset and upload_preset now work with the Hugging Face Models Hub.
  - More features (task saving, lora saving), and full documentation coming soon.
- Numerical fixes for the Gemma model at mixed_bfloat16 precision. Thanks unsloth for catching!
# Llama 2. Needs Kaggle consent and login, see https://github.com/Kaggle/kagglehub
causal_lm = keras_nlp.models.LlamaCausalLM.from_preset(
"llama2_7b_en",
dtype="bfloat16", # Run at half precision for inference.
)
causal_lm.generate("Keras is a", max_length=128)
# Base class usage.
keras_nlp.models.Classifier.from_preset("bert_base_en", num_classes=2)
keras_nlp.models.Tokenizer.from_preset("gemma_2b_en")
keras_nlp.models.CausalLM.from_preset("gpt2_base_en", dtype="mixed_bfloat16")
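The upload features listed in the summary can be sketched as follows. This is a rough outline under stated assumptions, not official usage: kaggle_uri, hf_uri and publish_preset are hypothetical helpers, the URI layouts (kaggle://user/model/framework/variation and hf://user/model) are assumed, and upload_fn is assumed to be keras_nlp.upload_preset called as upload_preset(uri, preset_dir).

```python
def kaggle_uri(username, model, variation, framework="keras"):
    """Build a Kaggle Models handle (assumed layout, see lead-in)."""
    return f"kaggle://{username}/{model}/{framework}/{variation}"


def hf_uri(username, model):
    """Build a Hugging Face Hub handle (assumed layout, see lead-in)."""
    return f"hf://{username}/{model}"


def publish_preset(backbone, tokenizer, preset_dir, uri, upload_fn):
    """Save a backbone and tokenizer to one directory, then upload it.

    `upload_fn` is expected to be keras_nlp.upload_preset; it is passed
    in so this sketch does not need keras_nlp installed to be read or run.
    """
    backbone.save_to_preset(preset_dir)
    tokenizer.save_to_preset(preset_dir)
    upload_fn(uri, preset_dir)


# Placeholder names; replace them with your own account and model.
print(kaggle_uri("my_user", "my_bert", "my_bert_base_en"))
print(hf_uri("my_user", "my_bert_base_en"))
```

With KerasNLP installed, a call might look like publish_preset(backbone, tokenizer, "./my_preset", hf_uri("my_user", "my_bert_base_en"), keras_nlp.upload_preset). Uploading requires credentials for the target hub.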
What's Changed
- Add dtype arg to Gemma HF conversion script by @nkovela1 in #1452
- Fix gemma testing import by @mattdangerw in #1462
- Add docstring for PyTorch conversion script install instructions by @nkovela1 in #1471
- Add an annotation to tests that need kaggle auth by @mattdangerw in #1470
- Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
- Bump the master version to 0.9 by @mattdangerw in #1473
- Pin to TF 2.16 RC0 by @sampathweb in #1478
- Fix gemma rms_normalization's use of epsilon by @cpsauer in #1472
- Add FalconBackbone by @SamanehSaadat in #1475
- CI - Add kaggle creds to pull model by @sampathweb in #1459
- bug in example for ReversibleEmbedding by @TheCrazyT in #1484
- doc fix for constrastive sampler by @mattdangerw in #1488
- Remove broken link to masking and padding guide by @mattdangerw in #1487
- Fix a typo in causal_lm_preprocessors by @SamanehSaadat in #1489
- Fix dtype accessors of tasks/backbones by @mattdangerw in #1486
- Auto-labels 'gemma' on 'gemma' issues/PRs. by @shmishra99 in #1490
- Add BloomCausalLM by @abuelnasr0 in #1467
- Remove the bert jupyter conversion notebooks by @mattdangerw in #1492
- Add FalconTokenizer by @SamanehSaadat in #1485
- Add FalconPreprocessor by @SamanehSaadat in #1498
- Rename 176B presets & Add other presets into bloom_presets.py by @abuelnasr0 in #1496
- Add bloom presets by @abuelnasr0 in #1501
- Create workflow for auto assignment of issues and for stale issues by @sachinprasadhs in #1495
- Update requirements to TF 2.16 by @sampathweb in #1503
- Expose Task and Backbone by @mattdangerw in #1506
- Clean up and add our gemma conversion script by @mattdangerw in #1493
- Don't auto-update JAX GPU by @sampathweb in #1507
- Keep rope at float32 precision by @grasskin in #1497
- Bump the python group with 2 updates by @dependabot in #1509
- Fixes for the LLaMA backbone + add dropout by @tirthasheshpatel in #1499
- Add LlamaPreprocessor and LlamaCausalLMPreprocessor by @tirthasheshpatel in #1511
- Always run the rotary embedding layer in float32 by @tirthasheshpatel in #1508
- CI: Fix psutil - Remove install of Python 3.9 and alias of python3 by @sampathweb in #1514
- Update gemma_backbone.py for sharding config. by @qlzh727 in #1491
- Docs/modelling layers by @mykolaskrynnyk in #1502
- Standardize docstring by @sachinprasadhs in #1516
- Support tokenization of special tokens for word_piece_tokenizer by @abuelnasr0 in #1397
- Upload Model to Kaggle by @SamanehSaadat in #1512
- Add scoring mode to MistralCausalLM by @RyanMullins in #1521
- Add Mistral Instruct V0.2 preset by @tirthasheshpatel in #1520
- Add Tests for Kaggle Upload Validation by @SamanehSaadat in #1524
- Add presets for Electra and checkpoint conversion script by @pranavvp16 in #1384
- Allow saving / loading from Huggingface Hub preset by @Wauplin in #1510
- Stop on multiple end tokens by @grasskin in #1518
- Fix doc: mistral_base_en -> mistral_7b_en by @asmith26 in #1528
- Add lora example to GemmaCausalLM docstring by @SamanehSaadat in #1527
- Add LLaMA Causal LM with 7B presets by @tirthasheshpatel in #1526
- Add task base classes; support out of tree library extensions by @mattdangerw in #1517
- Doc fixes by @mattdangerw in #1530
- Run the LLaMA and Mistral RMS Layer Norm in float32 by @tirthasheshpatel in #1532
- Adds score API to GPT-2 by @RyanMullins in #1533
- increase pip timeout to 1000s to avoid connection resets by @sampathweb in #1535
- Adds the score API to LlamaCausalLM by @RyanMullins in #1534
- Implement compute_output_spec() for tokenizers with vocabulary. by @briango28 in #1523
- Remove staggler type annotiations by @mattdangerw in #1536
- Always run SiLU activation in float32 for LLaMA and Mistral by @tirthasheshpatel in #1540
- Bump the python group with 2 updates by @dependabot in #1538
- Disallow saving to preset from keras 2 by @SamanehSaadat in #1545
- Fix the rotary embedding computation in LLaMA by @tirthasheshpatel in #1544
- Fix re-compilation bugs by @mattdangerw in #1541
- Fix preprocessor from_preset bug by @mattdangerw in #1549
- Fix a strange issue with preprocessing layer output types by @mattdangerw in #1550
- Fix lowercase bug in wordpiece tokenizer by @abuelnasr0 in #1543
- Small docs updates by @mattdangerw in #1553
- Add a few new preset for gemma by @mattdangerw in #1556
- Remove the dev prefix for 0.9.0 release by @mattdangerw in #1557
New Contributors
- @cpsauer made their first contribution in #1472
- @SamanehSaadat made their first contribution in #1475
- @TheCrazyT made their first contribution in #1484
- @shmishra99 made their first contribution in #1490
- @sachinprasadhs made their first contribution in #1495
- @mykolaskrynnyk made their first contribution in #1502
- @RyanMullins made their first contribution in #1521
- @Wauplin made their first contribution in #1510
- @asmith26 made their first contribution in #1528
- @briango28 made their first contribution in #1523
Full Changelog: v0.8.2...v0.9.0
v0.8.2
Summary
- Mistral fixes for dtype and memory usage. #1458
What's Changed
- Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
- Version bump for dev release by @mattdangerw in #1474
Full Changelog: v0.8.1...v0.8.2.dev0
v0.8.1
Minor fixes to Kaggle Gemma assets.
What's Changed
- Update to the newest version of Gemma on Kaggle by @mattdangerw in #1454
- Dev release 0.8.1.dev0 by @mattdangerw in #1456
- 0.8.1 version bump by @mattdangerw in #1457
Full Changelog: v0.8.0...v0.8.1
v0.8.0
The 0.8.0 release focuses on generative LLM features in KerasNLP.
Summary
- Added the Mistral and Gemma models.
- Allow passing dtype directly to backbone and task constructors.
- Add a settable sequence_length property to all preprocessing layers.
- Added enable_lora() to the backbone class for parameter efficient fine-tuning.
- Added layer attributes to backbone models for easier access to model internals.
- Added AlibiBias layer.
# Pass dtype to a model.
causal_lm = keras_nlp.models.MistralCausalLM.from_preset(
    "mistral_instruct_7b_en",
    dtype="bfloat16",
)
# Settable sequence length property.
causal_lm.preprocessor.sequence_length = 128
# Lora API.
causal_lm.backbone.enable_lora(rank=4)
# Easy layer attributes.
for layer in causal_lm.backbone.transformer_layers:
    print(layer.count_params())
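To see why LoRA counts as parameter efficient, here is a small arithmetic sketch. It assumes the standard LoRA formulation, in which a frozen kernel of shape (d_in, d_out) gains two trainable low-rank factors of shape (d_in, rank) and (rank, d_out). The layer sizes are made up for illustration and do not describe any particular preset or the library's internals.

```python
def full_kernel_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense kernel of shape (d_in, d_out)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by the two low-rank factors."""
    return rank * (d_in + d_out)


# Illustrative sizes only.
d_in, d_out, rank = 4096, 4096, 4
full = full_kernel_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
print(f"full kernel: {full} params")
print(f"LoRA factors: {lora} params ({lora / full:.4%} of the kernel)")
```

A higher rank gives the adapter more capacity at the cost of more trainable weights; the count grows linearly with rank.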
What's Changed
- Fix test for recent keras 3 change by @mattdangerw in #1400
- Pass less state to jax generate function by @mattdangerw in #1398
- Add llama tokenizer by @mattdangerw in #1401
- Add Bloom Model by @abuelnasr0 in #1382
- Try fixing tests by @mattdangerw in #1411
- Revert "Pass less state to jax generate function (#1398)" by @mattdangerw in #1412
- Bloom tokenizer by @abuelnasr0 in #1403
- Update black formatting by @mattdangerw in #1415
- Add Alibi bias layer by @abuelnasr0 in #1404
- Pin to tensorflow-hub 0.16.0 to fix CI error by @sampathweb in #1420
- Update TF Text and remove TF Hub deps by @sampathweb in #1423
- Pin Jax Version in GPU CI by @sampathweb in #1430
- Add Bloom preprocessor by @abuelnasr0 in #1424
- Add layer attributes for all functional models by @mattdangerw in #1421
- Allow setting dtype per model by @mattdangerw in #1431
- Add a Causal LM model for Mistral by @tirthasheshpatel in #1429
- Fix bart by @mattdangerw in #1434
- Add a settable property for sequence_length by @mattdangerw in #1437
- Add dependabot to update GH Actions and Python dependencies by @pnacht in #1380
- Bump the github-actions group with 1 update by @dependabot in #1438
- Add 7B presets for Mistral by @tirthasheshpatel in #1436
- Update byte_pair_tokenizer.py to close merges file properly by @divyashreepathihalli in #1440
- bump version to 0.8 by @mattdangerw in #1441
- Update our sampler documentation to reflect usage by @mattdangerw in #1444
- Add Gemma model by @mattdangerw in #1448
- Version bump for dev release by @mattdangerw in #1449
- Version bump to 0.8.0 by @mattdangerw in #1450
New Contributors
- @dependabot made their first contribution in #1438
- @divyashreepathihalli made their first contribution in #1440
Full Changelog: v0.7.0...v0.8.0
v0.17.0.dev0
Summary
- 📢 KerasNLP and KerasCV are becoming KerasHub 📢. KerasCV and KerasNLP have been consolidated into the KerasHub package.
- Models available now in KerasHub are albert, bart, bert, bloom, clip, csp_darknet, deberta_v3, deeplab_v3, densenet, distil_bert, efficientnet, electra, f_net, falcon, gemma, gpt2, gpt_neo_x, llama, llama3, mistral, mit, mobilenet, opt, pali_gemma, phi3, resnet, retinanet, roberta, sam, stable_diffusion_3, t5, vae, vgg, vit_det, whisper, xlm_roberta and xlnet.
- A new preprocessor flow has been added for vision and audio models
What's Changed
- Update python version in readme to 3.8 by @haifeng-jin in #618
- Modify our pip install line so we upgrade tf by @mattdangerw in #616
- Use Adam optimizer for quick start by @mattdangerw in #620
- Clean up class name and self in calls to super() by @mbrukman in #628
- Update word_piece_tokenizer.py by @ADITYADAS1999 in #617
- Add DeBERTaV3 Conversion Script by @abheesht17 in #633
- Add AlbertTokenizer and AlbertPreprocessor by @abheesht17 in #627
- Create Backbone base class by @jbischof in #621
- Add TPU testing by @chenmoneygithub in #591
- Add Base Preprocessor Class by @abheesht17 in #638
- Add keras_nlp.samplers by @chenmoneygithub in #563
- Add ALBERT Backbone by @abheesht17 in #622
- Add a small script to count parameters in our presets by @mattdangerw in #610
- Clean up examples/ directory by @ADITYADAS1999 in #637
- Fix Small BERT Typo by @abheesht17 in #651
- Rename examples/bert -> examples/bert_pretraining by @mattdangerw in #647
- Add FNet Preprocessor by @abheesht17 in #646
- Add FNet Backbone by @abheesht17 in #643
- Small DeBERTa Docstring Fixes by @abheesht17 in #666
- Add Fenced Docstring Testing by @abheesht17 in #640
- Corrected the epsilon value by @soma2000-lang in #665
- Consolidate docstring formatting weirdness in Backbone and Preprocessor base classes by @mattdangerw in #654
- Fix value_dim in TransformerDecoder's cross-attn layer by @abheesht17 in #667
- Add ALBERT Presets by @abheesht17 in #655
- Add Base Task Class by @abheesht17 in #671
- Implement TopP, TopK and Beam samplers by @chenmoneygithub in #652
- Add FNet Presets by @abheesht17 in #659
- Bump the year to 2023 by @mattdangerw in #679
- Add BART Backbone by @abheesht17 in #661
- Handle trainable and name in the backbone base class by @mattdangerw in #680
- Ignore Task Docstring for Testing by @abheesht17 in #683
- Light-weight benchmarking script by @NusretOzates in #664
- Conditionally import tf_text everywhere by @mattdangerw in #684
- Expose token_embedding as a Backbone Property by @abheesht17 in #676
- Move from_preset to base tokenizer classes by @shivance in #673
- add f_net_classifier and f_net_classifier_test by @ADITYADAS1999 in #670
- import rouge_scorer directly from rouge_score package by @sampathweb in #691
- Fix typo in requirements file juypter -> jupyter by @mattdangerw in #693
- Temporary fix to get nightly green again by @mattdangerw in #696
- GPT2 Text Generation APIs by @chenmoneygithub in #592
- Run keras saving tests on nightly and fix RobertaClassifier test by @mattdangerw in #692
- Speed up pip install keras-nlp; simplify deps by @mattdangerw in #697
- Add AlbertClassifier by @shivance in #668
- Make tokenizer, backbone, preprocessor properties settable on base class by @mattdangerw in #700
- Update to latest black by @mattdangerw in #708
- RobertaMaskedLM task and preprocessor by @mattdangerw in #653
- Default compilation for BERT/RoBERTa classifiers by @jbischof in #695
- Add start/end token padding to GPT2Preprocessor by @chenmoneygithub in #704
- Don't install tf stable when building our nightly image by @mattdangerw in #711
- Add OPT Backbone and Tokenizer by @mattdangerw in #699
- Small OPT Doc-string Edits by @abheesht17 in #716
- Default compilation other classifiers by @Plutone11011 in #714
- Add BartTokenizer and BART Presets by @abheesht17 in #685
- Add an add_prefix_space Arg in BytePairTokenizer by @shivance in #715
- Opt presets by @mattdangerw in #707
- fix import of tensorflow_text in tf_utils by @sampathweb in #723
- Check for masked token in roberta tokenizer by @mattdangerw in #742
- Improve test coverage for special tokens in model tokenizers by @mattdangerw in #743
- Fix the sampler truncation strategy by @chenmoneygithub in #713
- Add ALBERT Conversion Script by @abheesht17 in #736
- Add FNet Conversion Script by @abheesht17 in #737
- Add BART Conversion Script by @abheesht17 in #739
- Pass Correct LayerNorm Epsilon value to TransformerEncoder in Backbones by @TheAthleticCoder in #731
- Improving the layer Description. by @Neeshamraghav012 in #734
- Adding ragged support to SinePositionEncoding by @apupneja in #751
- Fix trailing space by @mattdangerw in #755
- Adding an AlbertMaskedLM task + Fix Projection layer dimension in MaskedLMHead by @shivance in #725
- New docstring example for TokenAndPosition Embedding layer. by @Neeshamraghav012 in #760
- Add a note for TPU issues for deberta_v3 by @mattdangerw in #758
- Add missing exports to models API by @mattdangerw in #763
- Autogenerate preset table by @Cyber-Machine in #690
- Version bump to 0.5.0 by @mattdangerw in #767
- Adding a FNetMaskedLM task model and preprocessor by @apupneja in #740
- Add a DistilBertMaskedLM task model by @ADITYADAS1999 in #724
- Add cache support to decoding journey by @chenmoneygithub in #745
- Handle [MASK] token in DebertaV3Tokenizer by @abheesht17 in #759
- Update README for 2.4.1 release by @mattdangerw in #757
- Fix typo in test docstring by @jbischof in #791
- Fixed Incorrect Links for FNet and DeBERTaV3 models by @Cyber-Machine in #793
- Patch 1 - doc-string spell fix by @atharvapurdue in #781
- Don't rely on core keras initializer config details by @mattdangerw in #802
- Simplify the cache decoding graph by @mattdangerw in #780
- Fix Fenced Doc-String #782 by @atharvapurdue in #785
- Solve #721 Deberta masklm model by @Plutone11011 in #732
- Add from_config to sampler by @mattdangerw in #803
- BertMaskedLM Task Model and Preprocessor by @Cyber-Machine in #774
- Stop generation once end_t...
v0.7.0
This release integrates KerasNLP and Kaggle Models. KerasNLP models will now work in Kaggle offline notebooks and all assets will quickly attach to a notebook rather than needing a slow download.
Summary
KerasNLP pre-trained models are now all made available through Kaggle Models. You can see all models currently available in both KerasCV and KerasNLP here. Individual model pages will include example usage and a file browser to examine all available assets for a model preset.
This change will not affect the existing usage of from_preset(). Statements like keras_nlp.models.BertClassifier.from_preset("bert_base_en") will continue to work and download checkpoints from the Kaggle Models hub.
A note on model saving: for saving support across Keras 2 and Keras 3, we recommend using the new Keras saved model format. You can use model.save('path/to/location.keras') for a full model and model.save_weights('path/to/location.weights.h5') for checkpoints. See the Keras saving guide for more details.
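The saving recommendation above can be wrapped in a small helper. This is a minimal sketch: save_model_and_weights is a hypothetical name, and the only assumption about model is that it exposes the save() and save_weights() methods mentioned in the note.

```python
import os


def save_model_and_weights(model, directory, name="model"):
    """Save a full model (.keras) and a weights checkpoint (.weights.h5).

    `model` is any object with Keras-style save() and save_weights()
    methods. Returns the two paths that were written.
    """
    os.makedirs(directory, exist_ok=True)
    model_path = os.path.join(directory, f"{name}.keras")
    weights_path = os.path.join(directory, f"{name}.weights.h5")
    model.save(model_path)  # Full model: architecture, weights, config.
    model.save_weights(weights_path)  # Weights only, for checkpoints.
    return model_path, weights_path
```

For example, save_model_and_weights(classifier, "./checkpoints", name="bert_classifier") would write bert_classifier.keras and bert_classifier.weights.h5 under ./checkpoints.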
What's Changed
- Don't export model internals publicly by @mattdangerw in #1255
- Bump master branch version number to 0.7.0.dev0 by @mattdangerw in #1254
- Fix/allow different encoder and decoder feature dimensions in transformer decoder layer by @ferraric in #1260
- Doc updates to switch branding to Keras 3 by @mattdangerw in #1259
- Remove unused TPU testing for backbones by @mattdangerw in #1266
- Make gelu a function, not a lambda so it can be loaded without safe_mode=False by @calvingiles in #1262
- Update requirements and install instructions for multi-backend keras by @mattdangerw in #1257
- Support Keras 3 installation by @mattdangerw in #1258
- Remove dtensor by @mattdangerw in #1268
- Add a lora dense layer by @mattdangerw in #1263
- Factor out testing routines for models by @mattdangerw in #1269
- Convert T5 to Keras 3 by @nkovela1 in #1274
- Fix missing backticks in DistilBertClassifier docstrings by @Philmod in #1278
- T5 checkpoint conversion with HF by @nkovela1 in #1277
- Use gelu_approximate directly in t5 presets by @mattdangerw in #1284
- Add preset tests and weights URLs by @nkovela1 in #1285
- Support loading keras 3 nightly by @mattdangerw in #1286
- Remove the use of SentencePieceTrainer from tests by @tirthasheshpatel in #1283
- Fix XLM-RoBERTa detokenize() by @abheesht17 in #1289
- Correct tie_embedding_weights and add logit checking by @nkovela1 in #1288
- Add detokenize testing for model tokenizers by @mattdangerw in #1290
- Fix Whisper by @abheesht17 in #1287
- Test against Keras 3 by @mattdangerw in #1273
- Support TF_USE_LEGACY_KERAS by @mattdangerw in #1295
- Run workflows with read-only tokens by @pnacht in #1305
- Update CONTRIBUTING.md by @mattdangerw in #1310
- Add GitHub Action for Nightly by @sampathweb in #1309
- Fix the publish to pypi action by @mattdangerw in #1311
- Fix nightly tf failure by @mattdangerw in #1316
- Switch deberta to use the "int" dtype by @mattdangerw in #1315
- Add security policy by @pnacht in #1319
- Fix missing export for reversible embedding by @mattdangerw in #1327
- Add version API to keras_nlp by @grasskin in #1324
- Fix Keras 3 version check by @sampathweb in #1328
- Simplify running KerasNLP with Keras 3 by @mattdangerw in #1308
- Fix issues with version by @mattdangerw in #1332
- Fix typo in whisper presets files by @mattdangerw in #1337
- ELECTRA backbone implementation in keras by @pranavvp16 in #1291
- Fix t5 tokenizer expected output by @mattdangerw in #1348
- Add __init__.py for electra by @mattdangerw in #1352
- Remove lora dense for now by @mattdangerw in #1359
- Adds Kokoro Build script for Keras-NLP GPU tests by @sampathweb in #1355
- Fixes GPU Test failures for Keras 3 by @sampathweb in #1361
- Change Continuous config to also run only large tests by @sampathweb in #1362
- ElectraTokenizer by @pranavvp16 in #1357
- Add MistralAI's 7B Transformer as a backbone in KerasNLP Models by @tirthasheshpatel in #1314
- changing pooling output by @mbrhd in #1364
- Add LlamaBackbone by @shivance in #1203
- Align pip_build with keras by @sampathweb in #1374
- Remove cloudbuild config by @mattdangerw in #1375
- Fix one last bad preset hash by @mattdangerw in #1381
- Add a tokenizer for the Mistral backbone by @tirthasheshpatel in #1383
- Kaggle Presets by @sampathweb in #1365
- Fix mistral and electra tokenizer to match kaggle changes by @mattdangerw in #1387
- Align requirments with Keras by @sampathweb in #1386
- Add a preprocessor for the Mistral backbone by @tirthasheshpatel in #1385
- Switch to always expect full Kaggle preset handles by @mattdangerw in #1390
New Contributors
- @calvingiles made their first contribution in #1262
- @tirthasheshpatel made their first contribution in #1283
- @pnacht made their first contribution in #1305
- @grasskin made their first contribution in #1324
- @pranavvp16 made their first contribution in #1291
- @mbrhd made their first contribution in #1364
Full Changelog: v0.6.4...v0.7.0
v0.6.4
Summary
This point release simplifies our support for Keras 3 and Keras 2.
- If Keras 2 is installed, KerasNLP will use Keras 2 and TensorFlow.
- If Keras 3 is installed, KerasNLP will use Keras 3 and run on any backend.
If you have any issue installing KerasNLP, please open an issue.
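To check which of the two cases applies in your environment, you can inspect the installed Keras version. The sketch below only mirrors the two bullets above; the helper names are hypothetical, it assumes Keras is installed under the distribution name "keras", and it does not claim to reproduce how KerasNLP itself makes the decision.

```python
from importlib import metadata


def keras_major_version(version_string):
    """Extract the major version number from a string such as '3.0.0'."""
    return int(version_string.split(".")[0])


def describe_keras_setup(version_string):
    """Summarize which case from the release notes applies."""
    if keras_major_version(version_string) >= 3:
        return "Keras 3: KerasNLP can run on any backend."
    return "Keras 2: KerasNLP will use TensorFlow."


def installed_keras_version():
    """Return the installed Keras version, or None if it is missing."""
    try:
        return metadata.version("keras")
    except metadata.PackageNotFoundError:
        return None


version = installed_keras_version()
if version is None:
    print("Keras is not installed.")
else:
    print(describe_keras_setup(version))
```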
What's Changed
- 0.6.4 cherry picks by @mattdangerw in #1350
- Version bump for 0.6.4.dev0 pre-release by @mattdangerw in #1351
- Version bump for 0.6.4 release by @mattdangerw in #1356
Full Changelog: v0.6.3...v0.6.4
v0.6.3
Summary
This release adds support for running KerasNLP against Keras 3. You can try this today by installing tf-nightly and tensorflow-text-nightly.
pip install keras-nlp
pip uninstall -y tensorflow-text tensorflow keras
pip install tensorflow-text-nightly tf-nightly
Otherwise, this release should be a no-op for all users. No new features, no change in default behavior.
Upcoming changes
After the release of Keras 3, we will drop support for running KerasNLP against the Keras Core package (no more import keras_core as keras), in favor of Keras 3. Keras 3 is the long-term replacement for Keras Core.
What's Changed
- Cherry picks for 0.6.3 by @mattdangerw in #1297
- Version bump 0.6.3 by @mattdangerw in #1298
- Bump the version to 0.6.3.dev1 by @mattdangerw in #1301
- Version bump to 0.6.3 by @mattdangerw in #1302
Full Changelog: v0.6.2...v0.6.3