4 changes: 2 additions & 2 deletions docs/core/whats-new/dotnet-8/runtime.md
@@ -500,11 +500,11 @@ IDataView predictions = model.Transform(split.TestSet);

.NET 8 introduces several new types aimed at improving app performance.

- The new <xref:System.Collections.Frozen?displayProperty=fullName> namespace includes the collection types <xref:System.Collections.Frozen.FrozenDictionary%602> and <xref:System.Collections.Frozen.FrozenSet%601>. These types don't allow any changes to keys and values once a collection created. That requirement allows faster read operations (for example, `TryGetValue()`). These types are particularly useful for collections that are populated on first use and then persisted for the duration of a long-lived service, for example:
- The new <xref:System.Collections.Frozen?displayProperty=fullName> namespace includes the collection types <xref:System.Collections.Frozen.FrozenDictionary%602> and <xref:System.Collections.Frozen.FrozenSet%601>. These types don't allow any changes to keys and values once a collection is created. That requirement allows faster read operations (for example, `TryGetValue()`). These types are particularly useful for collections that are populated on first use and then persisted for the duration of a long-lived service, for example:

```csharp
private static readonly FrozenDictionary<string, bool> s_configurationData =
LoadConfigurationData().ToFrozenDictionary(optimizeForReads: true);
LoadConfigurationData().ToFrozenDictionary();

// ...
if (s_configurationData.TryGetValue(key, out bool setting) && setting)
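The frozen-collections paragraph above also names `FrozenSet<T>`, which the collapsed snippet doesn't show. A minimal sketch of both types, assuming the .NET 8 `System.Collections.Frozen` API and hypothetical configuration names:

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

internal static class FrozenCollectionsSketch
{
    // Hypothetical data; a real service would populate these once at startup.
    private static readonly FrozenDictionary<string, bool> s_featureFlags =
        new Dictionary<string, bool>
        {
            ["UseNewParser"] = true,
            ["EnableTelemetry"] = false,
        }.ToFrozenDictionary();

    private static readonly FrozenSet<string> s_allowedKeys =
        new[] { "UseNewParser", "EnableTelemetry" }.ToFrozenSet();

    public static void Main()
    {
        // Because the collections can never change after creation,
        // reads like Contains and TryGetValue are optimized.
        if (s_allowedKeys.Contains("UseNewParser") &&
            s_featureFlags.TryGetValue("UseNewParser", out bool enabled) &&
            enabled)
        {
            Console.WriteLine("UseNewParser is enabled.");
        }
    }
}
```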
6 changes: 3 additions & 3 deletions docs/core/whats-new/dotnet-9/libraries.md
@@ -2,13 +2,13 @@
title: What's new in .NET libraries for .NET 9
description: Learn about the new .NET libraries features introduced in .NET 9.
titleSuffix: ""
ms.date: 09/09/2024
ms.date: 10/08/2024
ms.topic: whats-new
---

# What's new in .NET libraries for .NET 9

This article describes new features in the .NET libraries for .NET 9. It's been updated for .NET 9 RC 1.
This article describes new features in the .NET libraries for .NET 9. It's been updated for .NET 9 RC 2.

## Base64Url

@@ -44,7 +44,7 @@ The following example demonstrates using [Dictionary<TKey,TValue>.GetAlternateLo
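The example that hunk header refers to is collapsed in this diff. As a rough illustration only, here is a minimal sketch of `Dictionary<TKey,TValue>.GetAlternateLookup`, assuming .NET 9's span-based alternate lookup with the default string comparer (the dictionary contents are hypothetical):

```csharp
using System;
using System.Collections.Generic;

internal static class AlternateLookupSketch
{
    public static void Main()
    {
        // Hypothetical data; in .NET 9 the default string comparer supports
        // ReadOnlySpan<char> as an alternate key type.
        Dictionary<string, int> wordCounts = new()
        {
            ["hello"] = 1,
            ["world"] = 2,
        };

        // Obtain a lookup that accepts spans, so substrings can be queried
        // without allocating intermediate strings.
        Dictionary<string, int>.AlternateLookup<ReadOnlySpan<char>> lookup =
            wordCounts.GetAlternateLookup<ReadOnlySpan<char>>();

        ReadOnlySpan<char> key = "hello, world".AsSpan(0, 5); // "hello"
        if (lookup.TryGetValue(key, out int count))
        {
            Console.WriteLine($"hello = {count}");
        }
    }
}
```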

### `OrderedDictionary<TKey, TValue>`

In many scenarios, you might want to store key-value pairs in a way where order can be maintained (a list of key-value pairs) but where fast lookup by key is also supported (a dictionary of key-value pairs). Since the early days of .NET, the <xref:System.Collections.Specialized.OrderedDictionary> type has supported this scenario, but only in a non-generic manner, with keys and values typed as <xref:System.Collections.Generic.OrderedDictionary%602> collection, which provides an efficient, generic type to support these scenarios.
In many scenarios, you might want to store key-value pairs in a way where order can be maintained (a list of key-value pairs) but where fast lookup by key is also supported (a dictionary of key-value pairs). Since the early days of .NET, the <xref:System.Collections.Specialized.OrderedDictionary> type has supported this scenario, but only in a non-generic manner, with keys and values typed as `object`. .NET 9 introduces the long-requested <xref:System.Collections.Generic.OrderedDictionary%602> collection, which provides an efficient, generic type to support these scenarios.

The following code uses the new class.

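The snippet that "The following code uses the new class" points to is collapsed in this diff. A minimal sketch of the generic `OrderedDictionary<TKey, TValue>`, assuming the .NET 9 `System.Collections.Generic` API (keys and counts are hypothetical):

```csharp
using System;
using System.Collections.Generic;

internal static class OrderedDictionarySketch
{
    public static void Main()
    {
        // Keys keep their insertion order, yet lookups by key stay dictionary-fast.
        OrderedDictionary<string, int> d = new()
        {
            ["apple"] = 3,
            ["banana"] = 5,
        };

        d.Add("cherry", 2);

        // Insert at a specific position; index-based access is part of the generic type.
        d.Insert(1, "blueberry", 7);

        // Fast lookup by key.
        if (d.TryGetValue("banana", out int count))
        {
            Console.WriteLine($"banana = {count}");
        }

        // Enumeration follows the maintained order: apple, blueberry, banana, cherry.
        foreach (KeyValuePair<string, int> pair in d)
        {
            Console.WriteLine($"{pair.Key} = {pair.Value}");
        }
    }
}
```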
2 changes: 1 addition & 1 deletion docs/core/whats-new/dotnet-9/overview.md
@@ -2,7 +2,7 @@
title: What's new in .NET 9
description: Learn about the new .NET features introduced in .NET 9 for the runtime, libraries, and SDK. Also find links to what's new in other areas, such as ASP.NET Core.
titleSuffix: ""
ms.date: 09/10/2024
ms.date: 10/08/2024
ms.topic: whats-new
---

4 changes: 2 additions & 2 deletions docs/core/whats-new/dotnet-9/runtime.md
@@ -2,12 +2,12 @@
title: What's new in .NET 9 runtime
description: Learn about the new .NET features introduced in the .NET 9 runtime.
titleSuffix: ""
ms.date: 09/09/2024
ms.date: 10/08/2024
ms.topic: whats-new
---
# What's new in the .NET 9 runtime

This article describes new features and performance improvements in the .NET runtime for .NET 9. It's been updated for .NET 9 Preview 7.
This article describes new features and performance improvements in the .NET runtime for .NET 9. It's been updated for .NET 9 RC 2.

## Attribute model for feature switches with trimming support

4 changes: 2 additions & 2 deletions docs/core/whats-new/dotnet-9/sdk.md
@@ -2,13 +2,13 @@
title: What's new in the SDK for .NET 9
description: Learn about the new .NET SDK features introduced in .NET 9, including for unit testing, terminal logger, tool roll-forward, and build script analyzers.
titleSuffix: ""
ms.date: 09/09/2024
ms.date: 10/08/2024
ms.topic: whats-new
---

# What's new in the SDK for .NET 9

This article describes new features in the .NET SDK for .NET 9. It's been updated for .NET RC 1.
This article describes new features in the .NET SDK for .NET 9. It's been updated for .NET RC 2.

## Unit testing

@@ -12,7 +12,7 @@ public static void RunIt()
using Stream vocabStream = File.OpenRead(phi2VocabPath);
using Stream mergesStream = File.OpenRead(phi2MergePath);

Tokenizer phi2Tokenizer = Tokenizer.CreateCodeGen(vocabStream, mergesStream);
Tokenizer phi2Tokenizer = CodeGenTokenizer.Create(vocabStream, mergesStream);
IReadOnlyList<int> ids = phi2Tokenizer.EncodeToIds("Hello, World");
// </CodeGen>
}
24 changes: 9 additions & 15 deletions docs/machine-learning/whats-new/snippets/csharp/Llama.cs
@@ -1,8 +1,8 @@
using Microsoft.ML.Tokenizers;
using System;
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Linq;
using Microsoft.ML.Tokenizers;

internal class Llama
{
@@ -12,7 +12,7 @@ public static void RunIt()
// Create the Tokenizer.
string modelUrl = @"https://huggingface.co/hf-internal-testing/llama-llamaTokenizer/resolve/main/llamaTokenizer.model";
using Stream remoteStream = File.OpenRead(modelUrl);
Tokenizer llamaTokenizer = Tokenizer.CreateLlama(remoteStream);
Tokenizer llamaTokenizer = LlamaTokenizer.Create(remoteStream);

string text = "Hello, World!";

@@ -32,27 +32,21 @@ public static void RunIt()
// idsCount = 5

// Full encoding.
EncodingResult result = llamaTokenizer.Encode(text);
Console.WriteLine($"result.Tokens = {{'{string.Join("', '", result.Tokens)}'}}");
IReadOnlyList<EncodedToken> result = llamaTokenizer.EncodeToTokens(text, out string? normalizedString);
Console.WriteLine($"result.Tokens = {{'{string.Join("', '", result.Select(t => t.Value))}'}}");
// result.Tokens = {'<s>', '▁Hello', ',', '▁World', '!'}
Console.WriteLine($"result.Offsets = {{{string.Join(", ", result.Offsets)}}}");
// result.Offsets = {(0, 0), (0, 6), (6, 1), (7, 6), (13, 1)}
Console.WriteLine($"result.Ids = {{{string.Join(", ", result.Ids)}}}");
Console.WriteLine($"result.Ids = {{{string.Join(", ", result.Select(t => t.Id))}}}");
// result.Ids = {1, 15043, 29892, 2787, 29991}

// Encode up 2 tokens.
int index1 = llamaTokenizer.IndexOfTokenCount(text, maxTokenCount: 2, out string processedText1, out int tokenCount1);
Console.WriteLine($"processedText1 = {processedText1}");
// processedText1 = ▁Hello,▁World!
int index1 = llamaTokenizer.GetIndexByTokenCount(text, maxTokenCount: 2, out string? processedText1, out int tokenCount1);
Console.WriteLine($"tokenCount1 = {tokenCount1}");
// tokenCount1 = 2
Console.WriteLine($"index1 = {index1}");
// index1 = 6

// Encode from end up to one token.
int index2 = llamaTokenizer.LastIndexOfTokenCount(text, maxTokenCount: 1, out string processedText2, out int tokenCount2);
Console.WriteLine($"processedText2 = {processedText2}");
// processedText2 = ▁Hello,▁World!
int index2 = llamaTokenizer.GetIndexByTokenCountFromEnd(text, maxTokenCount: 1, out string? processedText2, out int tokenCount2);
Console.WriteLine($"tokenCount2 = {tokenCount2}");
// tokenCount2 = 1
Console.WriteLine($"index2 = {index2}");
@@ -1 +1,2 @@
Tiktoken.RunIt();
//Tiktoken.RunIt();
Llama.RunIt();
23 changes: 9 additions & 14 deletions docs/machine-learning/whats-new/snippets/csharp/Tiktoken.cs
@@ -1,13 +1,14 @@
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.Tokenizers;

internal class Tiktoken
{
public static void RunIt()
{
// <Tiktoken>
Tokenizer tokenizer = Tokenizer.CreateTiktokenForModel("gpt-4");
Tokenizer tokenizer = TiktokenTokenizer.CreateForModel("gpt-4");
string text = "Hello, World!";

// Encode to IDs.
@@ -26,36 +27,30 @@ public static void RunIt()
// idsCount = 4

// Full encoding.
EncodingResult result = tokenizer.Encode(text);
Console.WriteLine($"result.Tokens = {{'{string.Join("', '", result.Tokens)}'}}");
IReadOnlyList<EncodedToken> result = tokenizer.EncodeToTokens(text, out string? normalizedString);
Console.WriteLine($"result.Tokens = {{'{string.Join("', '", result.Select(t => t.Value))}'}}");
// result.Tokens = {'Hello', ',', ' World', '!'}
Console.WriteLine($"result.Offsets = {{{string.Join(", ", result.Offsets)}}}");
// result.Offsets = {(0, 5), (5, 1), (6, 6), (12, 1)}
Console.WriteLine($"result.Ids = {{{string.Join(", ", result.Ids)}}}");
Console.WriteLine($"result.Ids = {{{string.Join(", ", result.Select(t => t.Id))}}}");
// result.Ids = {9906, 11, 4435, 0}

// Encode up to number of tokens limit.
int index1 = tokenizer.IndexOfTokenCount(
int index1 = tokenizer.GetIndexByTokenCount(
text,
maxTokenCount: 1,
out string processedText1,
out string? processedText1,
out int tokenCount1
); // Encode up to one token.
Console.WriteLine($"processedText1 = {processedText1}");
// processedText1 = Hello, World!
Console.WriteLine($"tokenCount1 = {tokenCount1}");
// tokenCount1 = 1
Console.WriteLine($"index1 = {index1}");
// index1 = 5

int index2 = tokenizer.LastIndexOfTokenCount(
int index2 = tokenizer.GetIndexByTokenCountFromEnd(
text,
maxTokenCount: 1,
out string processedText2,
out string? processedText2,
out int tokenCount2
); // Encode from end up to one token.
Console.WriteLine($"processedText2 = {processedText2}");
// processedText2 = Hello, World!
Console.WriteLine($"tokenCount2 = {tokenCount2}");
// tokenCount2 = 1
Console.WriteLine($"index2 = {index2}");