feat: Make openai_realtime_dart client strong-typed #590

Merged · 1 commit · Oct 29, 2024
2 changes: 1 addition & 1 deletion packages/openai_dart/oas/main.dart
@@ -3,7 +3,7 @@
import 'dart:io';

import 'package:openapi_spec/openapi_spec.dart';

/// Generates Chroma API client Dart code from the OpenAPI spec.
/// Generates OpenAI API client Dart code from the OpenAPI spec.
/// Official spec: https://github.com/openai/openai-openapi/blob/master/openapi.yaml
void main() async {
final spec = OpenApi.fromFile(source: 'oas/openapi_curated.yaml');
188 changes: 90 additions & 98 deletions packages/openai_realtime_dart/README.md
@@ -36,28 +36,33 @@
final client = RealtimeClient(
);

// Can set parameters ahead of connecting, either separately or all at once
client.updateSession(instructions: 'You are a great, upbeat friend.');
client.updateSession(voice: 'alloy');
client.updateSession(
turnDetection: {'type': 'none'},
inputAudioTranscription: {'model': 'whisper-1'},
await client.updateSession(instructions: 'You are a great, upbeat friend.');
await client.updateSession(voice: Voice.alloy);
await client.updateSession(
turnDetection: TurnDetection(
type: TurnDetectionType.serverVad,
),
inputAudioTranscription: InputAudioTranscriptionConfig(
model: 'whisper-1',
),
);

// Set up event handling
client.on('conversation.updated', (event) {
client.on(RealtimeEventType.conversationUpdated, (event) {
// item is the current item being updated
final item = event?['item'];
// delta can be null or populated
final delta = event?['delta'];
final result = (event as RealtimeEventConversationUpdated).result;
final item = result.item;
final delta = result.delta;
// you can fetch a full list of items at any time
final items = client.conversation.getItems();
});

// Connect to Realtime API
await client.connect();

// Send an item and trigger a generation
client.sendUserMessageContent([
{'type': 'input_text', 'text': 'How are you?'},
await client.sendUserMessageContent([
const ContentPart.inputText(text: 'How are you?'),
]);
```

@@ -94,7 +99,7 @@
In this package, there are three primitives for interfacing with the Realtime API:
- Thin wrapper over [WebSocket](https://developer.mozilla.org/en-US/docs/Web/API/WebSocket)
- Use this for connecting to the API, authenticating, and sending items
- There is **no item validation**, you will have to rely on the API specification directly
- Dispatches events as `server.{event_name}` and `client.{event_name}`, respectively
- Dispatches events according to the `RealtimeEventType` enum
3. [`RealtimeConversation`](./lib/src/conversation.dart)
- Exists on client instance as `client.conversation`
- Stores a client-side cache of the current conversation
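
Putting the three primitives together, a minimal sketch might look like this. The constructor parameters and the `client.realtime.on` call are assumptions inferred from the examples in this README, not a verified API surface:

```dart
import 'dart:io';

import 'package:openai_realtime_dart/openai_realtime_dart.dart';

Future<void> main() async {
  // 1. RealtimeClient: the high-level interface (session config, tools, helpers).
  final client = RealtimeClient(
    apiKey: Platform.environment['OPENAI_API_KEY'],
  );
  await client.connect();

  // 2. client.realtime: the thin WebSocket wrapper, dispatching raw events.
  client.realtime.on(RealtimeEventType.all, (event) {
    // Log every raw event, e.g. for debugging.
  });

  // 3. client.conversation: the client-side cache of the conversation.
  final items = client.conversation.getItems();
  print('Conversation currently has ${items.length} items');
}
```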
@@ -109,18 +114,18 @@
The client comes packaged with some basic utilities that make it easy to build realtime apps.
Sending messages to the server from the user is easy.

```dart
client.sendUserMessageContent([
{'type': 'input_text', 'text': 'How are you?'},
await client.sendUserMessageContent([
const ContentPart.inputText(text: 'How are you?'),
]);
// or (empty audio)
client.sendUserMessageContent([
{'type': 'input_audio', 'audio': Uint8List(0)},
await client.sendUserMessageContent([
ContentPart.inputAudio(audio: ''), // Base64 encoded audio
]);
```

### Sending streaming audio

To send streaming audio, use the `.appendInputAudio()` method. If you're in `turn_detection: 'disabled'` mode, then you need to use `.createResponse()` to tell the model to respond.
To send streaming audio, use the `.appendInputAudio()` method. If you're in manual mode (no turn detection), then you need to use `.createResponse()` to tell the model to respond.

```dart
// Send user audio, must be Uint8List
@@ -132,53 +137,48 @@
for (var i = 0; i < 10; i++) {
final value = (Random().nextDouble() * 2 - 1) * 0x8000;
data[n] = value.toInt();
}
client.appendInputAudio(data);
await client.appendInputAudio(data);
}
// Pending audio is committed and model is asked to generate
client.createResponse();
await client.createResponse();
```

### Adding and using tools

Working with tools is easy. Just call `.addTool()` and set a callback as the second parameter. The callback will be executed with the parameters for the tool, and the result will be automatically sent back to the model.

```dart
// We can add tools as well, with callbacks specified
client.addTool(
{
'name': 'get_weather',
'description': 'Retrieves the weather for a given lat, lng coordinate pair. Specify a label for the location.',
'parameters': {
await client.addTool(
const ToolDefinition(
name: 'get_weather',
description: 'Retrieves the weather for a location given its latitude and longitude coordinate pair.',
parameters: {
'type': 'object',
'properties': {
'lat': {
'type': 'number',
'description': 'Latitude',
'description': 'Latitude of the location',
},
'lng': {
'type': 'number',
'description': 'Longitude',
},
'location': {
'type': 'string',
'description': 'Name of the location',
'description': 'Longitude of the location',
},
},
'required': ['lat', 'lng', 'location'],
'required': ['lat', 'lng'],
},
},
),
(Map<String, dynamic> params) async {
final result = await HttpClient()
.getUrl(
Uri.parse(
'https://api.open-meteo.com/v1/forecast?'
'latitude=${params['lat']}&'
'longitude=${params['lng']}&'
'current=temperature_2m,wind_speed_10m',
'latitude=${params['lat']}&'
'longitude=${params['lng']}&'
'current=temperature_2m,wind_speed_10m',
),
)
.then((request) => request.close())
.then((response) => response.transform(Utf8Decoder()).join())
.then((res) => res.transform(const Utf8Decoder()).join())
.then(jsonDecode);
return result;
},
@@ -189,19 +189,17 @@

The `.addTool()` method automatically runs a tool handler and triggers a response on handler completion. Sometimes you may not want that, for example: using tools to generate a schema that you use for other purposes.

In this case, we can use the `tools` item with `updateSession`. In this case you **must** specify `type: 'function'`, which is not required for `.addTool()`.
In this case, we can use the `tools` parameter with `updateSession`.

**Note:** Tools added with `.addTool()` will **not** be overridden when updating sessions manually like this, but every `updateSession()` change will override previous `updateSession()` changes. Tools added via `.addTool()` are persisted and appended to anything set manually here.

```dart
client.updateSession(
await client.updateSession(
tools: [
{
'type': 'function',
'name': 'get_weather',
'description':
'Retrieves the weather for a given lat, lng coordinate pair. Specify a label for the location.',
'parameters': {
const ToolDefinition(
name: 'get_weather',
description: 'Retrieves the weather for a location given its latitude and longitude coordinate pair.',
parameters: {
'type': 'object',
'properties': {
'lat': {
@@ -212,124 +210,118 @@
'type': 'number',
'description': 'Longitude',
},
'location': {
'type': 'string',
'description': 'Name of the location',
},
},
'required': ['lat', 'lng', 'location'],
'required': ['lat', 'lng'],
},
},
),
],
);
```

Then, to handle function calls...

```dart
client.on('conversation.item.completed', (event) {
final item = event?['item'] as Map<String, dynamic>?;
if (item?['type'] == 'function_call') {
client.on(RealtimeEventType.conversationItemCompleted, (event) {
final item = (event as RealtimeEventConversationItemCompleted).item;
if (item.item is ItemFunctionCall) {
// your function call is complete, execute some custom code
}
});
```

### Interrupting the model

You may want to manually interrupt the model, especially in `turn_detection: 'disabled'` mode. To do this, we can use:
You may want to manually interrupt the model, especially when not using turn detection. To do this, we can use:

```dart
// id is the id of the item currently being generated
// sampleCount is the number of audio samples that have been heard by the listener
client.cancelResponse(id, sampleCount);
await client.cancelResponse(id, sampleCount);
```

This method will cause the model to immediately cease generation, but also truncate the item being played by removing all audio after `sampleCount` and clearing the text response. By using this method you can interrupt the model and prevent it from "remembering" anything it has generated that is ahead of where the user's state is.
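
One way to obtain a `sampleCount` is to track how much audio has actually been played back. The sketch below assumes PCM16 mono audio (2 bytes per sample) and that `delta.audio` is populated as described in the utility events section; the bookkeeping itself is hypothetical, not part of the package:

```dart
// Track how many samples the listener has heard so far.
var playedSamples = 0;

client.on(RealtimeEventType.conversationUpdated, (event) {
  final result = (event as RealtimeEventConversationUpdated).result;
  final audio = result.delta?.audio;
  if (audio != null) {
    // PCM16 mono: 2 bytes per sample (assumption about the audio format).
    playedSamples += audio.length ~/ 2;
    // ...enqueue `audio` for playback here...
  }
});

// Later, when the user interrupts, truncate the item at the heard position.
Future<void> interrupt(String itemId) async {
  await client.cancelResponse(itemId, playedSamples);
}
```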

## Client events

If you need more manual control and want to send custom client events according to the [Realtime Client Events API Reference](https://platform.openai.com/docs/api-reference/realtime-client-events), you can use `client.realtime.send()` like so:
The `RealtimeClient` provides strongly typed events that map to the [Realtime API Events](https://platform.openai.com/docs/api-reference/realtime-events). You can listen to specific events using the `RealtimeEventType` enum.

```dart
client.realtime.send('conversation.item.create', {
'item': {
'type': 'function_call_output',
'call_id': 'my-call-id',
'output': '{function_succeeded:true}',
},
});
client.realtime.send(
RealtimeEvent.conversationItemCreate(
eventId: RealtimeUtils.generateId(),
item: Item.functionCallOutput(
id: RealtimeUtils.generateId(),
callId: 'my-call-id',
output: '{function_succeeded:true}',
),
),
);
```

### Utility events

With `RealtimeClient` we have reduced the event overhead from server events to **five** main events that are most critical for your application control flow. These events **are not** part of the API specification itself, but wrap logic to make application development easier.
With `RealtimeClient` we have reduced the event overhead from server events to **five** main events that are most critical for your application control flow:

```dart
// Errors like connection failures
client.on('error', (event) {
// do something
client.on(RealtimeEventType.error, (event) {
final error = (event as RealtimeEventError).error;
// do something with the error
});

// In VAD mode, the user starts speaking
// we can use this to stop audio playback of a previous response if necessary
client.on('conversation.interrupted', (event) {
// do something
client.on(RealtimeEventType.conversationInterrupted, (event) {
// handle interruption
});

// Includes all changes to conversations
// delta may be populated
client.on('conversation.updated', (event) {
final item = event?['item'] as Map<String, dynamic>?;
final delta = event?['delta'] as Map<String, dynamic>?;
client.on(RealtimeEventType.conversationUpdated, (event) {
final result = (event as RealtimeEventConversationUpdated).result;
final item = result.item;
final delta = result.delta;

// get all items, e.g. if you need to update a chat window
final items = client.conversation.getItems();

final type = item?['type'] as String?;
switch (type) {
case 'message':
// system, user, or assistant message (item.role)
case 'function_call':
// always a function call from the model
case 'function_call_output':
// always a response from the user / application
if (item?.item case final ItemMessage message) {
// system, user, or assistant message (message.role)
} else if (item?.item case final ItemFunctionCall functionCall) {
// always a function call from the model
} else if (item?.item case final ItemFunctionCallOutput functionCallOutput) {
// always a response from the user / application
}

if (delta != null) {
// Only one of the following will be populated for any given event
// delta['audio'] -> Uint8List, audio added
// delta['transcript'] -> string, transcript added
// delta['arguments'] -> string, function arguments added
// delta.audio -> Uint8List, audio added
// delta.transcript -> string, transcript added
// delta.arguments -> string, function arguments added
}
});

// Only triggered after item added to conversation
client.on('conversation.item.appended', (event) {
final item = event?['item'] as Map<String, dynamic>?;
// item?['status'] -> can be 'in_progress' or 'completed'
client.on(RealtimeEventType.conversationItemAppended, (event) {
final item = (event as RealtimeEventConversationItemAppended).item;
// item.status can be ItemStatus.inProgress or ItemStatus.completed
});

// Only triggered after item completed in conversation
// will always be triggered after conversation.item.appended
client.on('conversation.item.completed', (event) {
final item = event?['item'] as Map<String, dynamic>?;
// item?['status'] -> will always be 'completed'
client.on(RealtimeEventType.conversationItemCompleted, (event) {
final item = (event as RealtimeEventConversationItemCompleted).item;
// item.status will always be ItemStatus.completed
});
```

### Server events

If you want more control over your application development, you can use the `realtime.event` event and choose only to respond to **server** events. The full documentation for these events are available on the [Realtime Server Events API Reference](https://platform.openai.com/docs/api-reference/realtime-server-events).
If you want more control over your application development, you can use the `RealtimeEventType.all` event and choose only to respond to **server** events. The full documentation for these events is available on the [Realtime Server Events API Reference](https://platform.openai.com/docs/api-reference/realtime-server-events).

```dart
// all events, can use for logging, debugging, or manual event handling
client.on('realtime.event', (event ) {
final time = event?['time'] as String?;
final source = event?['source'] as String?;
final eventPayload = event?['event'] as Map<String, dynamic>?;
if (source == 'server') {
// do something
}
client.on(RealtimeEventType.all, (event) {
// Handle any RealtimeEvent
});
```
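
To respond only to specific server events from the catch-all handler, you can branch on the event's runtime type, using the same event classes shown in the examples above (a sketch; exact field names are assumptions):

```dart
client.on(RealtimeEventType.all, (event) {
  switch (event) {
    case RealtimeEventError(:final error):
      // A server-side error was reported.
      print('Error: $error');
    case RealtimeEventConversationItemCompleted(:final item):
      // A conversation item finished streaming.
      print('Completed item: $item');
    default:
      // Ignore everything else.
      break;
  }
});
```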

13 changes: 13 additions & 0 deletions packages/openai_realtime_dart/build.yaml
@@ -0,0 +1,13 @@
targets:
$default:
builders:
source_gen|combining_builder:
options:
ignore_for_file:
- prefer_final_parameters
- require_trailing_commas
- non_constant_identifier_names
- unnecessary_null_checks
json_serializable:
options:
explicit_to_json: true