README.md (+41 -47)
@@ -8,7 +8,7 @@ Features:
- Remote Inferencing: Perform inferencing tasks remotely with Llama models hosted on a remote server (or a serverless localhost setup).
- Simple Integration: With easy-to-use APIs, a developer can quickly integrate Llama Stack into their Android app. The difference between local and remote inferencing is also minimal; see the sketch below.
*Tagged releases are stable versions of the project. While we strive to maintain a stable main branch, it's not guaranteed to be free of bugs or issues.*
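To illustrate how small the local/remote difference is, here is a minimal sketch of setting up both clients. The builder class names follow the example files referenced below (`ExampleLlamaStackLocalInference.kt`, `ExampleLlamaStackRemoteInference`), but the import paths, file paths, URL, and parameters are illustrative assumptions, not a confirmed API:

```kotlin
import com.llama.llamastack.client.local.LlamaStackClientLocalClient
import com.llama.llamastack.client.okhttp.LlamaStackClientOkHttpClient

// Local inference: the model runs on-device via ExecuTorch.
// Model/tokenizer paths and temperature are illustrative assumptions.
val localClient = LlamaStackClientLocalClient.builder()
    .modelPath("/data/local/tmp/llama/llama3_2_1b.pte")
    .tokenizerPath("/data/local/tmp/llama/tokenizer.model")
    .temperature(0.75F)
    .build()

// Remote inference: the same client surface, pointed at a Llama Stack server.
val remoteClient = LlamaStackClientOkHttpClient.builder()
    .baseUrl("http://localhost:5050")
    .build()
```

Downstream calls can then be written once against the common client type, which is what keeps the two modes largely interchangeable.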
@@ -24,7 +24,7 @@ The key files in the app are `ExampleLlamaStackLocalInference.kt`, `ExampleLlama
Add the following dependency in your `build.gradle.kts` file:
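For reference, a minimal dependency block might look like the following; the artifact coordinates are an assumption inferred from the `release/0.1.0` tag used later in this guide, so check the published release for the exact version string:

```kotlin
// build.gradle.kts — coordinates assumed from the release/0.1.0 tag referenced below
dependencies {
    implementation("com.llama.llamastack:llama-stack-client-kotlin:0.1.0")
}
```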
@@ -247,7 +243,7 @@ Create an image inference with agent:
)
```
- Note that image captured on device needs to be encoded with Base64 before sending it to the model. Check out our demo app example here (TO-ADD Image Reasoning section)
+ Note that an image captured on device needs to be Base64-encoded before it is sent to the model. Check out our demo app example [here](https://github.com/meta-llama/llama-stack-apps/tree/main/examples/android_app).
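As a quick illustration of that encoding step, one way to Base64-encode a captured image on Android is shown below; the `Bitmap` source and JPEG settings are illustrative choices, not taken from the demo app:

```kotlin
import android.graphics.Bitmap
import android.util.Base64
import java.io.ByteArrayOutputStream

// Compress the captured Bitmap and encode the bytes as a Base64 string
// suitable for embedding in a model request. Quality 90 is an arbitrary choice.
fun bitmapToBase64(bitmap: Bitmap): String {
    val stream = ByteArrayOutputStream()
    bitmap.compress(Bitmap.CompressFormat.JPEG, 90, stream)
    return Base64.encodeToString(stream.toByteArray(), Base64.NO_WRAP)
}
```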
### Run Simple Inference
@@ -290,7 +286,7 @@ The purpose of this section is to share more details with users that would like
### Prerequisite
You must complete the following steps:
- 1. Clone the repo (`git clone https://github.com/meta-llama/llama-stack-client-kotlin.git -b release/0.0.58`)
+ 1. Clone the repo (`git clone https://github.com/meta-llama/llama-stack-client-kotlin.git -b release/0.1.0`)
2. Port the appropriate ExecuTorch libraries over into your Llama Stack Kotlin library environment.
```
cd llama-stack-client-kotlin-client-local
```
@@ -396,9 +392,7 @@ If you encountered any bugs or issues following this guide please file a bug/iss
## Known Issues
We're aware of the following issues and are working to resolve them:
- 1. Streaming response is a work-in-progress for local and remote inference
- 2. Due to #1, agents are not supported at the time. LS agents only work in streaming mode
- 3. Changing to another model is a work in progress for local and remote platforms
+ - Because of differences in model behavior when handling function calls and special tags such as "ipython", Llama Stack currently returns the streaming event payload for Llama 3.2 1B/3B models as a textDelta object rather than a toolCallDelta object when making a tool call. At the StepComplete event, Llama Stack will still return the entire toolCall detail.
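Until this is resolved, a defensive way to consume the stream is to buffer text deltas and rely on the step-completion payload for the authoritative tool call. The sketch below uses stand-in event types; the real streaming API's names and shapes may differ:

```kotlin
// Illustrative stand-ins for the streaming event types named above — not the client's real API.
sealed interface StreamEvent {
    data class TextDelta(val text: String) : StreamEvent
    data class ToolCallDelta(val toolCallJson: String) : StreamEvent
    data class StepComplete(val toolCallJson: String?) : StreamEvent
}

class StreamingToolCallCollector {
    private val textBuffer = StringBuilder()

    fun onEvent(event: StreamEvent) {
        when (event) {
            // For Llama 3.2 1B/3B a tool call may arrive as plain text deltas,
            // so buffer them rather than expecting a ToolCallDelta.
            is StreamEvent.TextDelta -> textBuffer.append(event.text)
            is StreamEvent.ToolCallDelta -> handleToolCall(event.toolCallJson)
            // StepComplete still carries the full toolCall detail; prefer it
            // over re-parsing the buffered text.
            is StreamEvent.StepComplete -> event.toolCallJson?.let(::handleToolCall)
        }
    }

    private fun handleToolCall(toolCallJson: String) {
        println("tool call: $toolCallJson")
    }
}
```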
## Thanks
We'd like to extend our thanks to the ExecuTorch team for providing their support as we integrated ExecuTorch as one of the local inference distributors for Llama Stack. Check out the [ExecuTorch GitHub repo](https://github.com/pytorch/executorch/tree/main) for more information.
llama-stack-client-kotlin-client-local/src/main/kotlin/com/llama/llamastack/client/local/util/ResponseUtil.kt (+15 -4)
@@ -20,14 +20,12 @@ fun buildInferenceChatCompletionResponse(
CompletionMessage.builder()
.toolCalls(createCustomToolCalls(response))
.content(InterleavedContent.ofString(""))
- // .role(CompletionMessage.Role.ASSISTANT)
.stopReason(mapStopTokenToReason(stopToken))
.build()
} else {
CompletionMessage.builder()
.toolCalls(listOf())
.content(InterleavedContent.ofString(response))
- // .role(CompletionMessage.Role.ASSISTANT)
.stopReason(mapStopTokenToReason(stopToken))
.build()
}
@@ -89,8 +87,21 @@ fun buildInferenceChatCompletionResponseForCustomToolCallStream(
stopToken: String,
stats: Float
): InferenceChatCompletionResponse {
- // Convert ToolCall to ToolCallDelta
- val delta = ContentDelta.ToolCallDelta.builder().toolCall(toolCall.toString()).build()
llama-stack-client-kotlin-core/src/test/kotlin/com/llama/llamastack/models/TelemetryQuerySpansParamsTest.kt (+0 -55)
@@ -23,59 +23,4 @@ class TelemetryQuerySpansParamsTest {
llama-stack-client-kotlin-core/src/test/kotlin/com/llama/llamastack/models/TelemetryQueryTracesParamsTest.kt (+0 -32)
@@ -26,38 +26,6 @@ class TelemetryQueryTracesParamsTest {
llama-stack-client-kotlin-core/src/test/kotlin/com/llama/llamastack/models/ToolRuntimeListToolsParamsTest.kt (+0 -17)
@@ -18,23 +18,6 @@ class ToolRuntimeListToolsParamsTest {