Fixes for website when running on case-sensitive filesystems. #753

Merged · 2 commits · May 25, 2024
LLama.Web/LLama.Web.csproj (2 changes: 1 addition & 1 deletion)
@@ -15,7 +15,7 @@
   </ItemGroup>
 
   <ItemGroup>
-    <PackageReference Include="Microsoft.AspNetCore.Mvc.Razor.RuntimeCompilation" Version="7.0.18" />
+    <PackageReference Include="Microsoft.AspNetCore.Mvc.Razor.RuntimeCompilation" Version="8.0.5" />
     <PackageReference Include="System.Linq.Async" Version="6.0.1" />
   </ItemGroup>

LLama.Web/Pages/Index.cshtml (2 changes: 1 addition & 1 deletion)
@@ -103,7 +103,7 @@
 }
 
 @section Scripts {
-    <script src="~/js/sessionconnectionchat.js"></script>
+    <script src="~/js/sessionConnectionChat.js"></script>
     <script>
         createConnectionSessionChat();
     </script>
LLama.Web/Pages/Shared/_Layout.cshtml (1 change: 0 additions & 1 deletion)
@@ -6,7 +6,6 @@
     <title>@ViewData["Title"] - LLamaSharp Web</title>
     <link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-QWTKZyjpPEjISv5WaRU9OFeRpok6YctnYmDr5pNlyT2bRjXh0JMhjY6hW+ALEwIH" crossorigin="anonymous">
     <link rel="stylesheet" href="~/css/site.css" asp-append-version="true" />
-    <link rel="stylesheet" href="~/LLama.Web.styles.css" asp-append-version="true" />
     <link rel="icon" href="~/image/llama-sharp.png" sizes="32x32" type="image/png">
     <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/font/bootstrap-icons.min.css">
 </head>
LLama.Web/Program.cs (4 changes: 2 additions & 2 deletions)
@@ -10,7 +10,7 @@
 if (builder.Environment.IsDevelopment())
 {
     mvcBuilder.AddRazorRuntimeCompilation();
-    builder.Configuration.AddJsonFile("appSettings.Local.json");
+    builder.Configuration.AddJsonFile("appsettings.Local.json", true);
 }
 
 builder.Services.AddSignalR();
@@ -47,4 +47,4 @@
 
 app.MapHub<SessionConnectionHub>(nameof(SessionConnectionHub));
 
-app.Run();
\ No newline at end of file
+app.Run();
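A note on the new second argument to `AddJsonFile`: it corresponds to the `optional` parameter, so a fresh checkout without the git-ignored local settings file now starts cleanly instead of throwing. A minimal sketch of the resulting startup block, with the parameter named for clarity (the PR passes it positionally, and the `mvcBuilder` setup is assumed to come from `AddRazorPages` as is typical):

```cs
var builder = WebApplication.CreateBuilder(args);
var mvcBuilder = builder.Services.AddRazorPages();

if (builder.Environment.IsDevelopment())
{
    mvcBuilder.AddRazorRuntimeCompilation();
    // With optional: true, a missing appsettings.Local.json is skipped
    // instead of raising FileNotFoundException at startup.
    builder.Configuration.AddJsonFile("appsettings.Local.json", optional: true);
}
```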
LLama.Web/appsettings.json (3 changes: 0 additions & 3 deletions)
@@ -8,9 +8,6 @@
   "AllowedHosts": "*",
   "LLamaOptions": {
     "ModelLoadType": 0,
-
-    // If you would like to add your own local model files then it's best to create an appSettings.Local.json file
-    // and add them there. The appSettings.Local.json file will be ignored by Git.
     "Models": [
       {
         "Name": "Example LLama2-7b-Chat",
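For anyone following the advice in the removed comment, a hypothetical appsettings.Local.json might look like the sketch below. The "ModelPath" key and the file path are assumptions for illustration; check LLama.Web's options classes for the exact model schema:

```json
{
  "LLamaOptions": {
    "Models": [
      {
        "Name": "My local LLama2-7b-Chat",
        "ModelPath": "models/llama-2-7b-chat.Q4_K_M.gguf"
      }
    ]
  }
}
```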
README.md (17 changes: 11 additions & 6 deletions)
@@ -172,12 +172,17 @@ For more examples, please refer to [LLamaSharp.Examples](./LLama.Examples).
 
 ## 💡FAQ
 
-#### Why GPU is not used when I have installed CUDA
+#### Why is my GPU not used when I have installed CUDA?
 
-1. If you are using backend packages, please make sure you have installed the CUDA backend package which matches the CUDA version install on your system. Please note that before LLamaSharp v0.10.0, only one backend package should be installed at a time.
-2. Add `NativeLibraryConfig.Instance.WithLogCallback(delegate (LLamaLogLevel level, string message) { Console.Write($"{level}: {message}"); } )` to the very beginning of your code. The log will show which native library file is loaded. If the CPU library is loaded, please try to compile the native library yourself and open an issue for that. If the CUDA library is loaded, please check if `GpuLayerCount > 0` when loading the model weight.
+1. If you are using backend packages, please make sure you have installed the CUDA backend package which matches the CUDA version installed on your system. Please note that before LLamaSharp v0.10.0, only one backend package should be installed at a time.
+2. Add the following line to the very beginning of your code. The log will show which native library file is loaded. If the CPU library is loaded, please try to compile the native library yourself and open an issue for that. If the CUDA library is loaded, please check if `GpuLayerCount > 0` when loading the model weight.
 
-#### Why the inference is slow
+```cs
+NativeLibraryConfig.Instance.WithLogCallback(delegate (LLamaLogLevel level, string message) { Console.Write($"{level}: {message}"); } )
+```
+
+
+#### Why is the inference so slow?
 
 Firstly, due to the large size of LLM models, it requires more time to generate output than other models, especially when you are using models larger than 30B parameters.
 
@@ -187,14 +192,14 @@ To see if that's a LLamaSharp performance issue, please follow the two tips below
 2. If it's still slower than you expect it to be, please try to run the same model with same setting in [llama.cpp examples](https://github.com/ggerganov/llama.cpp/tree/master/examples). If llama.cpp outperforms LLamaSharp significantly, it's likely a LLamaSharp BUG and please report that to us.
 
 
-#### Why is the program crashing before any output is generated
+#### Why does the program crash before any output is generated?
 
 Generally, there are two possible cases for this problem:
 
 1. The native library (backend) you are using is not compatible with the LLamaSharp version. If you compiled the native library yourself, please make sure you have checked-out llama.cpp to the corresponding commit of LLamaSharp, which can be found at the bottom of README.
 2. The model file you are using is not compatible with the backend. If you are using a GGUF file downloaded from huggingface, please check its publishing time.
 
-#### Why my model is generating output infinitely
+#### Why is my model generating output infinitely?
 
 Please set anti-prompt or max-length when executing the inference.
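Tying the rewritten GPU FAQ answer together, a minimal diagnostic sketch: the model path is a placeholder, and the `ModelParams`/`LLamaWeights` usage should be checked against your installed LLamaSharp version:

```cs
using System;
using LLama;
using LLama.Common;
using LLama.Native;

// Register the log callback before anything touches the native library,
// so the log reports whether the CPU or CUDA build gets loaded.
NativeLibraryConfig.Instance.WithLogCallback(
    (LLamaLogLevel level, string message) => Console.Write($"{level}: {message}"));

var parameters = new ModelParams("path/to/your-model.gguf") // placeholder path
{
    GpuLayerCount = 32 // must be greater than 0 for the GPU to do any work
};

using var weights = LLamaWeights.LoadFromFile(parameters);
```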
