
Commit d83b1ff

Deprecated functionality
1 parent 561804f commit d83b1ff

File tree

3 files changed (+7 −2 lines)


README.md

+5
@@ -43,6 +43,11 @@ HUGGING_FACE_HUB_TOKEN="<YOUR HF HUB TOKEN>"
 python -m llm_inference --model "cmarkea/bloomz-3b-retriever-v2" --task EMBEDDING
 ```
 
+The server is designed to run one task at a time. There are three different tasks:
+- EMBEDDING
+- SCORING
+- GUARDRAIL
+
 ### API Endpoints
 
 You can access server documentation through this endpoint : `/docs`
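The one-task-per-process design added to the README above can be sketched with a hypothetical argument parser. The flag names mirror the README's command line, but the validation code itself is an illustration, not the project's actual implementation:

```python
import argparse

# The three mutually exclusive tasks listed in the README.
TASKS = ("EMBEDDING", "SCORING", "GUARDRAIL")

def parse_args(argv):
    # Hypothetical parser mirroring `python -m llm_inference`:
    # each server process handles exactly one --task.
    parser = argparse.ArgumentParser(prog="llm_inference")
    parser.add_argument("--model", required=True)
    parser.add_argument("--task", required=True, choices=TASKS)
    return parser.parse_args(argv)

args = parse_args(
    ["--model", "cmarkea/bloomz-3b-retriever-v2", "--task", "EMBEDDING"]
)
```

Running a second task then means starting a second server process with a different `--task` value.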

llm_inference/routes/guardrail.py

+1 −1
@@ -29,7 +29,7 @@ def inference(request: GuardrailRequest) -> GuardrailResponse:
     try:
         with metrics.BATCH_INFERENCE_TIME.time():
             outputs = ServerPipeline().pipeline(
-                request.text, function_to_apply="sigmoid", return_all_scores=True
+                request.text, function_to_apply="sigmoid", top_k=None
             )
     except Exception as e:
         metrics.REQUEST_FAILURE.inc()

llm_inference/routes/scoring.py

+1 −1
@@ -34,7 +34,7 @@ def inference(request: ScoringRequest) -> ScoringResponse:
                 for context in request.contexts
             ],
             function_to_apply="softmax",
-            return_all_scores=True,
+            top_k=None,
         )
     except Exception as e:
         metrics.REQUEST_FAILURE.inc()
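Both hunks track the same Hugging Face `transformers` deprecation: on text-classification pipelines, `return_all_scores=True` was replaced by `top_k=None`, which likewise returns a score for every label. The helper below is an illustrative sketch of that post-processing semantics, not the library's code:

```python
import math

def classify(logits, labels, function_to_apply="sigmoid", top_k=None):
    # Mimics the transformers text-classification post-processing:
    # top_k=None -> score for every label (old return_all_scores=True),
    # top_k=1    -> only the highest-scoring label.
    if function_to_apply == "sigmoid":
        scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    else:  # "softmax"
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        scores = [e / total for e in exps]
    ranked = sorted(
        ({"label": lbl, "score": s} for lbl, s in zip(labels, scores)),
        key=lambda d: d["score"],
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]

all_scores = classify([2.0, -1.0], ["toxic", "safe"])          # both labels
best_only = classify([2.0, -1.0], ["toxic", "safe"], top_k=1)  # top label only
```

With `top_k=None` the routes keep returning the full per-label score list, so the response shape is unchanged; only the deprecated keyword goes away.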
