Commit 9345bb9

fix: preprocess reference in BM25 algorithm
1 parent 8ec668a commit 9345bb9

2 files changed, +3 -2 lines changed

findlike/wrappers.py (+2 -1)

@@ -59,6 +59,7 @@ def fit(self, documents: list[str]):
         self._model = BM25Okapi(self.tokenized_documents_)
 
     def get_scores(self, source: str):
-        tokenized_source = self.processor.tokenizer(source)
+        clean_source = self.processor.preprocessor(source)
+        tokenized_source = self.processor.tokenizer(clean_source)
         scores = self._model.get_scores(tokenized_source)
         return scores
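
The change above matters because a query that is tokenized without the cleaning step applied to the indexed documents can fail to match their tokens. The following is a minimal sketch of that effect, not findlike's code: `preprocessor`, `tokenizer`, and `overlap_scores` are hypothetical stand-ins, a plain token-overlap count replaces BM25, and it assumes (the diff does not show this) that `fit()` cleans documents before tokenizing them.

```python
import re


def preprocessor(text: str) -> str:
    """Hypothetical cleaner: lowercase and replace punctuation with spaces."""
    return re.sub(r"[^\w\s]", " ", text.lower())


def tokenizer(text: str) -> list[str]:
    """Hypothetical tokenizer: split on whitespace."""
    return text.split()


documents = ["BM25 ranks documents.", "Tokenizers split text!"]

# Index side: documents are cleaned before tokenization (assumed fit() behaviour).
index_tokens = [set(tokenizer(preprocessor(doc))) for doc in documents]


def overlap_scores(query_tokens: list[str]) -> list[int]:
    """Count query tokens found in each document (a crude stand-in for BM25)."""
    return [sum(token in doc for token in query_tokens) for doc in index_tokens]


source = "BM25 Ranks Documents."

# Before the fix: the raw query is tokenized directly.
before = overlap_scores(tokenizer(source))

# After the fix: the query goes through the same cleaning as the documents.
after = overlap_scores(tokenizer(preprocessor(source)))

print(before)  # [0, 0]
print(after)   # [3, 0]
```

In this sketch the raw query yields tokens such as `Documents.` that never appear in the cleaned index, so every score is zero until the query is cleaned the same way.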

pyproject.toml (+1 -1)

@@ -1,6 +1,6 @@
 [project]
 name = "findlike"
-version = "1.4.1"
+version = "1.4.2"
 authors = [{ name = "Bruno Arine", email = "[email protected]" }]
 description = "findlike is a package to retrieve similar documents"
 readme = "README.md"
