issue #9, Developing a Search Function Test Utilizing LLM #47
Merged
149 commits
f3db963
issue #41 - creation of the sql view
melanie-fressard 9362d33
issue #44: fix package configuration
k-allagbe acb8b5d
issue #44: fix module import
k-allagbe 17c7877
Merge remote-tracking branch 'origin/main' into 41-individual-scoring
melanie-fressard da2ad52
issue #41 - adding avg score
melanie-fressard 8ba87c8
modification of .env.template to match standards
melanie-fressard 9adc2d8
imports modification
melanie-fressard 31df218
Fixes #9, new script
4f94a84
issue#24, change louis.db for ailab.db
c90769a
issue #41 - creation of init in ailab/db/finesse
melanie-fressard e668662
issue #20 - moving search.py
melanie-fressard 30f6069
issue #41 - script to use the code
melanie-fressard 67eca6e
issue #41 - debbug connexion db
melanie-fressard c97cd7d
issue #41 - debbug sql function
melanie-fressard 21c6614
issue #41 - formatting
melanie-fressard 403f782
issue #41 - separating creation and select
melanie-fressard 32bc58a
issue #41 - csv file output + correction sql
melanie-fressard a1d7ab3
issue #41 - suppressing similarity column
melanie-fressard a3d10d2
issue #41 - reworking output making
melanie-fressard 4924fc7
changes in template
melanie-fressard 68cf448
issue #49 - modification from changing repo name
melanie-fressard 3c38ab7
issue #49 - results file by query name
melanie-fressard ecfc757
Adding base script
JolanThomassin a687fdc
adressing issue #48 and final line
melanie-fressard 92f327a
issue #49 - minor correction + script to launch
melanie-fressard 08401f8
Fixes #9, Query and LLM Q&A generation
JolanThomassin 7a195f1
changes of the call pf the search function
melanie-fressard 6888691
issue #49 - search output
melanie-fressard 67d5b63
line eof
melanie-fressard 8f34a6c
Enhances Chunk Selection Quality - Resolves #9
JolanThomassin 93841ef
lintests fails attempt to correct
melanie-fressard 16a5528
lintests fails attempt to correct
melanie-fressard 4363ef1
Merge branch '49-expand-search-examples' of https://github.com/ai-cfi…
melanie-fressard baaa739
issue #41 - eof line
melanie-fressard 963c049
issue #49 - secrets correction
melanie-fressard 7f436f1
issue #41 - eof without tabs
melanie-fressard bb483ed
issue #49 - removing search balise
melanie-fressard 1e40ae9
eof
melanie-fressard 009d2de
Fixes #9, new SQL files
JolanThomassin 53758f5
issue #41 - avg instead of sum
melanie-fressard eb5e1e1
Fixes #9, save, check tokens, better query
JolanThomassin 9bd6b6c
issue #49 - linttest correction
melanie-fressard 27bec9f
test
melanie-fressard ffb7126
issue #31 - correction of lint test
melanie-fressard 7f4d360
Merge pull request #50 from ai-cfia/49-expand-search-examples
melanie-fressard 9428c57
issue #31 - resolve lint test
melanie-fressard 080a005
issue #31 - lint test merge
melanie-fressard 352b767
issue #41 - solve merge conflict
melanie-fressard 31ee685
Merge branch 'main' into 41-individual-scoring
melanie-fressard 776cbfb
Merge pull request #42 from ai-cfia/41-individual-scoring
melanie-fressard 6446b30
issue #54: workflow call to main
vivalareda f5ca3e4
Merge pull request #55 from ai-cfia/issue-54-fix-workflow-for-ailab-db
vivalareda 8871006
issue #56: run deploy only on main
vivalareda b862a4a
Merge branch 'main' into k-allagbe/issue44-package-submodule-configur…
k-allagbe 3d44f7c
Merge pull request #60 from ai-cfia/k-allagbe/issue44-package-submodu…
k-allagbe f9e0bc7
Bump certifi from 2023.5.7 to 2023.7.22
dependabot[bot] e7cec22
Bump urllib3 from 2.0.3 to 2.0.7
dependabot[bot] d8079a5
Bump aiohttp from 3.8.4 to 3.8.6
dependabot[bot] 59cf33b
Merge branch 'main' into 56-run-deploy-workflow-step-only-if-branch-i…
vivalareda ffe988a
Merge pull request #57 from ai-cfia/56-run-deploy-workflow-step-only-…
vivalareda 021b3cf
Merge branch 'main' into dependabot/pip/certifi-2023.7.22
melanie-fressard 782a6a2
Merge branch 'main' into dependabot/pip/urllib3-2.0.7
melanie-fressard 7ece22e
Merge branch 'main' into dependabot/pip/aiohttp-3.8.6
melanie-fressard e9164f6
Merge pull request #53 from ai-cfia/dependabot/pip/urllib3-2.0.7
melanie-fressard 6ddcca1
Merge branch 'main' into dependabot/pip/aiohttp-3.8.6
melanie-fressard f372a04
Merge pull request #51 from ai-cfia/dependabot/pip/aiohttp-3.8.6
melanie-fressard 6c382d9
Merge branch 'main' into dependabot/pip/certifi-2023.7.22
melanie-fressard f29d8d6
Merge pull request #52 from ai-cfia/dependabot/pip/certifi-2023.7.22
melanie-fressard eea27c0
Bump aiohttp from 3.8.6 to 3.9.0
dependabot[bot] 6e58fa0
Fixes #9, new SQL script
JolanThomassin 62ac36d
issue #61 - add md file
melanie-fressard 5733ba0
log on schema
melanie-fressard 8b6b929
fix of lint tests
melanie-fressard 1bfa1cf
ruff error
melanie-fressard 8607c58
Merge pull request #63 from ai-cfia/dependabot/pip/aiohttp-3.9.0
melanie-fressard 6525f59
issue #61 - adding jolan's scores
melanie-fressard 9b21633
issue #61 - adding a title and last updated date
melanie-fressard f83b7b3
Fixes #9, refactored code
JolanThomassin 8fca008
Fixes #9, black formatter
JolanThomassin 27b5df4
issue #61 - added similarity
melanie-fressard 0e0365c
issue #61 - add file where each score is computed
melanie-fressard 3e93698
Merge branch 'main' into 61-explain-scores-and-weights
melanie-fressard f1a39a4
issue #61 - removing common standards
melanie-fressard 6ae4b00
Merge remote-tracking branch 'refs/remotes/origin/61-explain-scores-a…
melanie-fressard 1f06baa
issue #61 - adding scale for each score
melanie-fressard 656f100
issue #61 - changing description of didactic
melanie-fressard c8d5b51
issue #61 - link to file
melanie-fressard 40f270e
adding future scores
melanie-fressard 7db06b4
Merge pull request #64 from ai-cfia/61-explain-scores-and-weights
melanie-fressard 5e716ce
Fixes #9, new SQL scripts
JolanThomassin 8d2e67f
Fixes #9, code clarification
JolanThomassin afffd40
Fixes #9, unit test for search qna function
JolanThomassin 42965b0
Fixes #9, set schema fix
JolanThomassin 2c0e141
Fixes #9, adding seed to get random chunk
JolanThomassin 4f869e7
Fixes #9, script rename
JolanThomassin 12448f0
Fixes #9, delete old script
JolanThomassin 86adfa3
Fixes #9, renaming scripts mistakes
JolanThomassin 9e4aa52
Fixes #9, cursor only open once
JolanThomassin a918685
Fixes #9, black formatter
JolanThomassin e9e0533
Fixes #9, character length
4f8664d
Fixes #9, magic string
6d85615
Fixes #9, argparse
384e58e
Fixes #9, new script
9fa9902
issue#24, change louis.db for ailab.db
6bc85ed
Fixes #9, Query and LLM Q&A generation
JolanThomassin 38d8e42
Enhances Chunk Selection Quality - Resolves #9
JolanThomassin c8e68bb
Fixes #9, new SQL files
JolanThomassin 02252f0
Fixes #9, save, check tokens, better query
JolanThomassin 12b3042
Fixes #9, new SQL script
JolanThomassin 5116063
Fixes #9, refactored code
JolanThomassin e509c4b
Fixes #9, black formatter
JolanThomassin 9b2f83b
Fixes #9, new SQL scripts
JolanThomassin f3585d3
Fixes #9, code clarification
JolanThomassin 3b89bf6
Fixes #9, unit test for search qna function
JolanThomassin b172f5f
Fixes #9, set schema fix
JolanThomassin 8fb2add
Fixes #9, adding seed to get random chunk
JolanThomassin daffde7
Fixes #9, script rename
JolanThomassin f1d6ec5
Fixes #9, delete old script
JolanThomassin 131830f
Fixes #9, renaming scripts mistakes
JolanThomassin 8df4ba4
Fixes #9, cursor only open once
JolanThomassin 0e9dfe5
Fixes #9, black formatter
JolanThomassin a625206
Fixes #9, character length
408acb5
Fixes #9, magic string
53e7494
Fixes #9, argparse
d08f373
Merge remote-tracking branch 'origin/issue#9-search-function-test-jt'…
2565a5c
Fixes #9, fixed ruff error
3d1266d
Fixes #9, file rename
c25f8fc
Fixes #9, test first function
a110fad
Fixes #9, clearer JSON template
JolanThomassin 9215998
Fixes #9, missing line break
JolanThomassin 4c02761
Fixes #9, path changes for test
JolanThomassin b1a769e
Fixes #9, new ENV var for schema
JolanThomassin 6b3ede6
Fixes #9, test_generate_question
JolanThomassin ee5374e
Fixes #9, lint ruff error
JolanThomassin b721451
Fixes #9, add black formatter extension
JolanThomassin 418399b
Fixes #9, add semver to requirements
JolanThomassin 39e159d
Fixes #9, changes semver version
JolanThomassin b86fbe9
Fixes #9, test for db failure
JolanThomassin b7e5b56
Fixes #9, replace sys.exit(1)
JolanThomassin 2046aa7
Fixes #9, import removed
JolanThomassin a40b7d5
Fixes #9, separate save for test
JolanThomassin 7ecb190
Fixes #9, import at the top
JolanThomassin 252d3b3
Fixes #9, import at the top
JolanThomassin d19bb55
Fixes #9, fixed number of generated question
JolanThomassin f1b016a
Fixes #9, adding "question_quality" variable
JolanThomassin ca58322
Fixes #9, random query new method
JolanThomassin 8cd1664
Fixes #9, adding parameter into call
JolanThomassin fa67e42
Fixes #9, user_prompt more example
JolanThomassin 71a436f
Fixes #9, remove "question_quality" variable
JolanThomassin
@@ -0,0 +1,36 @@
-- Set the search path to the louis_006 schema
SET search_path TO louis_006;

CREATE TABLE IF NOT EXISTS chunk_score (
    id UUID,
    score FLOAT,
    score_type VARCHAR(50)
);

TRUNCATE TABLE chunk_score;

INSERT INTO chunk_score (id, score, score_type)
SELECT
    ch.id, -- Use the id column from the chunk table
    ROUND(
        (
            LENGTH(hc.content) - length_values.min_val
        ) * 1.0 / (length_values.max_val - length_values.min_val),
        1
    ) AS tr_proportion,
    'didactic' AS score_type
FROM
    louis_006.chunk ch
    INNER JOIN louis_006.html_content_to_chunk hctc ON ch.id = hctc.chunk_id
    INNER JOIN louis_006.html_content hc ON hctc.md5hash = hc.md5hash
    CROSS JOIN (
        SELECT
            MIN(LENGTH(content)) AS min_val,
            MAX(LENGTH(content)) AS max_val
        FROM
            louis_006.chunk ch
            INNER JOIN louis_006.html_content_to_chunk hctc ON ch.id = hctc.chunk_id
            INNER JOIN louis_006.html_content hc ON hctc.md5hash = hc.md5hash
    ) AS length_values
ORDER BY
    tr_proportion DESC;
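For readers following the scoring logic, the INSERT above applies min-max scaling to the content length and rounds the result to one decimal. A minimal Python sketch of the same arithmetic follows. The function name `length_scores` and the dictionary input are hypothetical, and the guard for the case where all lengths are equal is an assumption that is not part of the SQL in this diff (the SQL would divide by zero there).

```python
def length_scores(contents: dict[str, str]) -> dict[str, float]:
    """Min-max scale each content length to [0, 1], rounded to one decimal."""
    # Map each (hypothetical) chunk id to the length of its content.
    lengths = {chunk_id: len(text) for chunk_id, text in contents.items()}
    if not lengths:
        return {}

    min_len = min(lengths.values())
    max_len = max(lengths.values())
    span = max_len - min_len

    if span == 0:
        # Assumption: when every chunk has the same length, score them all 0.0
        # instead of dividing by zero.
        return {chunk_id: 0.0 for chunk_id in lengths}

    return {
        chunk_id: round((length - min_len) / span, 1)
        for chunk_id, length in lengths.items()
    }


if __name__ == "__main__":
    sample = {"a": "x" * 10, "b": "x" * 60, "c": "x" * 110}
    print(length_scores(sample))
```

Python's `round` on floats resolves ties differently from PostgreSQL's `ROUND` on numeric values, so a value that falls exactly on a tie may differ by 0.1 between this sketch and the SQL.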
@@ -0,0 +1,6 @@
SELECT
    score,
    count(*) as count
FROM louis_006.chunk_score
GROUP BY score
ORDER BY score;
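To see what the distribution query above returns, here is a small sketch that runs the same GROUP BY against an in-memory SQLite database. SQLite is only a stand-in for PostgreSQL, the `louis_006` schema qualifier is dropped, and the sample rows and the `score_histogram` helper are made up for illustration.

```python
import sqlite3


def score_histogram(conn: sqlite3.Connection) -> list[tuple[float, int]]:
    """Return (score, count) pairs ordered by score."""
    query = """
        SELECT score, count(*) AS count
        FROM chunk_score
        GROUP BY score
        ORDER BY score
    """
    return conn.execute(query).fetchall()


def demo() -> list[tuple[float, int]]:
    """Build a throwaway table with made-up rows and query it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chunk_score (id TEXT, score REAL, score_type TEXT)"
    )
    rows = [
        ("c1", 0.0, "didactic"),
        ("c2", 0.1, "didactic"),
        ("c3", 0.1, "didactic"),
        ("c4", 1.0, "didactic"),
    ]
    conn.executemany("INSERT INTO chunk_score VALUES (?, ?, ?)", rows)
    result = score_histogram(conn)
    conn.close()
    return result


if __name__ == "__main__":
    for score, count in demo():
        print(score, count)
```

Because the scores are rounded to one decimal, a normalized score yields at most eleven distinct buckets, which keeps the histogram easy to read.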
@@ -0,0 +1,3 @@
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'louis_006';
Sorting the whole table to pick a single element is enormously expensive.
How about:
https://www.postgresql.org/docs/current/queries-limit.html
@JolanThomassin
Still not addressed, @JolanThomassin.
I changed it. It looks a little faster now, but the quality stays the same.
It's definitely way faster.
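The query under review is not shown on this page, so the following is only a sketch of the general idea behind the LIMIT/OFFSET suggestion: count the rows, draw a random offset, and fetch exactly one row instead of sorting the whole table by a random value. SQLite stands in for PostgreSQL here, and the `chunk` table layout, the `pick_random_chunk_id` helper, and the `seed` parameter are assumptions made for illustration.

```python
import random
import sqlite3
from typing import Optional


def build_demo_db(n: int) -> sqlite3.Connection:
    """Create an in-memory table with n made-up chunk ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chunk (id TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO chunk VALUES (?)", [(f"c{i}",) for i in range(n)]
    )
    return conn


def pick_random_chunk_id(
    conn: sqlite3.Connection, seed: Optional[int] = None
) -> Optional[str]:
    """Pick one chunk id using a random offset rather than a random sort."""
    (count,) = conn.execute("SELECT count(*) FROM chunk").fetchone()
    if count == 0:
        return None

    # A seeded generator makes the pick reproducible for a given table.
    offset = random.Random(seed).randrange(count)

    # Ordering by the primary key gives a stable row order; the database can
    # usually walk the key's index for this instead of sorting every row.
    row = conn.execute(
        "SELECT id FROM chunk ORDER BY id LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db(10)
    print(pick_random_chunk_id(conn, seed=42))
    conn.close()
```

OFFSET still has to step past the skipped rows, so this approach is not free on very large tables, but it avoids computing and sorting a random key for every row.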