-
Notifications
You must be signed in to change notification settings - Fork 1
PRSD-1023: Efficient LA Property Search #836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
PRSD-1023: Efficient LA Property Search #836
Conversation
| CREATE TRIGGER update_property_ownership_single_line_address | ||
| AFTER UPDATE OF single_line_address ON address | ||
| FOR EACH ROW | ||
| WHEN (OLD.single_line_address IS DISTINCT FROM NEW.single_line_address) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this approach not massively slow down the seeding of the DB with NGD address data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been removed as we'll be referencing property ownerships from addresses rather than the other way around
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, querying the address table is much slower than querying the property ownership table, even with a partial index. I'll go back to referencing addresses from property ownerships. To combat slowing down NGD data loading, we can refresh the addresses once after loading finishes, rather than using this trigger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intrigued that it takes longer - would expect it to be at least similar - but yes, if we can update them for each batch in one go during ingest instead of using the trigger that should at least limit the slow-down there
|
Does the 2 second average drop if we disallow searches below a certain number of characters? Or is that the best case? |
src/main/resources/db/migrations/V1_3_0__add_single_line_address_to_property_ownership.sql
Show resolved
Hide resolved
src/main/resources/db/migrations/V1_3_0__add_single_line_address_to_property_ownership.sql
Show resolved
Hide resolved
| @@ -0,0 +1,389 @@ | |||
| -- This script populates the database with n^2 landlords, where 'n' is the cardinality of the name array passed to load_search_data(). | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I paused a bit over doing all of this in SQL though vs doing the manipulation on the host machine in either a Kotlin gradle task or a standalone node/TS script and just doing the batched inserts into the database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned on slack - on reflection this is fine given we'll use it so infrequently
src/main/kotlin/uk/gov/communities/prsdb/webapp/database/entity/Address.kt
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some comments - I'm not 100% sure about doing this all in SQL as assuming that this is run from a dev laptop it'll all execute on the DB instance rather than the (much more powerful) host. It's also not the most readable language for non-trivial logic! But open to being persuaded that we should do it this way
Not necessarily, as longer words that appear in many addresses (e.g. 'Manchester') will still return a large number of results. |
Ticket number
PRSD-1023
Goal of change
Makes LA property search efficient
Description of main change(s)
Anything you'd like to highlight to the reviewer?
Checklist
Delete any that are not applicable, and add explanation below for any that are applicable but haven't been done
and any related functionality)