-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
removed unnecessary join of observation fact onto a subquery #13
base: master
Are you sure you want to change the base?
Conversation
…solely from observation fact by refactoring the addition of observation blobs to happen in the former subquery instead of being joined in
Jason, |
Lori, I'll pull some logs today and post the queries here for your review. Thanks, Jason |
This reverts commit 287030e.
…derived solely from observation fact by refactoring the addition of observation blobs to happen in the former subquery instead of being joined in" This reverts commit 9b64588.
…solely from observation fact when no blob data is being pulled in, a more complicated refactor is required to make the same performance improvement for queries involving observation_blobs
Hi Lori, I have reverted the original fix as it broke queries that include observation_blob data. Our PIC-SURE API currently doesn't support observation_blobs, so no queries had been generated that exercised that code path. The new refactor limits the change to only affecting queries that do not involve observation_blobs. If you run this XML query:
A resulting SQL queries look like this: INSERT INTO GLOBAL_TEMP_FACT_PARAM_TABLE (char_param1) If you run the following XML which includes the observation_blob :
The resulting SQL queries are unchanged and look like this: INSERT INTO i2b2demodata.GLOBAL_TEMP_FACT_PARAM_TABLE (char_param1) The performance hit comes from evaluating all the where clauses for the join. This can be avoided for the observation_blob including code paths also by adjusting the inner-most query, but the logic to do so is much more complicated because of the way the SQL is generated. I'd like to (in all that spare time I have ;D ) take a crack at refactoring how the total SQL is generated for this DAO to be template based, but that will have to wait for another day as it would be a substantial effort. There is some significant potential for further performance improvement as the same SQL query seems to be run 5 times in the process of handling a single XML request. This would a much larger architectural change. Thanks for reviewing my pull request and making me fix it! Any feedback is always appreciated! Jason |
Since the subquery is derived solely from observation fact I was able to refactor the addition of observation blobs to happen in the former subquery instead of being joined in the outer query.