
Improve Sync To User performance (batch 3) #1898

Merged: 6 commits merged into master on Dec 23, 2020
Conversation

prabhanshuguptagit (Contributor) commented Dec 17, 2020

Story card: ch2086

Because

Whilst perf-testing block syncs, we discovered various inefficiencies in our sync (GET) code paths that get exacerbated when many users re-sync at the same time.

A full list of improvements and the rationale is here.

This addresses

This cleans up some CPU-intensive, inefficient code in our sync-to-user pipeline. Most notably:

  • Use Oj instead of the default Rails serializer (to_json). This is a drop-in replacement and improves our render times by ~2.5x; a sketch of the change is shown after this list.
  • Some cleaning up in the transformers.
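For illustration only (this is not code from the PR; the controller snippet and variable names are hypothetical), a minimal sketch of the Oj drop-in, where mode: :compat keeps the output compatible with the default Rails/JSON-gem behaviour:

    require "oj"

    # Hypothetical controller snippet; `payload` stands in for the
    # already-transformed response hash a sync controller builds.
    #
    # Before (default Rails serializer):
    #   render json: payload.to_json
    #
    # After (drop-in replacement; :compat mode mimics the JSON gem's output):
    render json: Oj.dump(payload, mode: :compat)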

-  AuditLog.create_logs_async(current_user, records_to_sync, "fetch", Time.current) unless disable_audit_logs?
+  records = records_to_sync
+
+  AuditLog.create_logs_async(current_user, records, "fetch", Time.current) unless disable_audit_logs?
Contributor:

Can we remove this? If we still need the logging, we could send it to Datadog I guess, but it's adding overhead to the sync path here:

    records_by_class = records.group_by { |record| record.class.to_s }

    records_by_class.each do |record_class, records_for_class|
      log_data = {
        user_id: user.id,
        record_class: record_class,
        record_ids: records_for_class.map(&:id),
        action: action,
        time: time
      }.to_json

Note that the above is not done async; it's done before the handoff to the Sidekiq job.

If no one has looked at these logs in the past month or so, maybe we can just drop them.

Contributor:

We debated this a fair bit, and yes, that block of code is synchronous. No one has looked at these logs, not just in the past month or so, but ever in the last 3 years.

However, the argument for keeping this is that it acts as a fail-safe, I think; it's like insurance.

I'll post some numbers shortly for all of these, which will show that the time consumed by this is small relative to the other things, which is why we didn't bother cleaning it up.

Contributor:

We can't drop these or move them to Datadog, afaik.

The purpose of this is to be able to audit which users have accessed specific data.
The example scenario we've used for this is: a legal authority asks us for information on which users have accessed data related to x.

Unless something has changed so that this is no longer a scenario we need to provide logs for, we'll need to keep this around in some form. Datadog doesn't work because we only retain data for 30 days there.

Contributor:

Correcto, this is a compliance issue that we need to be bullet-proof against. We'll have to take the perf hit for it. That being said, if there is any data massaging that we can push into an async job, all the better.
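Purely as an illustration of the "push the data massaging into an async job" idea (this is not code from the PR; the worker, queue, and logging names below are hypothetical), the group_by and JSON building could move inside the Sidekiq job, leaving only a cheap map on the sync path:

    # Hypothetical worker: only (class name, id) pairs are built synchronously;
    # the grouping and JSON encoding happen in the background job.
    class CreateAuditLogsWorker
      include Sidekiq::Worker

      # class_name_id_pairs looks like [["Patient", "uuid-1"], ["Encounter", "uuid-2"], ...]
      def perform(user_id, class_name_id_pairs, action, time)
        class_name_id_pairs
          .group_by { |(class_name, _id)| class_name }
          .each do |record_class, pairs|
            log_data = {
              user_id: user_id,
              record_class: record_class,
              record_ids: pairs.map { |(_class_name, id)| id },
              action: action,
              time: time
            }.to_json
            Rails.logger.info(log_data)
          end
      end
    end

    # The sync path would then only do:
    # CreateAuditLogsWorker.perform_async(
    #   current_user.id,
    #   records_to_sync.map { |record| [record.class.to_s, record.id] },
    #   "fetch",
    #   Time.current.iso8601
    # )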

Contributor (author):

[screenshot: response-time breakdown showing the audit logging step]

Fwiw this takes very little time (only ~1% of the response time, ~20ms). I think we can keep this around, even without moving out the data massaging, and come back to it later in a fun Friday cleanup.

   end

   def to_response(model)
-    rename_attributes(model.attributes, inverted_key_mapping).as_json
+    rename_attributes(model.attributes, to_response_key_mapping).as_json
Contributor:

Do we need as_json here? I think to_json does that for us, and so Oj.dump would do the same. We may be invoking as_json twice for every payload; worth testing at least.
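One way to check that would be a quick micro-benchmark with Ruby's standard Benchmark module (an illustrative sketch; the model and record count are placeholders, not from this PR):

    require "benchmark"

    records = Encounter.limit(1_000).to_a

    Benchmark.bm(15) do |x|
      x.report("as_json once")  { records.each(&:as_json) }
      x.report("as_json twice") { records.each { |r| r.as_json.as_json } }
    end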

prabhanshuguptagit (author) commented Dec 18, 2020:

I tried removing this in 3fbcacc, which caused these specs to fail. I haven't dug into this fully, but as_json does something weird with timestamps:

[3] pry(main)> e = Encounter.first
=> #<Encounter:0x00007fcb8b256f80
....
 encountered_on: Thu, 23 Jan 2020,
 created_at: Thu, 23 Jan 2020 07:36:27 UTC +00:00,
 updated_at: Thu, 23 Jan 2020 07:36:27 UTC +00:00>
[4] pry(main)> e.as_json
=> { ...
 "encountered_on"=>Thu, 23 Jan 2020,
 "created_at"=>Thu, 23 Jan 2020 07:36:27 UTC +00:00,
 "updated_at"=>Thu, 23 Jan 2020 07:36:27 UTC +00:00}
[5] pry(main)> e.as_json.as_json
=> {...
"encountered_on"=>"2020-01-23",
 "created_at"=>"2020-01-23T07:36:27.163Z",
 "updated_at"=>"2020-01-23T07:36:27.163Z"}

I didn't want to change this in case the app is relying on this behaviour, and to keep the surface of change small. Will dig into this some more.

prabhanshuguptagit (author) commented Dec 18, 2020:

Couldn't figure out a fix here. Calling Oj.add_to_json doesn't have any effect either. Leaving this be for now; I want to avoid unknowingly breaking timestamps for the apps.

"process_token" => encode_process_token(response_process_token)
},
}, mode: :compat),
Contributor:

Have you tried using the Oj optimized versions of encoding for some of the core objects? Trying that for dates, times, arrays, and hashes would be interesting, to see if it helps serialization time. See https://github.com/ohler55/oj/blob/develop/pages/JsonGem.md ... I think we would need calls like this in an initializer somewhere:

Oj.add_to_json(Array)
Oj.add_to_json(Hash)
etc...
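For concreteness, a sketch of where those calls could live (the file path and the exact set of classes are assumptions, not something settled in this PR):

    # config/initializers/oj.rb (hypothetical path)
    require "oj"

    # Give core classes Oj's optimized to_json, per the JsonGem compatibility
    # doc linked above.
    Oj.add_to_json(Array)
    Oj.add_to_json(Hash)
    Oj.add_to_json(Time)
    Oj.add_to_json(Date)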

-  def inverted_key_mapping
-    key_mapping.invert
+  def to_response_key_mapping
+    from_request_key_mapping.invert
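For context, a minimal sketch of the idea behind the rename in the hunk above (the module name and mapping contents are illustrative, not taken from the codebase): the request-to-model key mapping is defined once and inverted to build responses.

    # Hypothetical transformer; the real key mappings live elsewhere in the app.
    module ExampleTransformer
      def self.from_request_key_mapping
        # maps API payload keys to model attribute names (example values only)
        {"created_at" => "device_created_at", "updated_at" => "device_updated_at"}
      end

      def self.to_response_key_mapping
        from_request_key_mapping.invert
      end
    end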
Contributor:

Love this naming change.

harimohanraj89 (Contributor) left a comment:

Looks good to me.

vkrmis merged commit 8ff7ba1 into master on Dec 23, 2020
vkrmis deleted the sync-perf-3 branch on December 23, 2020 13:31
kitallis added a commit that referenced this pull request on Dec 24, 2020:
* Revert "Improve Sync To User performance (batch 3) (#1898)"

This reverts commit 8ff7ba1.

* Revert "Improve Sync To User performance (batch 2) (#1897)"

This reverts commit d78d0ca.

* Add the old sync indexes back
kitallis pushed a commit that referenced this pull request Dec 30, 2020
We discovered that the perf improvements for block-level sync tend to slow down FG syncing. We reverted the improvements (#1920) to unblock deploys while a fix was figured out.

This combines the reverted perf improvements (#1898 and #1897) and fixes them to be FG compatible.