Improve Sync To User performance (batch 3) #1898

Merged: 6 commits, Dec 23, 2020
1 change: 1 addition & 0 deletions Gemfile
@@ -43,6 +43,7 @@ gem "kaminari"
 gem "lodash-rails"
 gem "lograge"
 gem "ougai"
+gem "oj"
 gem "parallel", require: false
 gem "passenger"
 gem "pg", ">= 0.18", "< 2.0"
1 change: 1 addition & 0 deletions Gemfile.lock
@@ -606,6 +606,7 @@ DEPENDENCIES
   memery
   memory_profiler
   mock_redis
+  oj
   ougai
   parallel
   parallel_tests
10 changes: 6 additions & 4 deletions app/controllers/api/v3/sync_controller.rb
@@ -11,12 +11,14 @@ def __sync_from_user__(params)
   end

   def __sync_to_user__(response_key)
-    AuditLog.create_logs_async(current_user, records_to_sync, "fetch", Time.current) unless disable_audit_logs?
+    records = records_to_sync
+
+    AuditLog.create_logs_async(current_user, records, "fetch", Time.current) unless disable_audit_logs?
Contributor:
Can we remove this? If we still need the logging, we could send it to Datadog I guess, but it's adding overhead to the sync path here:

    records_by_class = records.group_by { |record| record.class.to_s }

    records_by_class.each do |record_class, records_for_class|
      log_data = {
        user_id: user.id,
        record_class: record_class,
        record_ids: records_for_class.map(&:id),
        action: action,
        time: time
      }.to_json

Note that the above is not done async; it's done before the handoff to the Sidekiq job.

If no one has looked at these logs in the past month or so, maybe we can just drop them.

Contributor:
We debated this a fair bit, and yes, that block of code is synchronous. No one has looked at these logs, not just in the past month or so, but ever in the last 3 years.

However, the argument for keeping this is that it's a fail-safe; it's like insurance.

I'll post some numbers shortly for all of these, which will show that the time consumed by this is small relative to the other things, which is why we didn't bother cleaning it up.

Contributor:
We can't drop these or move them to Datadog, afaik.

The purpose of this is to be able to audit which users have accessed specific data.
The example scenario we've used for this is: a legal authority asks us for information on which users have accessed data related to x.

Unless something has changed so that this is no longer a scenario we need to provide logs for, we'll need to keep this around in some form. Datadog doesn't work because we only retain data for 30 days there.

Contributor:
Correcto, this is a compliance issue that we need to be bullet-proof against. We'll have to take the perf hit for it. That being said, if there is any data massaging that we can push into an async job, all the better.

Contributor Author:
[screenshot: profiler breakdown of the sync response time]

Fwiw, this takes very little time (only ~1% of the response time, ~20ms). I think we can keep this around, even without moving out the data massaging, and come back to it later in a fun Friday cleanup.
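For whenever that cleanup happens, a rough sketch of pushing the massaging into the worker; the worker and logger names here are hypothetical, not from this codebase:

    # Hypothetical shape: enqueue only IDs and class names, and do the
    # group_by/to_json massaging inside the Sidekiq job instead of on
    # the sync path.
    class CreateAuditLogsWorker
      include Sidekiq::Worker

      def perform(user_id, class_id_pairs, action, time)
        class_id_pairs.group_by(&:first).each do |record_class, pairs|
          log_data = {
            user_id: user_id,
            record_class: record_class,
            record_ids: pairs.map(&:last),
            action: action,
            time: time
          }.to_json
          AuditLogger.info(log_data) # logger name assumed
        end
      end
    end

    # The sync path would then enqueue cheap, JSON-serializable arguments:
    # CreateAuditLogsWorker.perform_async(
    #   current_user.id,
    #   records.map { |r| [r.class.to_s, r.id] },
    #   "fetch",
    #   Time.current.iso8601
    # )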

     render(
-      json: {
-        response_key => records_to_sync.map { |record| transform_to_response(record) },
+      json: Oj.dump({
+        response_key => records.map { |record| transform_to_response(record) },
         "process_token" => encode_process_token(response_process_token)
-      },
+      }, mode: :compat),
Contributor:
Have you tried using the Oj optimized versions of encoding for some of the core objects? Trying that for dates, times, arrays, and hashes would be interesting, to see if it helps serialization time. See https://github.com/ohler55/oj/blob/develop/pages/JsonGem.md. I think we would need calls like this in an initializer somewhere:

    Oj.add_to_json(Array)
    Oj.add_to_json(Hash)
    # etc.
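A minimal version of that initializer might look like this (the file path is an assumption):

    # config/initializers/oj.rb (path assumed)
    require "oj"

    # Swap in Oj's optimized to_json implementations for core classes,
    # per https://github.com/ohler55/oj/blob/develop/pages/JsonGem.md
    [Array, Hash, Time, Date, DateTime].each do |klass|
      Oj.add_to_json(klass)
    end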

       status: :ok
     )
   end
4 changes: 3 additions & 1 deletion app/transformers/api/v3/facility_transformer.rb
@@ -7,7 +7,9 @@ def to_response(facility)
         "enable_teleconsultation",
         "teleconsultation_phone_number",
         "teleconsultation_isd_code",
-        "teleconsultation_phone_numbers")
+        "teleconsultation_phone_numbers",
+        "organization_name",
+        "facility_group_name")
       .merge(config: {enable_diabetes_management: facility.enable_diabetes_management,
                       enable_teleconsultation: facility.enable_teleconsultation},
         protocol_id: facility.protocol.try(:id))
8 changes: 4 additions & 4 deletions app/transformers/api/v3/patient_transformer.rb
@@ -59,10 +59,10 @@ def from_nested_request(payload_attributes)

   def to_nested_response(patient)
     Api::V3::Transformer.to_response(patient)
-      .except("address_id")
-      .except("registration_user_id")
-      .except("test_data")
-      .except("deleted_by_user_id")
+      .except("address_id",
+        "registration_user_id",
+        "test_data",
+        "deleted_by_user_id")
       .merge(
         "address" => Api::V3::Transformer.to_response(patient.address),
         "phone_numbers" => patient.phone_numbers.map do |phone_number|
26 changes: 14 additions & 12 deletions app/transformers/api/v3/transformer.rb
@@ -1,31 +1,33 @@
 class Api::V3::Transformer
   class << self
-    def from_request(attributes_of_payload)
-      rename_attributes(attributes_of_payload, key_mapping)
+    def from_request(payload_attributes)
+      rename_attributes(payload_attributes, from_request_key_mapping)
     end

     def to_response(model)
-      rename_attributes(model.attributes, inverted_key_mapping).as_json
+      rename_attributes(model.attributes, to_response_key_mapping).as_json
Contributor:
Do we need as_json here? I think to_json does that for us, so Oj.dump would do the same. We may be invoking as_json twice for every payload; worth testing at least.

Contributor Author (@prabhanshuguptagit), Dec 18, 2020:
I tried removing this in 3fbcacc, which caused these specs to fail. I haven't dug into it fully, but as_json does something weird with timestamps:

    [3] pry(main)> e = Encounter.first
    => #<Encounter:0x00007fcb8b256f80
       ...
       encountered_on: Thu, 23 Jan 2020,
       created_at: Thu, 23 Jan 2020 07:36:27 UTC +00:00,
       updated_at: Thu, 23 Jan 2020 07:36:27 UTC +00:00>

    [4] pry(main)> e.as_json
    => {...
        "encountered_on" => Thu, 23 Jan 2020,
        "created_at" => Thu, 23 Jan 2020 07:36:27 UTC +00:00,
        "updated_at" => Thu, 23 Jan 2020 07:36:27 UTC +00:00}

    [5] pry(main)> e.as_json.as_json
    => {...
        "encountered_on" => "2020-01-23",
        "created_at" => "2020-01-23T07:36:27.163Z",
        "updated_at" => "2020-01-23T07:36:27.163Z"}

I didn't want to change this in case the app is relying on this behaviour, and to keep the surface of change small. Will dig into this some more.

Contributor Author (@prabhanshuguptagit), Dec 18, 2020:
Couldn't figure out a fix here; calling Oj.add_to_json doesn't have any effect either. Leaving this be for now, since I want to avoid unknowingly breaking timestamps for the apps.
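If someone picks this up later, a guard spec along these lines (the file path and factory are hypothetical) would pin down the timestamp format the apps rely on:

    # spec/transformers/api/v3/transformer_spec.rb (hypothetical)
    require "rails_helper"

    RSpec.describe Api::V3::Transformer do
      it "serializes timestamps to ISO 8601 strings" do
        encounter = create(:encounter) # assumes an :encounter factory
        response = described_class.to_response(encounter)

        # Guards against accidentally dropping the extra as_json pass.
        expect(response["created_at"]).to match(/\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}/)
      end
    end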

     end

     def rename_attributes(attributes, mapping)
-      mapping = mapping.with_indifferent_access
-      attributes
-        .to_hash
-        .except(*mapping.values)
-        .transform_keys! { |key| mapping[key] || key }
-        .with_indifferent_access
+      replace_keys(attributes.to_hash, mapping).with_indifferent_access
     end

-    def key_mapping
+    def replace_keys(hsh, mapping)
+      mapping.each do |k, v|
+        hsh[v] = hsh.delete(k)
+      end
+      hsh
+    end
+
+    def from_request_key_mapping
       {
         "created_at" => "device_created_at",
         "updated_at" => "device_updated_at"
       }
     end

-    def inverted_key_mapping
-      key_mapping.invert
+    def to_response_key_mapping
+      from_request_key_mapping.invert
Contributor:
Love this naming change.

     end
   end
 end
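For illustration, here is what the renamed mappings do to a payload in a Rails console (the values are made up):

    payload = {
      "id" => "abc-123",
      "created_at" => "2020-01-23T07:36:27.163Z",
      "updated_at" => "2020-01-23T07:36:27.163Z"
    }

    # replace_keys mutates the hash it is given and returns it with
    # indifferent access; payload keys become model attribute names.
    Api::V3::Transformer.rename_attributes(payload, Api::V3::Transformer.from_request_key_mapping)
    # => {"id" => "abc-123",
    #     "device_created_at" => "2020-01-23T07:36:27.163Z",
    #     "device_updated_at" => "2020-01-23T07:36:27.163Z"}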
2 changes: 1 addition & 1 deletion spec/utils.rb
@@ -20,7 +20,7 @@ def to_json_and_back

   def with_payload_keys
     Api::V3::Transformer.rename_attributes(
-      self, Api::V3::Transformer.inverted_key_mapping
+      self, Api::V3::Transformer.to_response_key_mapping
     )
   end
 end