Improve Sync To User performance (batch 3) #1898
Changes from all commits: f934a26, 934f650, 9f0d19b, e0a5f5b, ea389c8, d07c37b
Gemfile.lock:

```diff
@@ -606,6 +606,7 @@ DEPENDENCIES
   memery
   memory_profiler
   mock_redis
+  oj
   ougai
   parallel
   parallel_tests
```
```diff
@@ -11,12 +11,14 @@ def __sync_from_user__(params)
   end

   def __sync_to_user__(response_key)
-    AuditLog.create_logs_async(current_user, records_to_sync, "fetch", Time.current) unless disable_audit_logs?
+    records = records_to_sync
+
+    AuditLog.create_logs_async(current_user, records, "fetch", Time.current) unless disable_audit_logs?
     render(
-      json: {
-        response_key => records_to_sync.map { |record| transform_to_response(record) },
+      json: Oj.dump({
+        response_key => records.map { |record| transform_to_response(record) },
         "process_token" => encode_process_token(response_process_token)
-      },
+      }, mode: :compat),
```
Review comment: Have you tried using the Oj optimized versions of encoding for some of the core objects? Trying that for dates, times, arrays, and hashes would be interesting, to see if it helps serialization time. See https://github.com/ohler55/oj/blob/develop/pages/JsonGem.md. I think we would need calls like this in an initializer somewhere:
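A sketch of what such an initializer could look like, going by the JsonGem doc linked above; the file path and the exact class list are illustrative, not something this PR ships:

```ruby
# config/initializers/oj.rb (hypothetical location)
require "oj"

# Have Oj take over JSON.generate / JSON.parse and the json gem's to_json.
Oj.mimic_JSON

# Install Oj's optimized to_json implementations on core classes, as
# described in https://github.com/ohler55/oj/blob/develop/pages/JsonGem.md
Oj.add_to_json(Array, Hash, Date, DateTime, Time)
```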
```diff
       status: :ok
     )
   end
```
```diff
@@ -1,31 +1,33 @@
 class Api::V3::Transformer
   class << self
-    def from_request(attributes_of_payload)
-      rename_attributes(attributes_of_payload, key_mapping)
+    def from_request(payload_attributes)
+      rename_attributes(payload_attributes, from_request_key_mapping)
     end

     def to_response(model)
-      rename_attributes(model.attributes, inverted_key_mapping).as_json
+      rename_attributes(model.attributes, to_response_key_mapping).as_json
```
Review comment: Do we need the `as_json` here?

Reply: I tried removing it in 3fbcacc, which caused these specs to fail. I haven't dug into this fully, but I didn't want to change it just in case the app is relying on this behaviour, just to decrease the surface of change. Will dig into this some more.

Follow-up reply: Couldn't figure out a fix here. Calling […]
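For anyone following along, `as_json` here is ActiveSupport's recursive conversion of a hash's values into JSON-ready primitives, which is plausibly what those specs depend on. A small sketch of the difference, assuming ActiveSupport is loaded with default serialization settings:

```ruby
require "active_support/all"
require "bigdecimal"

attrs = {
  "device_created_at" => Time.utc(2021, 1, 1),
  "systolic" => BigDecimal("120")
}

attrs.as_json
# => {"device_created_at" => "2021-01-01T00:00:00.000Z", "systolic" => "120.0"}
# Without as_json the hash still holds Time/BigDecimal objects; anything
# downstream that expects plain JSON primitives can then break.
```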
```diff
     end

     def rename_attributes(attributes, mapping)
-      mapping = mapping.with_indifferent_access
-      attributes
-        .to_hash
-        .except(*mapping.values)
-        .transform_keys! { |key| mapping[key] || key }
-        .with_indifferent_access
+      replace_keys(attributes.to_hash, mapping).with_indifferent_access
     end

-    def key_mapping
+    def replace_keys(hsh, mapping)
+      mapping.each do |k, v|
+        hsh[v] = hsh.delete(k)
+      end
+      hsh
+    end
+
+    def from_request_key_mapping
       {
         "created_at" => "device_created_at",
         "updated_at" => "device_updated_at"
       }
     end

-    def inverted_key_mapping
-      key_mapping.invert
+    def to_response_key_mapping
+      from_request_key_mapping.invert
```
Review comment: Love this naming change.
```diff
     end
   end
 end
```
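The `replace_keys` rewrite drops the indifferent-access copy of the mapping and the intermediate hashes that `except` plus `transform_keys!` allocated per record. A hypothetical micro-benchmark to compare the two paths in isolation (uses the benchmark-ips gem; the attribute hash is illustrative and results will vary):

```ruby
require "benchmark/ips"
require "active_support/all"

MAPPING = {
  "created_at" => "device_created_at",
  "updated_at" => "device_updated_at"
}.freeze

attributes = { "id" => 1, "created_at" => Time.now, "updated_at" => Time.now }

Benchmark.ips do |x|
  x.report("old rename_attributes") do
    mapping = MAPPING.with_indifferent_access
    attributes
      .except(*mapping.values)
      .transform_keys! { |key| mapping[key] || key }
      .with_indifferent_access
  end

  x.report("new replace_keys") do
    hsh = attributes.dup # model.attributes hands back a fresh hash each call
    MAPPING.each { |k, v| hsh[v] = hsh.delete(k) }
    hsh.with_indifferent_access
  end

  x.compare!
end
```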
Review comment: Can we remove this? If we still need the logging, we could send it to Datadog, I guess, but it's adding overhead to the sync path here (see the `AuditLog.create_logs_async` call in the diff above). Note that this work is not done async; it runs before the handoff to the Sidekiq job. If no one has looked at these logs in the past month or so, maybe we can just drop them.
Reply: We debated this a fair bit, and yes, that block of code is synchronous. No one has looked at these logs, not just in the past month or so, but ever in the last 3 years. However, the argument for keeping this is as a fail-safe; it's like insurance. I'll post some numbers shortly, which will show that the time consumed by this is relatively small compared to the other things, which is why we didn't bother cleaning it up.
Reply: We can't drop these or move them to Datadog, afaik. The purpose of this is to be able to audit which users have accessed specific data. The example scenario we've used for this is: a legal authority asks us for information on which users have accessed data related to x. Unless something has changed so that this is no longer a scenario we need to provide logs for, we'll need to keep this around in some form. Datadog doesn't work because we only retain data there for 30 days.
Reply: Correcto, this is a compliance issue that we need to be bullet-proof on. We'll have to take the perf hit for it. That being said, if there is any data massaging that we can push into an async job, all the better.
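Purely as a sketch of that idea (every name below is hypothetical; the internals of `AuditLog.create_logs_async` aren't shown in this diff): hand the worker only lightweight identifiers and let it do the massaging off the request path.

```ruby
require "sidekiq"

# Hypothetical worker: receives ids instead of pre-massaged payloads, so the
# request cycle only pays for enqueueing.
class CreateAuditLogsWorker
  include Sidekiq::Worker

  def perform(user_id, record_class_name, record_ids, action, time)
    rows = record_ids.map do |id|
      {
        user_id: user_id,
        auditable_type: record_class_name,
        auditable_id: id,
        action: action,
        created_at: time
      }
    end
    AuditLog.insert_all(rows) # assumes Rails 6+ bulk insert and these columns
  end
end

# The sync path would then shrink to roughly:
# CreateAuditLogsWorker.perform_async(
#   current_user.id, records.klass.name, records.ids, "fetch", Time.current.iso8601
# )
```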
Reply: FWIW, this takes extremely little time (only ~1% of the response time, ~20ms). I think we can keep it around, even without moving out the data massaging, and come back to it later in a fun Friday cleanup.