Fix/default to maindb if read replica errs #1221

Merged
merged 4 commits into from
Jun 5, 2024
Changes from 3 commits
16 changes: 13 additions & 3 deletions src/services/request-service.ts
@@ -56,7 +56,12 @@ export class RequestService {
* @returns The request.
*/
async getStatusForCid(cid: CID): Promise<OutputOf<typeof CASResponse> | { error: string }> {
let request = await this.replicationRequestRepository.findByCid(cid)
let request
try {
request = await this.replicationRequestRepository.findByCid(cid)
} catch (e) {
logger.err(`Error fetching request from replica db for ${cid}, error: ${e}`)
}
if (!request) {
logger.debug(`Request not found in replica db for ${cid}, fetching from main_db`)
Metrics.count(METRIC_NAMES.REPLICA_DB_REQUEST_NOT_FOUND, 1)
@@ -79,11 +84,16 @@ export class RequestService {
* @returns The request.
*/
async findByCid(cid: CID): Promise<OutputOf<typeof CASResponse> | undefined> {
let found = await this.replicationRequestRepository.findByCid(cid)
let found
try {
found = await this.replicationRequestRepository.findByCid(cid)

My preference would be to expect the replica to work if it exists, but default the config to main if there is none defined. I think we should expect it to work if it's defined, but it should be optional.

In terms of implementation, I'd expect we either set up a RO connection pool, or we just use the main pool and the app still calls replicationRequestRepository for all reads because it doesn't require the writer, without needing to know anything about the actual DB used.

This also avoids doing double reads for things that don't actually exist and potentially getting hung trying to talk to a nonexistent RO instance.
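The wiring described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `QueryPool`, `buildPools`, and `ReadOnlyRequestRepository` are hypothetical names, not the project's actual types, and a real pool would come from the database client in use.

```typescript
// Hypothetical stand-in for a database connection pool (assumption, not the project's API).
interface QueryPool {
  readonly name: string
}

interface Pools {
  writer: QueryPool
  reader: QueryPool
}

// If no replica pool is configured, reads simply share the main (writer) pool.
function buildPools(main: QueryPool, replica?: QueryPool): Pools {
  return { writer: main, reader: replica ?? main }
}

// A read-only repository that never needs to know which database backs its pool.
class ReadOnlyRequestRepository {
  private readonly pool: QueryPool

  constructor(pool: QueryPool) {
    this.pool = pool
  }

  // Reports which pool serves reads; a real repository would run queries here.
  poolName(): string {
    return this.pool.name
  }
}

const withoutReplica = buildPools({ name: 'main' })
const withReplica = buildPools({ name: 'main' }, { name: 'replica' })

const repoA = new ReadOnlyRequestRepository(withoutReplica.reader)
const repoB = new ReadOnlyRequestRepository(withReplica.reader)
```

With this shape the service calls the read-only repository for every read, and the choice of database is made once, when the pools are built.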

Contributor Author

Yeah, it's an all-around better design to make the read replica optional. FYI, currently all our reads do not go to the replica. If the configuration for the read replica is not specified, or if any of the config params are missing, we default to the maindb. I will add a commit for this and tag you; let me know whether this fixes it or whether there is something else I missed.
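A minimal sketch of that config fallback, assuming a simple config object: the field names (`host`, `port`, `user`, `password`, `database`) and the helpers `isComplete` and `resolveReplicaConfig` are hypothetical and chosen only for illustration, not taken from the actual commit.

```typescript
// Hypothetical config shape; the real project's parameter names may differ.
interface DbConfig {
  host: string
  port: number
  user: string
  password: string
  database: string
}

type PartialDbConfig = Partial<DbConfig>

// True only when every replica parameter is present and non-empty.
function isComplete(cfg: PartialDbConfig | undefined): cfg is DbConfig {
  if (!cfg) return false
  const strings = [cfg.host, cfg.user, cfg.password, cfg.database]
  const allStringsSet = strings.every((s) => typeof s === 'string' && s.length > 0)
  return allStringsSet && typeof cfg.port === 'number' && Number.isFinite(cfg.port)
}

// Use the replica config when it is fully specified; otherwise default to the main DB config.
function resolveReplicaConfig(main: DbConfig, replica?: PartialDbConfig): DbConfig {
  return isComplete(replica) ? replica : main
}

const mainDb: DbConfig = {
  host: 'main.example.internal',
  port: 5432,
  user: 'app',
  password: 'secret',
  database: 'requests',
}

const missing = resolveReplicaConfig(mainDb)
const partial = resolveReplicaConfig(mainDb, { host: 'replica.example.internal' })
const full = resolveReplicaConfig(mainDb, { ...mainDb, host: 'replica.example.internal' })
```

Resolving the config once at startup means a partially specified replica never produces a half-configured connection.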

I like the changes you made. Do we still need this fallback? If it fails now, it seems like it's failing for a reason, so I'm not sure we need to try another database.

Contributor Author
@samika98 Jun 5, 2024

Yup, for the case where the read replica does not have the request, e.g. if sync between the DBs is slow.

Yeah, what I was trying to say is that if the RR doesn't have the request, it's unlikely the main DB does, so there's no reason to ask. If for some reason we're far enough behind on replicating (e.g. by minutes), we probably don't want to load the main database more and should investigate. Otherwise, we can tolerate the client retrying in a few seconds, at which point the RR will respond with data.

} catch (e) {
logger.err(`Error fetching request from replica db for ${cid}, error: ${e}`)
}
if (!found) {
found = await this.requestRepository.findByCid(cid)
logger.debug(`Request not found in replica db for ${cid}, fetching from main_db`)
Metrics.count(METRIC_NAMES.REPLICA_DB_REQUEST_NOT_FOUND, 1)
found = await this.requestRepository.findByCid(cid)
if (!found) {
throw new RequestDoesNotExistError(cid)
}