Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prime Restores - Periodic Task #35875

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Jtang-1
Copy link
Contributor

@Jtang-1 Jtang-1 commented Mar 3, 2025

Product Description

No user facing effects on the UI side. But the intended effect is to improve performance upon form submission for BHA users.

Technical Summary

USH-5780

The cc-skip-fixtures-after-submit custom app property was removed from the BHA application, and the corresponding code was deleted. Following this change, Datadog performance monitoring indicated an increase in slow submissions for BHA forms.

Our current hypothesis is that this slowdown is due to additional fixture data being included in submission incremental syncs, particularly during the first form submission of the day. Typically, users with a cold cache receive a full restore when launching the app. However, past issues have shown that some users access CommCare exclusively through bookmarked links or smartlinks, bypassing the expected full restore process.

This PR will ensure that users in domains with the feature flag enabled get their restores primed, as long as they’ve used Web Apps in the past week (specifically, if they’ve synced in the last seven days). These tasks will run daily from 5 AM to 7 AM GMT-7.

Feature Flag

Adds FF PRIME_FORMPLAYER_DBS_BHA

Safety Assurance

Safety story

Changes are restricted to custom async tasks and management command so should not impact any user facing features. It also behind a feature flag. Tested locally.

Automated test coverage

Tests exist that the correct users are fetched for priming.

QA Plan

no QA

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

Jtang-1 added 4 commits March 3, 2025 13:52
The theory is that there are smartlink exclusive workflows resulting in users never launching the app from the homescreen
and getting a full restore. This results in users with a cold cache.
@Jtang-1 Jtang-1 requested review from a team, MartinRiese, minhaminha, Robert-Costello and esoergel and removed request for a team and Robert-Costello March 3, 2025 23:16
@Jtang-1 Jtang-1 marked this pull request as ready for review March 3, 2025 23:17
@Jtang-1
Copy link
Contributor Author

Jtang-1 commented Mar 3, 2025

@coderabbitai review

Copy link

coderabbitai bot commented Mar 3, 2025

Walkthrough

The changes update the source of several functions used for priming and clearing Formplayer databases by moving their definitions from the old module (custom/covid/tasks) to a new module (custom/formplayer/restore_priming). Import statements in command files and tests have been adjusted accordingly without impacting the commands’ logic. A new static toggle (PRIME_FORMPLAYER_DBS_BHA) has been added for controlling domain inclusion in priming tasks. Additionally, a new periodic task (bha_prime_formplayer_dbs) has been implemented to run daily. This task retrieves enabled domains via the new toggle, calculates a synchronization window, obtains eligible users using the updated helper function, and then calls the priming function asynchronously using Celery. The old implementations in the deprecated module have been removed, and the new module now encapsulates the error handling and retry mechanisms necessary for the user synchronization functionality.

Sequence Diagram(s)

sequenceDiagram
    participant Scheduler as Celery Beat
    participant Task as bha_prime_formplayer_dbs Task
    participant Toggle as PRIME_FORMPLAYER_DBS_BHA
    participant UserFetcher as get_users_for_priming
    participant Worker as prime_formplayer_db_for_user

    Scheduler->>Task: Trigger daily execution
    Task->>Toggle: Retrieve enabled domains
    Toggle-->>Task: Return list of domains
    Task->>Task: Calculate sync date window
    Task->>UserFetcher: Get eligible users for each domain
    UserFetcher-->>Task: Return list of user IDs
    Task->>Worker: Asynchronously invoke prime_formplayer_db_for_user for each user
    Worker-->>Task: Process synchronization and clear DB if needed
Loading
✨ Finishing Touches
  • 📝 Generate Docstrings

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (8)
corehq/toggles/__init__.py (1)

2295-2302: Add descriptive docstring or description parameter.

While defining a new toggle is straightforward, consider adding a brief description argument to provide context on how or why this toggle should be used. This helps future developers understand its purpose without having to check external documentation.

custom/bha/tasks.py (3)

17-21: Use environment variables for rate-limit.

USH_PRIME_RESTORE_RATE_LIMIT = 100 is a reasonable default. Ensure it is documented or easily overridden to facilitate quick adjustments in production if needed.


27-29: Promote consistent naming of tasks.

bha_prime_formplayer_dbs is descriptive, but consider a docstring clarifying its purpose, for instance: "This daily task primes the Formplayer DB for BHA domains enabled under the toggle."


30-35: Validate user set retrieval and task queue load.

Looping over domains, then users, can generate many subtasks. Ensure you monitor the queue load to prevent backlogs. Also confirm that large volumes of async tasks won’t exceed Celery concurrency limits or overshadow other tasks.

custom/covid/tasks.py (1)

12-12: Remove unused imports if possible.

After deleting the local implementation of the priming functions, confirm all references to FormplayerAPIException, clear_user_data, or others remain necessary and used in this module. Cleaning imports reduces confusion for future maintainers.

custom/formplayer/restore_priming.py (3)

31-52: Exception handling with retries
The approach of retrying on generic exceptions and explicitly handling FormplayerAPIException is solid. Consider logging the actual exception message and more context (e.g., user/domain) to help diagnose transient errors.


74-82: Guard against missing CouchUser
CouchUser.get_by_user_id() can return None if the user is missing or deleted, but the code tries to use username. Consider adding a check to avoid a NoneType error if the user is not found.


84-112: Large dataset query performance
Queries on SyncLogSQL with .filter() and .distinct() can become expensive if the table grows large. Ensure proper indexing on fields like domain and date to maintain performance.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba6be76 and a94d2eb.

📒 Files selected for processing (7)
  • corehq/apps/formplayer_api/management/commands/clear_formplayer_dbs.py (1 hunks)
  • corehq/apps/formplayer_api/management/commands/prime_formplayer_restores.py (1 hunks)
  • corehq/toggles/__init__.py (1 hunks)
  • custom/bha/tasks.py (1 hunks)
  • custom/covid/tasks.py (1 hunks)
  • custom/covid/tests/test_prime_restore.py (2 hunks)
  • custom/formplayer/restore_priming.py (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • corehq/apps/formplayer_api/management/commands/clear_formplayer_dbs.py
  • custom/covid/tests/test_prime_restore.py
  • corehq/apps/formplayer_api/management/commands/prime_formplayer_restores.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: - Review the code following best practises and st...

**/*.py: - Review the code following best practises and standards

  • custom/bha/tasks.py
  • corehq/toggles/__init__.py
  • custom/formplayer/restore_priming.py
  • custom/covid/tasks.py
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: tests (python-sharded-and-javascript)
  • GitHub Check: tests (python, bf)
  • GitHub Check: tests (python, 6a)
  • GitHub Check: tests (python, 05)
  • GitHub Check: test
🔇 Additional comments (11)
custom/bha/tasks.py (4)

1-6: Validate date/time imports and time zone handling.

It's good to see usage of datetime and relativedelta. However, ensure that UTC is the desired timezone throughout, and consider using timezone-aware datetime objects (e.g., datetime.now(tz=utc) or Django’s timezone.now()) to avoid potential confusion if future code runs in different time zones.


8-10: Leverage global queue settings consistently.

The code designates ush_background_tasks as the queue. Confirm that other tasks that prime or manage Formplayer also use consistent queue naming, or specify a rationale for splitting tasks across multiple queues.


11-14: Confirm cross-file references.

The imports from custom.formplayer.restore_priming look correct. Verify that prime_formplayer_db_for_user and get_users_for_priming references remain consistent if the code is reorganized again, to avoid broken dependencies.


22-24: Recheck scheduled run times vs. user workflows.

The TASK_WINDOW_CUTOFF_HOUR = 14 (2PM UTC) gives a 2-hour window for completion. Double check with real usage patterns to confirm it fully covers the prime window for BHA daily usage.

custom/covid/tasks.py (1)

16-20: Validate cross-module migrations.

These lines import the new helper functions from custom.formplayer.restore_priming. Make sure the removed functionality in this file is fully covered there, including error handling or case load checks, so that no logic remains accidentally lost or partially duplicated.

custom/formplayer/restore_priming.py (6)

1-4: All imports are standard and correct
No issues with the usage; everything is consistent with best practices.


5-9: Dependent library imports
Everything is properly referenced and organized.


10-17: Additional imports
No concerns here. Imports align well with the tasks’ requirements.


19-20: Review the default rate limit
Confirm that the default USH_PRIME_RESTORE_RATE_LIMIT of 100 is appropriate for your environment. If the number of Celery tasks spikes above this threshold, tasks may be excessively delayed.


22-30: Time-based cutoff logic
datetime.utcnow().hour >= task_cutoff_hour may cause confusion if your team specifically expects local time. Make sure UTC-based cutoff aligns with your operational requirements.


54-72: Clear function consistency
clear_formplayer_db_for_user parallels prime_formplayer_db_for_user in design and retry logic. This consistency simplifies maintenance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant