feat: [FC-0056] Implement Sidebar Navigation #34457
Conversation
Thanks for the pull request, @NiedielnitsevIvan! Please note that it may take us up to several weeks or months to complete a review and merge your PR. Feel free to add as much of the following information to the ticket as you can:
All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here. Please let us know once your PR is ready for our review and all tests are green.
Force-pushed from 6632124 to bb1ba59 (Compare)
I wrote some questions and comments. Overall, I think this is a pretty pragmatic PR that balances correctness with performance without requiring too much new code. My main concerns are related to the cache cleanup performance and what happens when a person's user partition group changes for the purposes of Unit-level access (e.g. enrollment track).
I would like to know what kinds of response times you're seeing locally on large courses, and where the bottlenecks are presently. You don't have to give detailed profiling traces or anything–I'm just trying to understand if you know generally how this behaves for a larger course in terms of response time and database queries made. Traditionally completion code was one of the slowest things because of n+1 queries, but it looks like you're doing a batch fetch for that, which is great.
Thank you!
@@ -44,6 +48,11 @@
        OutlineTabView.as_view(),
        name='outline-tab'
    ),
    re_path(
        fr'sidebar/{settings.COURSE_KEY_PATTERN}',
Is there a name we could pick that better reflects what it is from the server's point of view (e.g. course navigation), rather than where it's being placed in the UI?
Fixed
completions = BlockCompletion.objects.filter(user=self.request.user, context_key=course_key).values_list(
    'block_key',
    'completion',
)
I'm fuzzy on how BlockCompletion works, but do we need everything in the course for this user? Can we just grab completion data for Units and up? Or do we need all the low level stuff because we're recalculating completion on the fly?
Unfortunately, BlockCompletion doesn't store data for Units, only for lower-level blocks, so we have to grab all the data for the user in the course and calculate completion for the level of Units and up on our own.
Bleh. Yeah, that makes sense. Unfortunate though.
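For readers new to this part of the codebase, here is a minimal sketch (not this PR's code) of what rolling leaf-level BlockCompletion rows up to the Unit level can look like; the data shapes (`unit_blocks`, `leaf_completions`) are illustrative assumptions:

# Minimal sketch, assuming illustrative data shapes rather than the view's real ones:
# `unit_blocks` maps a unit id to its child block ids, and `leaf_completions`
# maps a child block id to a float completion value in [0.0, 1.0].
def roll_up_unit_completion(unit_blocks, leaf_completions):
    """Mark a unit complete when every one of its children is fully complete."""
    unit_complete = {}
    for unit_id, child_ids in unit_blocks.items():
        unit_complete[unit_id] = all(
            leaf_completions.get(child_id, 0.0) == 1.0
            for child_id in child_ids
        )
    return unit_complete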
return list(filter(
    lambda seq_data: seq_data['id'] in available_sequence_ids or seq_data['type'] != 'sequential',
    course_sequences
))
Nit (optional): In general, please prefer list comprehensions to list + filter, since they're more commonly used and familiar to Python devs.
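For comparison, the quoted filter() call could be written as a comprehension along these lines (a sketch; the function name and arguments are assumptions that mirror the snippet above):

def get_available_sequences_as_comprehension(course_sequences, available_sequence_ids):
    # Same filtering logic as the quoted snippet, expressed as a list comprehension.
    return [
        seq_data for seq_data in course_sequences
        if seq_data['id'] in available_sequence_ids or seq_data['type'] != 'sequential'
    ]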
children = block.get('children', [])
child_classes = {child.get('type') for child in children}
new_class = 'other'
icon_call_priority = ['video', 'problem']
Comment (not required): If "problem" is a proxy for "is this something that you as a student need to submit", we could scan through the classes to see if they have the has_score class attribute set to True. That would catch things like ORA, drag and drop, etc. Another oddball case is library_content, which is just a container, but is used for problems the vast majority of the time.
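A rough sketch of how that has_score idea could be checked against the loaded XBlock classes (this assumes the attribute simply defaults to False when a class does not define it):

from xblock.core import XBlock

# Collect block types whose class advertises has_score = True; getattr() guards
# against classes that do not define the attribute at all (assumed default: False).
scored_block_types = {
    block_type for (block_type, block_cls) in XBlock.load_classes()
    if getattr(block_cls, 'has_score', False)
}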
return list(filter(
    lambda section_data: section_data['id'] in available_section_ids, course_sections
))
Nit (optional): In general, please prefer list comprehensions to list + filter, since they're more commonly used and familiar to Python devs.
Thank you, I agree with you that the list comprehensions syntax is more familiar and readable, but in this case I deliberately used list + filter because it has better performance, which can be useful for large courses.
However, if you are in favor of the list comprehensions approach, I can change this.
Ah. I didn't realize it was done for performance reasons. How big of a difference does it make on large courses?
I took measurements and the difference is actually 200-300 ms on average, which can be explained by measurement error. Therefore, it makes sense to return to list comprehensions for the sake of readability.
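For context, a measurement like that can be approximated with a synthetic benchmark along these lines (the data here is made up, so the absolute numbers only illustrate the method, not the course in question):

import timeit

# Synthetic course-sized data; shapes are assumptions for illustration only.
course_sequences = [{'id': i, 'type': 'sequential'} for i in range(50_000)]
available = set(range(0, 50_000, 2))

filter_time = timeit.timeit(
    lambda: list(filter(lambda s: s['id'] in available, course_sequences)),
    number=100,
)
comprehension_time = timeit.timeit(
    lambda: [s for s in course_sequences if s['id'] in available],
    number=100,
)
print(f"filter: {filter_time:.3f}s  comprehension: {comprehension_time:.3f}s")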
""" | ||
if 'children' in block: | ||
block['children'] = [self.mark_complete_recursive(child) for child in block['children']] | ||
block['complete'] = all(child['complete'] for child in block['children'] if child['type'] != 'discussion') |
Should this be looking for the class attribute completion_mode on the relevant XBlock classes?
No, because in our case, block is not an XBlock object, but a dict with block data, and therefore we cannot get the value of the completion_mode field here.
Sorry, let me ask this another way: Why is the "discussion" block explicitly excluded here?
The "discussion" block is explicitly excluded here because it cannot be marked as completed, in which case units that have a "discussion" block will never be marked as completed either.
I also considered the option of checking the complete status for blocks that have has_score=True
, but in this case it would only apply to problems, which is also not entirely correct.
Wouldn't this get the right tags then?
from xblock.core import XBlock
from xblock.completable import XBlockCompletionMode

completable_tags = {
    tag for (tag, cls) in XBlock.load_classes()
    if XBlockCompletionMode.get_mode(cls) == XBlockCompletionMode.COMPLETABLE
}
Oh, thank you very much, I didn't know that we could get completable block types in this way.
Added this change.
for section_data in course_sections:
    section_data['children'] = self.get_available_sequences(
        user_course_outline,
        section_data.get('children', [])
These are definitely necessary? I'm surprised the API building course_blocks would return no children attribute, rather than returning an empty list.
I do not fully understand what you mean. Can you explain in more detail, please?
I would have expected section_data to always have a children key, even if the value is []. But the fact that you're doing section_data.get('children', []) implies that sometimes that key is missing entirely. That surprised me.
    )
    accessible_sequence_ids = {str(usage_key) for usage_key in user_course_outline.accessible_sequences}
    for sequence_data in section_data['children']:
        sequence_data['accessible'] = sequence_data['id'] in accessible_sequence_ids
Please add a comment explaining the difference between accessible and available in the above code. I've stared at outline code long enough to get it, but it's probably going to confuse developers looking at this for the first time.
Yep, I think it can be fixed by renaming get_available_sections to get_accessible_sections and so on.
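As a sketch of the kind of comment being requested, something like the following could sit next to that code (the exact semantics should be verified against the learning_sequences outline API; this reflects my reading of it):

# "Available" sections/sequences: items that appear in this user's course
# outline at all after outline processing.
# "Accessible" sequences: the subset whose content the user may actually open;
# a sequence can be shown in the outline yet still be gated (for example by a
# prerequisite or start date), making it available but not accessible.
accessible_sequence_ids = {str(usage_key) for usage_key in user_course_outline.accessible_sequences}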
@@ -141,6 +142,7 @@ def listen_for_course_publish(sender, course_key, **kwargs):  # pylint: disable=
    # register special exams asynchronously after the data is ready
    course_key_str = str(course_key)
    transaction.on_commit(lambda: update_special_exams_and_publish.delay(course_key_str))
    drop_course_sidebar_blocks_cache(course_key_str)
I think this could be problematic at scale, and that it's better to rely on the version changing in the key for invalidation rather than trying to iterate through the whole list of cache keys.
Removed, after adding the course version to the cache key.
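To illustrate the version-in-the-key approach (a sketch only; the key format and the source of course_version are assumptions, not this PR's exact implementation):

from django.core.cache import cache

def get_cached_sidebar_blocks(course_key, course_version, build_blocks):
    # Because the version is part of the key, publishing a new course version
    # naturally routes requests to a fresh entry; stale entries simply expire,
    # so no explicit iteration over cache keys is needed.
    cache_key = f'course_sidebar_blocks_{course_key}_{course_version}'
    blocks = cache.get(cache_key)
    if blocks is None:
        blocks = build_blocks(course_key)
        cache.set(cache_key, blocks, timeout=60 * 60)
    return blocks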
@ormsbee Please check the performance analysis on tutor dev: https://raccoongang.atlassian.net/wiki/external/MDhhMjdlMDhjODVkNDBjYjkzZjAzMDdhOGZmMGM2MTk
@GlugovGrGlib: Thank you! Do you have a broad sense of what the bottlenecks are for those really large courses?
Force-pushed from eab71a1 to 9a6afaa (Compare)
@GlugovGrGlib: Just as a heads up, when I run this locally on my laptop with the large test course and try loading just this REST endpoint in isolation, I get speeds of about 4.7 seconds for staff and around 6 seconds for a student. That's about 3X faster than when I try to hit those endpoints on the sandbox (I was loading the REST endpoint directly there as well, instead of using the MFE, since I didn't want it to get slowed down by other requests). And that difference is with the debug toolbar left on. Is there a possibility that course blocks caching isn't properly configured on the sandbox?
Or maybe profiling was left on?
A couple small requests (one optional), but I'd still like to know the answers to the questions I left in my last review before approving.
for higher_class in icon_call_priority:
    if higher_class in child_classes:
        new_class = higher_class
return new_class
Nit: When I first read icon_call_priority, I assumed that the things that came first had higher priority, rather than the things that came last. I think this might read more clearly if you use returns instead of a new_class var, so something like:
@staticmethod
def get_vertical_icon_class(block):
    """
    Get the icon class for a vertical block based on its children.
    """
    children = block.get('children', [])
    child_classes = {child.get('type') for child in children}
    icon_call_priority = ['problem', 'video']
    for item_type in icon_call_priority:
        if item_type in child_classes:
            return item_type
    return 'other'  # default
You could also make it more explicit/obvious like this, since there are only a few classes that matter at the moment:
@staticmethod
def get_vertical_icon_class(block):
    """
    Get the icon class for a vertical block based on its children.
    """
    children = block.get('children', [])
    child_classes = {child.get('type') for child in children}
    if 'problem' in child_classes:
        return 'problem'
    if 'video' in child_classes:
        return 'video'
    return 'other'
Either way, please describe the intended ordering in the docstring.
Fixed
if cached:
    # If the data was cached, we need to mark the blocks as complete or not complete.
    course_blocks = self.mark_complete_recursive(course_blocks)
Please explain in a comment why we need to mark the blocks as complete only when we get a cache hit, and not when there's a cache miss.
Comment updated.
@ormsbee Hello! One of the solutions to make the cache update instantly or much more often is to change the …
@NiedielnitsevIvan: Is that because the collect phase of block transformers takes time to run, so the version you get immediately after making a change is still the course blocks of the old course, and then that gets cached by this new code?
That's right, because the course version changes, and the blocks returned from get_course_outline_block_tree are still old, so cache invalidating by course version doesn't work in this case.
Sorry, I haven't dug into which specific code is the slowest to execute for some time. The last time we needed to troubleshoot these issues for a client was around 2020-2021, before the Learning MFE existed, and in parallel with your input on this Discourse post we got similar results at the time. In addition, we recently tested fetching and processing the course structure to find the optimal course setup for a client; you might find those results insightful: https://raccoongang.atlassian.net/wiki/external/NWJlYzYxYzRlYzRlNGE5YmFkYjkxYmE0ZTdlNTZjOWE.
I believe you were right. Initially I was confused about the waffle flag, as it should be used to turn off caching but was implemented the other way around. Later, Ivan inverted the behavior of the waffle flag, and the performance for loading from cache on the sandbox improved.
That's right, because the course version changes, and the blocks returned from get_course_outline_block_tree are still old, so cache invalidating by course version doesn't work in this case.
In that case, can you grab the cache value from the BlockStructureModel (I think you might have to use the UsageKey of the root Course block instead of the CourseKey)?
edx-platform/openedx/core/djangoapps/content/block_structure/models.py, lines 168 to 184 in 3852358:

data_usage_key = UsageKeyWithRunField(
    'Identifier of the data being collected.',
    blank=False,
    max_length=255,
    unique=True,
)
data_version = models.CharField(
    'Version of the data at the time of collection.',
    blank=True,
    null=True,
    max_length=255,
)
data_edit_timestamp = models.DateTimeField(
    'Edit timestamp of the data at the time of collection.',
    blank=True,
    null=True,
)
And then fall back to the modulestore's root CourseBlock version if that's not available for some reason? Does that resolve the cache invalidation issue?
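A hypothetical sketch of that lookup, using only the ORM fields shown in the quoted model; the construction of the root usage key and the fallback attribute on the modulestore course are assumptions and may need adjusting:

from openedx.core.djangoapps.content.block_structure.models import BlockStructureModel
from xmodule.modulestore.django import modulestore

def get_collected_course_version(course_key):
    # Look up the collected block structure record keyed by the course's root
    # usage key (the exact construction of that key may differ in practice).
    root_usage_key = course_key.make_usage_key('course', 'course')
    record = BlockStructureModel.objects.filter(data_usage_key=root_usage_key).first()
    if record and record.data_version:
        return record.data_version
    # Fall back to whatever version the modulestore reports for the course root.
    course = modulestore().get_course(course_key)
    return getattr(course, 'course_version', None)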
COMPLETABLE_BLOCK_TYPES = {
    block_type for (block_type, block_cls) in XBlock.load_classes()
    if XBlockCompletionMode.get_mode(block_cls) == XBlockCompletionMode.COMPLETABLE
}
Sorry, I should have specified when I gave this example: please don't put this at the module level. You can put it in a function and wrap it in a @functools.cache decorator (or a method and wrap it in @functools.cached_property). But we've had problems in the past where XBlock machinery was being initialized earlier than expected as a side-effect of module-level statements like this, and it was hard to track down. Putting it in a function or method will help make sure we don't actually invoke it until the view itself is executed.
I didn't know about this problem.
Fixed it.
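For illustration, a lazily computed version along the lines of that suggestion might look like this (a sketch, not the PR's exact code):

import functools

from xblock.core import XBlock
from xblock.completable import XBlockCompletionMode

@functools.cache
def completable_block_types():
    """Block types whose completion mode is COMPLETABLE, computed once on first use."""
    return frozenset(
        block_type for (block_type, block_cls) in XBlock.load_classes()
        if XBlockCompletionMode.get_mode(block_cls) == XBlockCompletionMode.COMPLETABLE
    )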
Force-pushed from ef36dca to 9edc20f (Compare)
Please squash your commits and consider looking into the BlockStructureModel thing I mentioned w.r.t. cache invalidation as a followup. Thank you.
Force-pushed from 9edc20f to 8f23703 (Compare)
@NiedielnitsevIvan 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.
2U Release Notice: This PR has been deployed to the edX production environment.
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.
2U Release Notice: This PR has been deployed to the edX production environment.
This PR is part of the following product feature: openedx/platform-roadmap#329
Description
This PR addresses the need to add an API for Sidebar Navigation that returns the course structure with sections, subsections, and units, according to user rights.
To improve the performance of the API, caching of the course structure for the user was added, which makes it much cheaper to calculate the block structure for the user on each request. However, in a heavily loaded LMS with large courses this caching could overflow the cache storage, so the corresponding flag "courseware.disable_navigation_sidebar_blocks_caching" was added so that the caching can be disabled.
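As a sketch of how such a kill switch can be declared and consulted with edx-toggles (assuming a plain, non-course-scoped WaffleFlag; the real flag's scope and wiring may differ):

from edx_toggles.toggles import WaffleFlag

# Assumed declaration of the flag named in the description above.
DISABLE_NAVIGATION_SIDEBAR_BLOCKS_CACHING = WaffleFlag(
    'courseware.disable_navigation_sidebar_blocks_caching', __name__
)

def should_use_sidebar_blocks_cache():
    """Return True unless the caching kill switch is enabled."""
    return not DISABLE_NAVIGATION_SIDEBAR_BLOCKS_CACHING.is_enabled()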
Testing instructions