Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipeline, execution and schedule views performance issues fix #547

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

eea03
Copy link

@eea03 eea03 commented Nov 9, 2015

Pipeline, execution and schedule views were implemented very uneffective. When creating view (table), hundreds of selects were unnecessary executed in database for such trivial actions as check of permission for buttons, etc.
This is unacceptable in production environment where database is on separate server and thus delays occur.
With such a high count of selects (120 selects for table with 20 entries per page per refresh) we experienced heavy performance issues - refresh of table ~10 seconds.

Simple solution - caches were introduced to store objects which are so often queried.
This way, no redundant selects are executed, performance has gone up rapidly.

However, this is still not optimal solution as these caches are per user (as already caches introduced before) and are influenced by objects that user cannot even see.
There is concept in all these views, that refresher checks if any pipelines / executions have been changed in some time. However, there is no regard to access permissions. If any pipeline / execution has been changed, even if user cannot see it, table and caches are reloaded. In fact only changes of objects that user can access should trigger refresh.

Moreover, cache should be static and general for all users and not per user. It should be loaded at start and then dynamically changed when objects change and then in some larger intervals - not every time somtehing changes in the database.

However, the described problems are much more complicated as there is no architecture to support the right way of doing this.
Introduced caches solve the most pressing performance issues very effectively.

Resolves #546
partly resolves #544

eea03 added 2 commits November 5, 2015 17:00

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
… to improve performance. Until now, unreasonable count of selects were pointlessly generated every time table was refreshed, even though data were not changed and there was no point to reload them
…al selects to retrieve Schedules. Fixed bug with refresher (table wasn't refreshing after schedule detail window opened and closed)
@tomas-knap
Copy link

Just a note to consider ..
Some of those views utilize DbCachedSource

https://github.com/UnifiedViews/Core/blob/feature/pipelineAndExecutionViewsPerformance/frontend/src/main/java/cz/cuni/mff/xrg/odcs/frontend/doa/container/db/DbCachedSource.java

This class in fact already "do" some caching. It may be reasonable to do the changes there rather than add another caching layer for each component separately.

@tomas-knap
Copy link

From eea03: Yes I am aware of the existing cache mechanism. However these caches back up the table data itself and they use the database views and corresponding Java objects (PipelineView / ExecutionView).
The methods for evaluating permissions and others causing so much selects require single Pipeline / Execution objects.
In order to reuse the existing caches we would have to reimplement quite a big amount of code.

I am aware that this is not an optimal solution to have 2 caches but at least in eDemo we need a quick fix of these performance issues so we implemented it this way

@tomas-knap tomas-knap added this to the Release v2.4.0 milestone Dec 8, 2015
@tomas-knap
Copy link

Bug #546 is fixed in commit 76ec644, which is partly based on the adjustments proposedhere. Nevertheless, the whole #547 is not yet accepted as this not only fixed this issue but also introduces caches, which are not aligned with the existing approach for caching scheduled events.

So this pull request is about caching scheduled events, pipelines, pipeline executions. Before accepting in 2.3.X+, it has to be examined, whether it can be better aligned with the existing approach, so that second level of caches is NOT introduced

@tomas-knap tomas-knap removed this from the Release v2.4.0 milestone Dec 27, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants