Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review performance #338

Closed
tcompa opened this issue Nov 21, 2023 · 11 comments
Closed

Review performance #338

tcompa opened this issue Nov 21, 2023 · 11 comments

Comments

@tcompa
Copy link
Collaborator

tcompa commented Nov 21, 2023

Just for-the-record, here are some rough timings for loading three pages based on the test DB we are using internally:

User's jobs page
3.65 MB, 1500 ms

User's projects page
1.24 MB, 700 ms

Workflow 63
85 KB, 860 ms

We should start some more systematic profiling. The first goal is to understand what is the overhead on top of the REST API calls. Depending on where most of the time is spent, we can come up with multiple improvements. A first brainstorming:

  • Use pagination
  • Use Slim versions of the response models (in fractal-server), when endpoints return lists of objects that may be long
  • Make parallel calls to fractal-server (e.g. in the jobs page, where there are three of them for fetching jobs workflows and datasets)
  • Make DB data more homogeneous, to reduce time spent in custom handling of missing/invalid data (ref Prepare script for updating DB between 1.4.0 and 1.4.1 fractal-server#1010)
  • ... (profile the page loading, and then decide where improvements are useful) ...
@zonia3000
Copy link
Collaborator

The user's job page performs 4 API calls with the following timing:

  • /api/v1/project: 162 ms
  • /api/v1/job: 268 ms
  • /api/v1/workflow: 195 ms
  • /api/v1/dataset: 219 ms

Total API calls time: 844ms

Attempting to run the API calls in parallel didn't result in an appreciable performance change.

I've run the Firefox profiler but no specific issue was detected. The component having the bigger impact was the TimestampBadge (11ms). Since it had a very limited logic I moved it directly inside the table, but this of course is not a major improvement.

However, I added the spinner and I think it improves the user experience when the page takes some time to load.

@tcompa
Copy link
Collaborator Author

tcompa commented Nov 21, 2023

Thanks!

Total API calls time: 844ms

Could you also report the total page-loading time?
Is it similar to the 1500 ms reported above?

@zonia3000
Copy link
Collaborator

Page loading with last commit in jobs branch for me is around 1.2-1.3 seconds

@tcompa
Copy link
Collaborator Author

tcompa commented Nov 22, 2023

Page loading with last commit in jobs branch for me is around 1.2-1.3 seconds

OK, thanks. Then there are about 400 ms spent in fractal-web, right?

Question: can you already say whether this will improve when all jobs have correct properties input_dataset_dump, output_dataset_dump and workflow_dump? Or is it unrelated?

(for context: the goal here is not to extra-optimize the loading of this specific page, but rather to get started with a broader discussion of where the next performance bottleneck will come from)

@zonia3000
Copy link
Collaborator

We have to consider separately the time spent by Svelte backend and the time spent by frontend.

Here a profiling of the backend:

profiling-backend

As you can see most of the time is spent waiting for a response (I forgot to consider also the whoami call). The heaviest part is the final page rendering (120ms).

Then the browser does some work too. Here is a profiling of the frontend:

Screenshot from 2023-11-22 10-03-37

Most of the work of the JobList component is composed by if and foreach evaluation and I don't see an easy way to make it faster.

JSON parse have a bigger impact on the backend than in the frontend, but I don't think that creating lighter payloads will be a huge improvement. I'm more in favor of introducing pagination, as I think it will become necessary at some point in the future when an instance will have a lot of old jobs in the database, but for me the biggest point here is reducing the number of calls. Could we consider to have a special endpoint for this? I can also do some more tests with parallel calls.

@zonia3000
Copy link
Collaborator

I think we could also consider to remove the whoami call that is done for each page and redirect to login whenever an API endpoint returns 401.

@tcompa
Copy link
Collaborator Author

tcompa commented Nov 22, 2023

Thanks for the profiling!

JSON parse have a bigger impact on the backend than in the frontend, but I don't think that creating lighter payloads will be a huge improvement.

Good to hear.

I'm more in favor of introducing pagination, as I think it will become necessary at some point in the future when an instance will have a lot of old jobs in the database,

Agreed. This may not come up very soon, on the fractal-server side, but we'll need to explore it.

but for me the biggest point here is reducing the number of calls. Could we consider to have a special endpoint for this?

OK, let's see what we can do here. Comments based on our discussion 5 minutes ago:

  1. Let's keep /api/v1/project/, for the moment, even if it's only used to associate a project ID to its name
  2. Let's remove /api/v1/workflow/ and /api/v1/dataset/, in view of Prepare script for updating DB between 1.4.0 and 1.4.1 fractal-server#1010. For the moment we can stick with an "undefined" name (because of missing/invalid data in the DB).
  3. Let's postpone the option of removing the whoami call.

Removing two API calls will have a significant impact, and that's all we need for the current scale (that is, number of jobs/projects/datasets/workflows).

@tcompa
Copy link
Collaborator Author

tcompa commented Nov 22, 2023

It would be great to see a similar profiling also for the other two pages

  1. user's projects
  2. a specific workflow

Maybe we'll just end up saying "that's fine for now", but at least we have a baseline for future improvement.

@zonia3000
Copy link
Collaborator

Removing the 2 calls to workflows and datasets the jobs page takes 555 ms

@jluethi
Copy link
Collaborator

jluethi commented Nov 23, 2023

That's awesome, thanks a lot for this overview! And what an improvement with those 2 dropped calls!

Also, +1 for testing user's projects & specific workflow pages. That would have caught the issue with the scaling with the square of elements we had a few weeks back and great if we get a baseline for current performance on some real-life data :)

@tcompa
Copy link
Collaborator Author

tcompa commented Dec 4, 2023

Closing in favor of #361

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants