-
BTW, this is something that should have improved significantly recently, thanks to the back-end work done by @lamppu.
Yes, displaying time series in general is on our internal agenda as well. I'd also like to see e.g. how analyzer and scan times evolved across runs on a given repository.
-
Gathering statistics of various kinds has always been a goal for ORT Server. Your idea is similar to something I had in mind regarding a general optimization of the database: currently, the schema is rather normalized, and most entities are deduplicated. Especially as the database grows in size, this is likely to have a negative impact on ORT runs, since INSERT operations become slower (due to the additional queries needed to check for the presence of entities during deduplication).

To prevent this, there could be different schemas: a rather temporary one for ORT runs currently in progress, and one storing the relevant results. In the first schema, no deduplication is done; all data gathered during a run is simply inserted. This should be fast, and it should also make it easy for a processing step to access the data of previous steps from the current run. After the completion of the run, or on a periodic schedule, the run schema is transformed asynchronously into the results schema, which is designed to support queries for the relevant statistics efficiently. The transformation can be expensive, but since it runs in the background, users should not be impacted. It would even be possible to use different databases for these purposes, or maybe specialized NoSQL databases if required by use cases.
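To make this a bit more concrete, here is a minimal Kotlin sketch of how such a split could look; all names (`RunStore`, `ResultsStore`, `transformCompletedRun`) are hypothetical and not part of the current code base:

```kotlin
import java.util.UUID

/** Append-only store backing the "run" schema: no deduplication, so INSERTs stay cheap. */
interface RunStore {
    fun insertPackage(runId: UUID, pkg: PackageRecord)    // plain insert, no existence check
    fun packagesOfRun(runId: UUID): List<PackageRecord>   // cheap access for later processing steps
}

/** Deduplicated, query-optimized store backing the "results" schema. */
interface ResultsStore {
    fun upsertPackage(pkg: PackageRecord): Long           // deduplication happens here, off the hot path
    fun linkRunToPackage(runId: UUID, packageId: Long)
}

data class PackageRecord(val identifier: String, val purl: String)

/**
 * Background job: after a run has finished (or on a schedule), move its data from the
 * run schema into the results schema. This can be expensive, but no user is waiting on it.
 */
fun transformCompletedRun(runId: UUID, runStore: RunStore, resultsStore: ResultsStore) {
    for (pkg in runStore.packagesOfRun(runId)) {
        val packageId = resultsStore.upsertPackage(pkg)
        resultsStore.linkRunToPackage(runId, packageId)
    }
}
```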
-
I agree with the general idea of having a denormalized read model for these kinds of requests. For the implementation, we should try to keep it simple; for example, I would like to avoid the mentioned hybrid approach.
-
Motivation
Pain: Currently, the ORT Server UI dashboard displays key statistics such as the number of vulnerabilities, issues, rule violations, and packages. However, retrieving this data from the backend is quite slow, as it requires extensive queries to the database. This results in users having to wait several seconds for the UI activity indicators to disappear before the numbers are displayed.
Once the numbers of vulnerabilities, issues, rule violations, and packages are there, they are nice to look at, but wouldn't it be even nicer to see how these numbers have developed over time, e.g. by displaying a historical trend graph in the background?
From → To: screenshots comparing the current dashboard (plain numbers) with a version showing historical trend graphs (images omitted).
High-Level Solution Idea
Add a basic event log to the system
We could start by introducing a basic event log (as a database table) to track important system activities; a rough table sketch follows below. Initial events could be:
- Whenever a scan is started or finished, an event is written to the event log.
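If I remember correctly, ORT Server's persistence layer is based on the Exposed DSL; assuming that, a minimal sketch of such an event log table might look as follows (table and column names are purely illustrative):

```kotlin
import org.jetbrains.exposed.sql.Table
import org.jetbrains.exposed.sql.javatime.timestamp

// Hypothetical append-only event log; every row records one system activity.
object EventLogTable : Table("event_log") {
    val id = long("id").autoIncrement()
    val eventType = varchar("event_type", 64)               // e.g. "SCAN_STARTED", "SCAN_FINISHED"
    val payload = text("payload")                            // JSON with run / repository identifiers etc.
    val createdAt = timestamp("created_at")
    val processedAt = timestamp("processed_at").nullable()   // null = not yet picked up by the batch job

    override val primaryKey = PrimaryKey(id)
}
```

Keeping the table append-only means writing an event is a single cheap INSERT on the hot path of a run.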
Asynchronously process these events (batch processing)
We could have a periodic job that scans the event log for events that are not yet processed and processes them.
Processing could mean, for example, aggregating the logged events into the pre-calculated statistics used below (e.g. counts of vulnerabilities, issues, rule violations, and packages over time) and storing them in dedicated tables.
It should also be possible to change the batch-processing code, reset the batch-processing database tables, and re-process all (or parts) of the event log. A sketch of such a job is shown below.
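As a rough sketch of that periodic job (the repository interfaces and the daily-count aggregation are just assumptions to illustrate the idea):

```kotlin
import java.time.Instant
import java.time.temporal.ChronoUnit

// Hypothetical shapes; none of these types exist in ORT Server today.
data class Event(val id: Long, val type: String, val payload: String, val createdAt: Instant)

interface EventLogRepository {
    fun fetchUnprocessed(limit: Int): List<Event>
    fun markProcessed(ids: List<Long>, processedAt: Instant)
}

interface StatisticsRepository {
    /** Upsert a pre-aggregated row, e.g. "scans finished per day". */
    fun addToDailyCount(metric: String, day: Instant, delta: Int)
}

/** Periodic job: fold unprocessed events into pre-calculated statistics tables. */
fun processEventBatch(events: EventLogRepository, stats: StatisticsRepository, batchSize: Int = 500) {
    val batch = events.fetchUnprocessed(batchSize)
    if (batch.isEmpty()) return

    batch
        .groupBy { it.type to it.createdAt.truncatedTo(ChronoUnit.DAYS) }
        .forEach { (key, group) ->
            val (type, day) = key
            // Each event contributes a delta, so re-processing the log is only correct
            // after the statistics tables have been reset, as described above.
            stats.addToDailyCount(metric = type, day = day, delta = group.size)
        }

    events.markProcessed(batch.map { it.id }, Instant.now())
}
```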
Optimize UI dashboard performance by using these pre-calculated numbers
To ensure both responsiveness and accuracy, the UI dashboard could use a hybrid approach inspired by the Lambda Architecture: serve the pre-calculated numbers from the batch-processed tables immediately, and top them up with the small delta derived from events the batch job has not processed yet. A sketch of this read path is shown below.
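As a sketch of the hybrid read path (again with made-up repository interfaces and parameters):

```kotlin
// Hypothetical read model combining the batch view with a small real-time delta.
data class DashboardStatistics(
    val vulnerabilities: Long,
    val issues: Long,
    val ruleViolations: Long,
    val packages: Long,
)

interface PrecalculatedStatsRepository {
    /** Cheap single-row lookup of the totals produced by the batch job. */
    fun currentTotals(organizationId: Long): DashboardStatistics
}

interface RecentEventsRepository {
    /** Small delta derived from events the batch job has not processed yet. */
    fun unprocessedDelta(organizationId: Long): DashboardStatistics
}

/** Batch view + speed view: responsive and still accurate, without heavy ad-hoc queries. */
fun dashboardStatistics(
    organizationId: Long,
    precalculated: PrecalculatedStatsRepository,
    recent: RecentEventsRepository,
): DashboardStatistics {
    val base = precalculated.currentTotals(organizationId)
    val delta = recent.unprocessedDelta(organizationId)
    return DashboardStatistics(
        vulnerabilities = base.vulnerabilities + delta.vulnerabilities,
        issues = base.issues + delta.issues,
        ruleViolations = base.ruleViolations + delta.ruleViolations,
        packages = base.packages + delta.packages,
    )
}
```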
Feedback & Next Steps
I’d love to hear your thoughts on this approach. Are there any potential challenges or alternative solutions we should consider? Would this enhancement align with ORT Server’s long-term goals?