Skip to content

Cron interface for generated reports

Ashwin Date edited this page May 8, 2019 · 2 revisions

Background

In several cases, the report needs a complicated query to get the necessary data. In such cases the reports become highly non-performant, since there are complex queries on multiple tables to show aggregated data.

E.g Shika has a “course report” with the below columns -

Course_id | course_name | total enrolled users | users completed the course | total_time_spent by users

Here the total time spent, comes from the lesson_track table, which stores the timespent by each user for each attempt. Total enrolled users count comes from the enrolled_users table, where there are entries for the user_id and course_id. Users completed the course count comes the course_track table, where there are separate entries for course_id and user_id

When there are hundreds of courses with thousands of users, if the report queries are fired in real-time the server will get heavily loaded.

To avoid, this a solution is proposed to run the heavy queries periodically and store the output in a separate table. The report is then shown on this table. This will reduce the load considerably and will allow for complicated reports with no noticeable performance degradation.

Proposed Solution

  • Plugins may implement a install SQL file which creates the necessary tables upon installation

  • Plugins that need this feature will implement a new method “summarize”

  • This method runs the complex queries / processing and stores the normalised output in the table In case the data to normalise is a lot and cannot be processed in a single pass, the summarize method can return a variable containing state information. The reports framework will serialize and save this state information and send it on the next iteration

  • In TJ Reports introduce a new cron CLI tjreports-summarizer.php --plugins=plugin1,plugin2

  • The plugins argument is optional. If values are specified for this array the cron will summarize the data only for those plugins. This will allow the crons to be set in a staggered way so that all the plugins don’t run at once.

  • The cron triggers the function “summarize” from the specified plugins or for all plugins if the plugins variable is not specified. The previously saved state information for the respective is passed to this method.

  • The cron will also save the state information returned by the summarize() method

  • The summarize() method should return false if there are no more records to process.

  • Plugins can also implement a clean() method that will be used to truncate the tables. The method should return an array of table names that will be truncated. To avoid accidentally deleting tables containing authoritative data, only table names starting with #__tjreports_pluginname will be truncated.

Tables

#__tjreports_crons

Column Name Details
id Primary key
plugin varchar(255) (Name of the plugin triggered)
last_executed Datetime (The time when this plugin last executed
cron_state Text (Serialized string containing the state information from the cron run)