Skip to content

Latest commit

 

History

History
503 lines (362 loc) · 17.7 KB

ch04.5-rolling-file-support.md

File metadata and controls

503 lines (362 loc) · 17.7 KB

Read Prev

Adding Rolling File Support

The code for the entire chapter can be found at the src/chapter_04.5 directory

Since the beginning of Chapter 4, where we introduced logtar, our own logging library, we have been discussing the rolling file support. This involves creating new files based on the rolling configuration set in config.json or provided via the LogConfig.from_json method. In this chapter, we will finish building this feature.

Rolling features

In our RollingConfig class, we defined 2 private variables - #time_threshold and #size_threshold

// file: lib/config/rolling-config.js

class RollingConfig {
    #time_threshold = RollingTimeOptions.Hourly;
    #size_threshold = RollingSizeOptions.FiveMB;

    ...
}

#time_threshold

This defines how often do we need to create a new file. For example, if the set value is RollingTimeOptions.Hourly that means, we'll be creating a new log file every hour.

#size_threshold

This property tells what should be the maximum size of a log file. If it exceeds the size, we'll simply create a new log file.

Let's start coding.

The rolling_check() method

We're going to introduce a new private method Logger.rolling_check, that is going to take care of the rolling logic.

// file: lib/logger.js

class Logger {
    ...
    /**
     * Checks if the current log file needs to be rolled over.
     */
    async #rolling_check() {
        const { size_threshold, time_threshold } = this.#config.rolling_config;

        const { size, birthtimeMs } = await this.#log_file_handle.stat();
        const current_time = new Date().getTime();

        if (size >= size_threshold || (current_time - birthtimeMs >= time_threshold * 1000)) {
            await this.#log_file_handle.close();
            await this.init();
        }
    }
    ...
}

file_handle.stat()

Let's go through the code line by line:

const { size_threshold, time_threshold } = this.#config.rolling_config;

This line above destructures the size_threshold and time_threshold properties from the rolling_config member object.

const { size, birthtimeMs } = await this.#log_file_handle.stat();

We're destructuring the size and the birthtimeMs properties from the file_handle.stat() method.

Node.js provides us with a stat method on the FileHandle class, that can be used to check various stats of a file. The stat method provides a lot of useful information about the current file/file_handle. Some of them are methods, and some are properties.

Let's take a look at some of the useful information provided by file_handle.stat():

// Returns true if the `file_handle` is a file. Fails if the file handle is
// pointing to a directory instead
file_handle.stat().isFile();

// Returns true if this `file_handle` refers to a socket
file_handle.stat().isSocket();

// Numeric indentifier of the device which contains this file.
file_handle.stat().dev;

// The size of the file in bytes.
file_handle.stat().size;

// Time since this file was last accessed, in `ms`
file_handle.stat().atimeMs;

// Time since the file was last modified, in `ms`
file_handle.stat().mtimeMs;

// Time since the last time it's file status changed.
// File status is a two-byte code that indicates how a file operation completed;
// either successfully, or with some form of error.
file_handle.stat().ctimeMs;

// Time since this file was created, in `ms`
file_handle.stat().birthtimeMs;

// Same methods as above, but the value is in `ns` (nanoseconds)
// 1 ns = 1,000,000 ms
file_handle.stat().atimeNs;
file_handle.stat().mtimeNs;
file_handle.stat().ctimeNs;
file_handle.stat().birthtimeMs;

// Same methods as above, but the type of this is `Date`. The actual javascript Date class
file_handle.stat().atime;
file_handle.stat().mtime;
file_handle.stat().ctime;
file_handle.stat().birthtime;

I've omitted the ones that are not of any use for us.

const current_time = new Date().getTime();

We are creating a new Date object and uses the getTime() method to obtain the current time in milliseconds since epoch.

if (size >= size_threshold || (current_time - birthtimeMs >= time_threshold)) {

We'll be creating a new file:

  • If the size of the log file is greater than or equal to the size_threshold, or
  • If the time difference between current_time and birthtimeMs (file creation time) is greater than or equal to the time_threshold.
await this.#log_file_handle.close();

Closes the file handle after waiting for any pending operation on the handle to complete. This is important. We should make sure if there are any pending operations, or the Node.js runtime hasn't flushed (written) the entire file, we're going to wait till then.

When everything is flushed, we're simply closing the file handle, so nothing in our application will access this file again.

await this.init();

We call the Logger.init method to create the new file again. Here's the code for init() just in case:

async init() {
    const log_dir_path = check_and_create_dir("logs");

    const file_name = this.#config.file_prefix + new Date().toISOString().replace(/[\.:]+/g, "-") + ".log";
    this.#log_file_handle = await fs.open(path.join(log_dir_path, file_name), "a+");
}

Calling the rolling_check method

Where are we going to call this method that we just created? Of-course, it's the Logger.#log method.

class Logger {
    ...
    async #log(message, log_level) {
        if (log_level < this.#config.level || !this.#log_file_handle.fd) {
            return;
        }

        await this.#rolling_check(); // Call the check method.

        // I've extracted all the writing functionality into a separate method
        await this.#write_to_handle(message, log_level);
    }
    ...
}

Here's the abstracted functionality in the Logger.#write_to_handle method.

class Logger {
    ...
     async #write_to_handle(message, log_level) {
        const date_iso = new Date().toISOString();
        const log_level_string = LogLevel.to_string(log_level);

        const log_message = `[${date_iso}] [${log_level_string}]: ${get_caller_info()} ${message}\n`;
        await this.#log_file_handle.write(log_message);
    }
    ...
}

Let's try to run the test.js file again.

// file: test.js

const { LogConfig } = require("./lib/config/log-config");
const { Logger } = require("./lib/logger");

async function init() {
  const logger = Logger.with_config(LogConfig.from_file("config.json"));
  await logger.init();
  return logger;
}

async function main() {
  const logger = await init();
  logger.info("Hello World!\n");
}

main();

Oops...

TypeError: Cannot read properties of undefined (reading 'split')
    at get_caller_info (/Users/ishtmeet/Code/logtard/lib/utils/helpers.js:24:36)
    at Logger.#write_to_handle (/Users/ishtmeet/Code/logtard/lib/logger.js:112:69)
    at Logger.#log (/Users/ishtmeet/Code/logtard/lib/logger.js:89:36)

A big gotcha!

You might have guessed what part of the code crashed. Yes, that's the get_caller_info function defined inside lib/utils/helpers.js. That function is in-charge of collecting the stack trace and returning the line that includes the caller. Here's how that function looks:

// file: lib/utils/helpers.js

function get_caller_info() {
  const error = {};
  Error.captureStackTrace(error);

  // this line fails.
  const caller_frame = error.stack.split("\n")[4];

  const meta_data = caller_frame.split("at ").pop();
  return meta_data;
}

But why? Shouldn't this code work properly? Well, there's a huge part of async in general that you should be aware of. For now let's add a console.log to check that the stack trace is.

function get_caller_info() {
    ...
    console.log(err.stack);
    const caller_frame = error.stack.split("\n")[4];
    ...
}

Here's what the stack trace looks like:

Error
    at get_caller_info (/Users/ishtmeet/Code/logtard/lib/utils/helpers.js:21:11)
    at Logger.#write_to_handle (/Users/ishtmeet/Code/logtard/lib/logger.js:113:69)
    at Logger.#log (/Users/ishtmeet/Code/logtard/lib/logger.js:90:36)

Where are the lines that refer to the logger.info and the main() function? They seem to be missing...

Stack traces across await points

An "asynchronous point" or an "await point" is a concept that refers to the moment when control is handed back to the JavaScript runtime during an asynchronous operation.

This point is marked by the await keyword or when a promise's then() or catch() method is called. At an await point, the runtime can execute other tasks while keeping track of the ongoing asynchronous operation. But this has something going behind the scenes.

When an async function is invoked and it encounters an await expression, the function execution is essentially paused and control is returned to the event loop. The event loop then continues processing other tasks in the queue. When the awaited asynchronous operation completes (e.g., a Promise resolves), the function is resumed from where it was paused.

The function kind of yields back to the runtime, saying - "Hey, please continue with the event loop and continue this function from this line whenever you get that data".

When an error occurs within an asynchronous function, the stack trace that is captured includes information up to the point where the asynchronous operation was initiated (i.e., where the await is placed). However, it does not include the full call stack leading up to the initial invocation of the asynchronous function. This is because the call stack gets unwound as control is returned to the event loop.

If an error occurs within an asynchronous function before the await expression, the stack trace will include information about where the error happened and all the calling functions leading up to that point.

If an error occurs after the await expression, the stack trace will only include information about the current function and the awaited function that was resumed after the asynchronous operation completed. It will not include the calling functions that led to the initial invocation of the asynchronous function.

This behavior is designed to make the program more efficient. But what's causing the issue in our code?

The culprit

// file: lib/logger.js

class Logger {
    ...
    async #log(message, log_level) {
        if (log_level < this.#config.level || !this.#log_file_handle.fd) {
            return;
        }

        // This is causing the issue
        await this.#rolling_check();
        await this.#write_to_handle(message, log_level);
    }

    ...
}

The async method #rolling_check() is the one that's causing the stack trace to trim all the required info.

When the runtime encounters the expression starting with await, it pauses the execution of that function and "unwind the stack" in a non-blocking manner. This means that the method will yield control back to the event loop, allowing other tasks to be processed while the awaited asynchronous operation completes.

So, all the information about who invoked the #log method is dropped. This is done to make the code more efficient. The more information the runtime needs to hold, the more memory it will occupy.

So how do we fix it? It's very simple. Write to the file_handle before we do the rolling_check. Here's the updated code:

// file: lib/logger.js

class Logger {
    ...
    async #log(message, log_level) {
        if (log_level < this.#config.level || !this.#log_file_handle.fd) {
            return;
        }

        await this.#write_to_handle(message, log_level);
        // Make sure we're doing the writing before checking for rolling creation
        await this.#rolling_check();
    }

    ...
}

Let's try to run test.js again. That works!

[2023-08-21T23:28:59.178Z] [INFO]: Logger.info (/Users/ishtmeet/Code/logtard/lib/logger.js:128:18) Hello World!

This is still not accurate. It's telling that the function which was responsible for writing to the file was Logger.info, however it is not correct.

Why is that? It's because we introduce a new method, this.#write_to_handle(), so the line which contains our actual invocation went more line down. To illustrate, let's print our stack trace.

function get_caller_info() {
    ...
    console.log(err.stack);
    const caller_frame = error.stack.split("\n")[4];
    ...
}

The output is:

1    Error
2        at get_caller_info (/Users/ishtmeet/Code/logtard/lib/utils/helpers.js:21:11)
->       at Logger.#write_to_handle (/Users/ishtmeet/Code/logtard/lib/logger.js:113:69)
4        at Logger.#log (/Users/ishtmeet/Code/logtard/lib/logger.js:89:36)
5        at Logger.info (/Users/ishtmeet/Code/logtard/lib/logger.js:128:18)
6        at main (/Users/ishtmeet/Code/logtard/test.js:13:12)

The line marked with -> shows the extra method which was pushed on the stack, because we abstracted all the functionality into that method. To fix that, we're going to update the code to get the 6th line instead of the 5th

// file: lib/utils/helpers.js

function get_caller_info() {
    ...
    const caller_frame = error.stack.split("\n")[5];
    ...
}

Let's test whether it works one last time:

[2023-08-21T23:37:11.919Z] [INFO]: main (/Users/ishtmeet/Code/logtard/test.js:13:12) Hello World!

Everything is working as expected.

Testing the new Log file creation

Update our RollingTimeOptions class to include a new FiveSeconds as a temporary variable, which we are going to remove after testing. Also make sure to add it inside the static assert() method, or this method will reject a value it doesn't know.

// file: lib/utils/rolling-options.js
class RollingTimeOptions {
    static FiveSeconds = 5; // Every 5 seconds
    static Minutely = 60; // Every 60 seconds
    ...

    static assert(time_option) {
        // Add `this.FiveSeconds` for validation
        if (![this.FiveSeconds, this.Minutely, this.Hourly, this.Daily, this.Weekly, this.Monthly, this.Yearly].includes(time_option)) {
            throw new Error(
                `time_option must be an instance of RollingConfig. Unsupported param ${JSON.stringify(time_option)}`
            );
        }
    }
}

The config.json should look like this

// file: config.json

{
    "level": 0, // Log Debug messages and above
    "log_prefix": "LogTar_",
    "rolling_config": {
        "size_threshold": 1048576, // 1 MB:  1024 * 1024
        "time_threshold": 5 // Create a new file every 5 seconds
    }
}

We're going to test the time_threshold for now. I've set it to 5 seconds, so it creates a new log file every time the file_handle.stat().birthtimeMs > 5000.

Inside test.js run a log statement every 1 second:

// file: test.js
...

async function main() {
    const logger = await init()
    setInterval(() => {
        logger.critical("This is critical");
    }, 1000);
}

...

Run the program. You'll notice a new log file created every 5 seconds!

We are going to test the size_threshold rolling option now. Remove the newly created FiveSeconds static member from RollingTimeOptions.

Update your code in test.js:

// file: test.js

async function main() {
  const logger = await init();
  setInterval(() => {
    logger.critical("Hi there");
  }, 20); // run every 20 milliseconds
}

The config.json should specify the size_threshold as 20428 bytes (20 KB).

{
    "level": 0,
    "log_prefix": "LogTar_",
    "rolling_config": {
        // max size should be 20 KB (20 * 1024 bytes)
        "size_threshold": 20480,
        "time_threshold": 60
    }
}

Run the test.js script. You'll notice a new file created every now and then. Let's check the sizes of those log files.

$ ls -al ./logs

.rw-r--r--  10k ishtmeet 22 Aug 06:11 Logtar_2023-08-22T00-41-58-132Z.log
.rw-r--r--  10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-41-59-232Z.log
.rw-r--r--  10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-01-101Z.log
.rw-r--r--  10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-02-139Z.log
.rw-r--r--  10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-03-545Z.log
.rw-r--r--  10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-04-132Z.log
.rw-r--r--  10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-05-382Z.log
.rw-r--r--  10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-06-432Z.log
.rw-r--r--  10k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-07-109Z.log
.rw-r--r-- 3.6k ishtmeet 22 Aug 06:12 Logtar_2023-08-22T00-42-08-138Z.log

Each file is only 10KB, which is exactly what we need. However, the last file is not 10KB because I had to exit the infinite loop caused by setInterval by pressing Ctrl + C.

We now have an actual logging library that can be used with any type of project. However, there is still something we need to take care of: providing a middleware function that can be used.

For instance, many Node.js backend frameworks, such as express, utilize the middleware pattern. In this pattern, incoming requests are passed through a series of functions (the middlewares), which utilize the request, and either reject it or forward it to the next middleware.

To use a logging library, we will print out certain information, such as the IP of the request, the query, pathname, or anything else.

We will ensure that our library is compatible with other frameworks. However, we will address this after we have built enough of our backend framework.

Read Next