Recommendation about logging apps #368

anthonycorbacho · 2022-05-06T08:34:00Z

Hello 👋

I really enjoy referencing this application and have learned a lot from how you define layers (handler, domain, dependency).
One question remains, and I would love to get your opinion about logging.

I saw there are simple logs in the wtf application (HTTP package and other places), but I don't see much definition or guidance regarding the actual business logic (root + dependencies layer).
Do you have any recommendations? Or maybe a blog post on https://www.gobeyond.dev that I have missed?

Thank you.

benbjohnson (Maintainer) · 2022-05-06T13:47:43Z

Hi @anthonycorbacho. Good question. Logging is an area where I still don't feel like I have great rules for what should or shouldn't be done. I'll try to give you a rough cut of where I stand though.

I generally stick with the global log package. I've tried JSON-formatted logs and scoped logs and zap loggers but they're usually overkill. I'm not typically parsing my logs—I'm just looking at them when something goes wrong.

I've previously just logged when things go wrong and used metrics to track progress for things that go right (and wrong actually). However, it's a lot easier to just peek at logs so I've started logging when things are successful too. For example, Litestream previously just logged errors but recently I changed it so Litestream logs whenever a WAL segment is uploaded. It lets users see that Litestream is actually doing something and it helps track down issues later since I have a better timeline.
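As a rough illustration of logging both outcomes with the global log package, here is a minimal sketch. The uploadSegment and describeUpload functions and the segment names are hypothetical stand-ins, not Litestream code.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// segment is a hypothetical unit of work to upload.
type segment struct {
	name string
	size int
}

// uploadSegment is a hypothetical stand-in for real work such as
// shipping a WAL segment to remote storage. It fails on empty segments.
func uploadSegment(name string, size int) error {
	if size == 0 {
		return errors.New("empty segment")
	}
	return nil
}

// describeUpload builds the log line for one upload attempt,
// covering both the success and the failure case.
func describeUpload(name string, size int, err error) string {
	if err != nil {
		return fmt.Sprintf("segment upload failed: name=%s err=%v", name, err)
	}
	return fmt.Sprintf("segment uploaded: name=%s size=%d", name, size)
}

func main() {
	segments := []segment{{"00000001.wal", 4096}, {"00000002.wal", 0}}
	for _, s := range segments {
		err := uploadSegment(s.name, s.size)
		// Log successes as well as failures so the timeline is complete.
		log.Print(describeUpload(s.name, s.size, err))
	}
}
```

Logging the success path is what produces the timeline: when a failure shows up later, the preceding lines show what was last known to work.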

I feel like it mostly comes down to trial-and-error. Add logging when you think it'll be useful and remove it when it's not. Logging isn't for detecting issues so it's not a big deal if you have a lot of logs. If you need to detect or report issues then you should use metrics for that.
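One way to keep detection in metrics while still logging for humans is a pair of counters next to the log calls. This is only a sketch using the standard library's expvar package; the counter names and the recordUpload helper are assumptions made for illustration.

```go
package main

import (
	"errors"
	"expvar"
	"log"
)

// Counters published through expvar. The names are hypothetical.
var (
	uploadsOK     = expvar.NewInt("uploads_ok")
	uploadsFailed = expvar.NewInt("uploads_failed")
)

// errConnReset is a sample failure used by main.
var errConnReset = errors.New("connection reset")

// recordUpload bumps the metric used for detection and alerting,
// and writes a log line that a human can read later.
func recordUpload(err error) {
	if err != nil {
		uploadsFailed.Add(1)
		log.Printf("upload failed: %v", err)
		return
	}
	uploadsOK.Add(1)
	log.Print("upload succeeded")
}

func main() {
	recordUpload(nil)
	recordUpload(errConnReset)
	log.Printf("ok=%d failed=%d", uploadsOK.Value(), uploadsFailed.Value())
}
```

An alert would then watch uploads_failed, while the log lines stay free-form and can be added or removed without breaking anything.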

I hope that helped. Let me know if you have any specific questions about it and I'll try to follow up.

anthonycorbacho (Author) · 2022-05-09T00:55:01Z

Hi @benbjohnson

Thank you for your nice answer. I am in the same pickle: I am using telemetry (tracing, metrics, and logs).
While the first two are pretty simple to "standardize", I found that there is no common way to define useful logging for our system.

I think the way I am going to experiment with logs is to tell a story about the endpoint and the call stack the request goes through, and tie it to a trace ID (from OpenTelemetry). Not sure if it's the best way, but I think it should give enough useful context to help understand what is going on. Combined with trace<>log correlation, of course.
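The idea of tying every log line to a trace ID can be sketched with only the standard library. The context key, the helper names, and the sample ID below are assumptions; in a real setup the ID would come from the OpenTelemetry span context rather than a hand-rolled key.

```go
package main

import (
	"context"
	"fmt"
	"log"
)

// traceIDKey is an unexported context key type (hypothetical).
type traceIDKey struct{}

// newRequestContext returns a context carrying the given trace ID.
func newRequestContext(traceID string) context.Context {
	return context.WithValue(context.Background(), traceIDKey{}, traceID)
}

// traceIDFrom extracts the trace ID, or "unknown" if none was set.
func traceIDFrom(ctx context.Context) string {
	if id, ok := ctx.Value(traceIDKey{}).(string); ok {
		return id
	}
	return "unknown"
}

// formatLine prefixes a message with the trace ID found in ctx.
func formatLine(ctx context.Context, msg string) string {
	return fmt.Sprintf("trace_id=%s msg=%q", traceIDFrom(ctx), msg)
}

// handle and loadUser stand in for two layers of the call stack.
func handle(ctx context.Context) {
	log.Print(formatLine(ctx, "handler: fetching user"))
	loadUser(ctx)
}

func loadUser(ctx context.Context) {
	log.Print(formatLine(ctx, "store: selecting user row"))
}

func main() {
	ctx := newRequestContext("564c51f4a2726f1dd15c44cd9119668e")
	handle(ctx)
}
```

Both lines carry the same trace_id, so searching the logs for that ID reconstructs the path the request took.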

anthonycorbacho (Author) · 2022-05-10T05:00:31Z

Not sure if it will be useful, but this is the guideline I am defining for my team:

Of all telemetry signals, logs probably have the longest legacy. Most programming languages have built-in logging capabilities or well-known, widely used logging libraries.

Because there is no standard or community guideline that defines how and when to log in an application, it is often challenging to decide what to log.

The main purpose of the application log is to give enough signals and context to tell the complete story of a user request in our system.

This is, in essence, the philosophy behind an application log: give enough meaningful context about what happens, not only within a specific service but globally via a trace context.

Why we log

A log is a collection of messages, often written to disk and sometimes streamed somewhere else first. Messages are usually line-delimited and usually come with a timestamp attached. Individual messages may or may not be related to each other. The content of a message is represented using a structured format of keys and values instead of natural language. It may be structured in a variety of ways: positional fields, JSON, key=val, key:val, BSON, etc.

Logs, at their core, are simply a way to tell a story about a structured (information-carrying) unit of work in the application at a given time.

What is a unit of work?

Treating a log as a unit of work lets you adjust what it means depending on the goals of the observer. Sometimes a unit of work is accepting an HTTP request and doing everything necessary to hand back a response, but sometimes one HTTP request can generate many events/logs. The relationship is fluid: we zoom in on troubled areas and back out again to understand the overall behavior of the entire service. A log should record the input necessary to perform the work, attributes computed, resolved, or discovered along the way, the conditions of the service as it was performing the work, and finally some details on the result of that work. Most of the time we can't include all of that for everything, because it is usually not in the interests of the business to maintain copies of all that data.

{
  "Severity":"Info",
  "Timestamp":1651792770814931000,
  "Body":"getting user information",
  "Attributes":{
    "user.id": "users/123456",
    "request.path": "/v1/users",
    "request.method": "GET"
  },
  "TraceId": "564c51f4a2726f1dd15c44cd9119668e",
  "SpanId": "6fcecffb4352e673"
}
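A record shaped like the one above could be produced with a small struct and encoding/json. This is a sketch only: the LogRecord type and the encode helper are assumptions for illustration, not part of any logging library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LogRecord mirrors the fields of the example record (hypothetical type).
type LogRecord struct {
	Severity   string            `json:"Severity"`
	Timestamp  int64             `json:"Timestamp"` // Unix time in nanoseconds
	Body       string            `json:"Body"`
	Attributes map[string]string `json:"Attributes"`
	TraceID    string            `json:"TraceId"`
	SpanID     string            `json:"SpanId"`
}

// encode renders a record as a single JSON line.
func encode(r LogRecord) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	line, err := encode(LogRecord{
		Severity:  "Info",
		Timestamp: 1651792770814931000,
		Body:      "getting user information",
		Attributes: map[string]string{
			"user.id":        "users/123456",
			"request.path":   "/v1/users",
			"request.method": "GET",
		},
		TraceID: "564c51f4a2726f1dd15c44cd9119668e",
		SpanID:  "6fcecffb4352e673",
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(line)
}
```

Note that encoding/json writes map keys in sorted order, so the attributes may appear in a different order than in the hand-written example.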

Who consumes our Logs

Unlike errors, logs (or event logs) have a defined consumer role: the operator.

In this role, you typically want to see as much information as possible about the unit of work. Correlating logs with traces and propagating the distributed context through the logs helps the operator follow the story of a user request, and makes it possible to jump from a precise log search to the matching trace and vice versa.

The operator will likely use LogQL to query logs.

Working with logs effectively

Our Go project layout is generally composed of 3 distinct components:

Handler (HTTP, GRPC)
Root (business domain)
Dependency (database, 3rd party client)

Each layer has a different responsibility, so logs should be handled differently in each. Depending on the layer and the situation, logging can be required or skipped.

Log at Handler layer

The handler layer is the entry point of the service. This is where the unit of work for the user request starts.

We recommend logging at Info level (or Debug, if the endpoint is too noisy) at the beginning of the function, and logging any error that occurs.

// Note: h.log, log.Int, log.Err and errors.Status are project-specific
// helpers, not the standard library packages of the same name.
func (h *myGrpcHandler) Fetch(ctx context.Context, req *v1.Request) (*v1.Response, error) {
  // ... perform validation and set up tracing

  // Initial entry-point log announcing that we are
  // starting the user request.
  h.log.Info(
    ctx,
    "fetching data",
    log.Int("id", req.ID),
  )

  d, err := h.myService.Fetch(ctx, req.ID)
  if err != nil {
    // Signal that an error happened within this call scope.
    h.log.Error(
      ctx,
      "fetching data failed",
      log.Err(err),
      log.Int("id", req.ID),
    )

    // Defined (business) error.
    if errors.Is(err, myapp.ErrNotFound) {
      return nil,
        errors.Status(
          codes.NotFound,
          "couldn't get data because it was not found",
          &errdetails.ErrorInfo{
            Reason: "NOT_FOUND",
            Metadata: map[string]string{
              // Assumes req.ID is an int; string(req.ID) would
              // produce a rune, not the decimal representation.
              "id": strconv.Itoa(req.ID),
            },
          })
    }
    // Undefined error: return a generic message, since the
    // details were already logged above.
    return nil,
      errors.Status(
        codes.Internal,
        "couldn't get data because of an internal error",
        &errdetails.ErrorInfo{
          Reason: "SERVER_ERROR",
          Metadata: map[string]string{
            "id": strconv.Itoa(req.ID),
          },
        })
  }

  // ...
}
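The same handler-layer pattern (one log at entry, one log on error, then a mapping from domain error to status code) can be sketched as a runnable example with only the standard library. It swaps gRPC for plain net/http, and the fetch, statusFor, handleFetch and serve names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// ErrNotFound is a hypothetical domain error.
var ErrNotFound = errors.New("not found")

// fetch stands in for the root-layer call; only id "42" exists.
func fetch(id string) (string, error) {
	if id != "42" {
		return "", fmt.Errorf("service.fetch: %w", ErrNotFound)
	}
	return "some data", nil
}

// statusFor maps a domain error to an HTTP status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrNotFound):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

// handleFetch logs once at entry and once on error.
func handleFetch(w http.ResponseWriter, r *http.Request) {
	id := r.URL.Query().Get("id")
	log.Printf("fetching data: id=%s", id)

	data, err := fetch(id)
	if err != nil {
		log.Printf("fetching data failed: id=%s err=%v", id, err)
		code := statusFor(err)
		http.Error(w, http.StatusText(code), code)
		return
	}
	fmt.Fprintln(w, data)
}

// serve runs the handler against an in-memory request and
// returns the response status code.
func serve(target string) int {
	rec := httptest.NewRecorder()
	handleFetch(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	log.Printf("status for id=42: %d", serve("/v1/data?id=42"))
	log.Printf("status for id=7: %d", serve("/v1/data?id=7"))
}
```

The client only ever sees the status text, while the full wrapped error stays in the log.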

Log at Root layer

In our application layout, the Root layer is the middleman that builds the request and response and owns the business logic.

Logging at this layer is optional and depends on the additional information that can be provided from this layer.

func (srv *MyService) Fetch(ctx context.Context, id int) (*Struct, error) {
  // ... perform validation and set up tracing

  // <Optional> We can add a Debug log in case we want the
  // ability to enable debugging.
  // srv.log.Debug(ctx, "Fetch: fetching data", log.Int("user.id", id))

  d, err := srv.client.Fetch(ctx, id)
  if err != nil {
    // No logging here: the wrapped error already carries the
    // context, so a log line would add no information.
    return nil, errors.Wrap(err, "myservice.fetch")
  }

  // Logging might be useful here because we are adding more context.
  srv.log.Info(
    ctx,
    "Fetch: fetching data metadata",
    log.Int("metadata.id", d.Id),
    log.Int("user.id", id),
  )
  d2, err := srv.c2.FetchMetadata(ctx, d.Id, id)
  if err != nil {
    return nil, errors.Wrap(err, "myservice.fetch")
  }

  // ...

}

Log at Dependency layer

In our application layout, this is the lowest level. This is where we call an external resource (database, 3rd party, or application client).
Therefore, logging at this level is very important because it gives the operator more context about what is happening in the unit of work.

func (c *Client) Fetch(ctx context.Context, id int) (*MyData, error) {
  // <Optional> We can add a Debug log in case we want the
  // ability to enable debugging.
  // c.log.Debug(ctx, "Fetch: fetching data from database XYZ", log.Int("user.id", id))

  d, err := c.Select(ctx, id)
  if err != nil {
    // An error happened while calling the external service.
    // We log it at this level to enforce the correlation between log
    // and trace, which gives us an easy way to jump between trace <> log.
    c.log.Error(
      ctx,
      "Fetch: failed fetching data from database XYZ",
      log.Err(err),
      log.Int("user.id", id),
    )

    // Case of a business-defined error.
    if errors.Is(err, sql.ErrNoRows) {
      return nil, myapp.ErrNotFound
    }
    // Case of an undefined error.
    return nil, errors.Wrap(err, "sql.client.fetch")
  }

  return convert(d), nil
}
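The error translation done at this layer can be sketched on its own with the standard library, using errors.Is and %w wrapping so that upper layers can still match the domain error. The translate, isNotFound and simulateMissingRow helpers are hypothetical names introduced for this example.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
	"log"
)

// ErrNotFound is a hypothetical domain error exposed to upper layers.
var ErrNotFound = errors.New("not found")

// translate maps a low-level storage error to a domain error.
// Unknown errors are wrapped with %w so the cause stays inspectable.
func translate(err error) error {
	switch {
	case err == nil:
		return nil
	case errors.Is(err, sql.ErrNoRows):
		return ErrNotFound
	default:
		return fmt.Errorf("sql.client.fetch: %w", err)
	}
}

// isNotFound reports whether err is, or wraps, ErrNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

// simulateMissingRow returns the kind of wrapped error a query
// helper might produce when no row matches.
func simulateMissingRow() error {
	return fmt.Errorf("select user: %w", sql.ErrNoRows)
}

func main() {
	err := translate(simulateMissingRow())
	log.Printf("not found: %v", isNotFound(err))

	err = translate(errors.New("connection refused"))
	log.Printf("not found: %v, err: %v", isNotFound(err), err)
}
```

Using errors.Is instead of a direct comparison means the check still works when the driver or a helper has wrapped sql.ErrNoRows.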
