Recommendation about logging apps #368
Replies: 3 comments
-
Hi @anthonycorbacho. Good question. Logging is an area where I still don't feel like I have great rules for what should or shouldn't be done. I'll try to give you a rough cut of where I stand, though. I generally stick with the global logger.

I've previously just logged when things go wrong and used metrics to track progress for things that go right (and wrong, actually). However, it's a lot easier to just peek at logs, so I've started logging when things are successful too. For example, Litestream previously just logged errors, but recently I changed it so Litestream logs whenever a WAL segment is uploaded. It lets users see that Litestream is actually doing something, and it helps track down issues later since I have a better timeline.

I feel like it mostly comes down to trial and error. Add logging when you think it'll be useful and remove it when it's not. Logging isn't for detecting issues, so it's not a big deal if you have a lot of logs. If you need to detect or report issues, then you should use metrics for that.

I hope that helped. Let me know if you have any specific questions about it and I'll try to follow up.
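
The split described above (logs for a readable timeline, metrics for detection) could look roughly like the sketch below. It is not Litestream's code: `uploadSegment`, the counter names, and the fake upload functions are all made up for illustration, and `expvar` is just one stdlib way to expose counters.

```go
package main

import (
	"errors"
	"expvar"
	"log"
)

// Counters are what monitoring would watch. The names are hypothetical.
var (
	uploadsOK     = expvar.NewInt("segment_uploads_ok")
	uploadsFailed = expvar.NewInt("segment_uploads_failed")
)

// errConnReset is a fake failure used by the demo below.
var errConnReset = errors.New("connection reset")

// uploadSegment wraps a hypothetical upload function. It logs both outcomes
// so a human gets a timeline, and bumps a counter so problems can be
// detected without parsing logs.
func uploadSegment(name string, upload func(string) error) error {
	if err := upload(name); err != nil {
		uploadsFailed.Add(1)
		log.Printf("segment upload failed: name=%s err=%v", name, err)
		return err
	}
	uploadsOK.Add(1)
	log.Printf("segment uploaded: name=%s", name)
	return nil
}

func main() {
	ok := func(string) error { return nil }
	bad := func(string) error { return errConnReset }

	_ = uploadSegment("00000001.wal", ok)
	_ = uploadSegment("00000002.wal", bad)

	log.Printf("ok=%d failed=%d", uploadsOK.Value(), uploadsFailed.Value())
}
```

The success log is cheap to remove later if it turns out to be noise; the counters stay either way.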
-
Hi @benbjohnson, thank you for your nice answer. I am in the same pickle: I am using telemetry (tracing, metrics, and logs). The way I am going to experiment with logs is to have them tell a story about the endpoint and the call stack the request goes through, tied to a trace ID (from OpenTelemetry). Not sure if it's the best way, but I think it should give enough useful context to help understand what is going on. Combined with trace<>log, of course.
-
Not sure if it will be useful, but this is the guideline I am defining for my team.

Of all telemetry signals, logs probably have the biggest legacy. Most programming languages have built-in logging capabilities or well-known, widely used logging libraries. But there is no standard or community guideline that defines how and when an application should log, so it is often challenging to decide what to log.

The main purpose of the application log is to give enough signals and context to tell the complete story of a user request in our system. This is, in essence, the philosophy behind an application log: give meaningful context about what happened, not only within a specific service but globally via a trace context.

Why we log

A log is a collection of messages. It is often written to disk, and sometimes streamed somewhere else before being written to disk. It is usually line-delimited, and each message usually comes with a timestamp attached. Individual messages may or may not be related to each other.

The content of a message is represented using a structured format of keys and values instead of natural language. It may be structured in a variety of ways: positional fields, JSON, key=val, key:val, BSON, etc.

Logs, at their core, are simply a way to tell a story about a structured (information-carrying) unit of work in the application at a given time.

What is a unit of work?

Treating a log as a unit of work lets you adjust what it means depending on the goals of the observer. Sometimes a unit of work is accepting an HTTP request and doing everything necessary to hand back a response, but sometimes one HTTP request can generate many events/logs. It is a fluid relationship: you zoom in on troubled areas, then back out again to understand the overall behavior of the entire service.

A log for a unit of work should record the input necessary to perform the work, attributes computed, resolved, or discovered along the way, the conditions of the service as it was performing the work, and finally some details on the result of that work. In practice we can't include all of that for every unit of work, because it is usually not in the interest of the business to keep copies of all that data.

Who consumes our logs

Unlike errors, logs (or event logs) have a defined consumer role: the operator. In this role, you typically want to see as much information as possible about the unit of work. Correlating logs with traces and propagating log context across services helps the operator follow the story of a user request and move precisely from a log search to a trace search and back. The operator will likely use LogQL to query logs.

Working with logs effectively

Our Go project layout is generally composed of 3 distinct components: the Handler layer, the Root layer, and the Dependency layer.

Each layer has a different responsibility, so logs should be handled differently in each one. Depending on the layer and the situation, logging can be skipped or enforced.

Log at Handler layer

The handler layer is the entry point of the service; this is where the unit of work for the user request starts. It is recommended to add an Info log (or Debug, if the endpoint is too noisy) at the beginning of the function and to log any error that might happen.

    func (h *myGrpcHandler) Fetch(ctx context.Context, req v1.Request) (*v1.Response, error) {
        // ... perform validation and setup tracing

        // Initial entry-point log announcing that we are
        // starting the user request.
        h.log.Info(
            ctx,
            "fetching data",
            log.Int("id", req.ID),
        )

        d, err := h.myService.Fetch(ctx, req.ID)
        if err != nil {
            // Signal that an error happened in this call scope.
            h.log.Error(
                ctx,
                "fetching data",
                log.Err(err),
                log.Int("id", req.ID),
            )

            // Defined error
            if errors.Is(err, myapp.ErrNotFound) {
                return nil,
                    errors.Status(
                        codes.NotFound,
                        "couldn't get data because not found",
                        &errdetails.ErrorInfo{
                            Reason: "NOT_FOUND",
                            Metadata: map[string]string{
                                // fmt.Sprint renders the numeric ID as text;
                                // string(req.ID) would yield a rune instead.
                                "id": fmt.Sprint(req.ID),
                            },
                        })
            }

            // Undefined error
            return nil,
                errors.Status(
                    codes.Internal,
                    errors.Unwrap(err),
                    &errdetails.ErrorInfo{
                        Reason: "SERVER_ERROR",
                        Metadata: map[string]string{
                            "id": fmt.Sprint(req.ID),
                        },
                    })
        }
        // ...
    }

Log at Root layer

In our application layout, the Root layer is the middleman that constructs the request and response and owns the business logic. Logging at this layer is optional and depends on what additional information this layer can provide.

    func (srv *MyService) Fetch(ctx context.Context, id int) (*Struct, error) {
        // ... perform validation and setup tracing

        // <Optional> We can add a Debug log in case we want to be able
        // to enable debugging.
        // srv.log.Debug(ctx, "Fetch: fetching data", log.Int("user.id", id))
        d, err := srv.client.Fetch(ctx, id)
        if err != nil {
            // No logging here: the error already carries its context,
            // so a log line would add no information.
            return nil, errors.Wrap(err, "srv.fetch")
        }

        // Logging might be useful here because we are adding more context.
        srv.log.Info(
            ctx,
            "Fetch: fetching data metadata",
            log.Int("metadata.id", d.Id),
            log.Int("user.id", id),
        )
        d2, err := srv.c2.FetchMetadata(ctx, d.Id, id)
        if err != nil {
            return nil, errors.Wrap(err, "myservice.fetch")
        }
        // ...
    }

Log at Dependency layer

In our application layout, this is the lowest level: this is where we call an external resource (database, third party, or application client).

    func (c *Client) Fetch(ctx context.Context, id int) (*MyData, error) {
        // <Optional> We can add a Debug log in case we want to be able
        // to enable debugging.
        // c.log.Debug(ctx, "Fetch: fetching data from database XYZ", log.Int("user.id", id))
        d, err := c.Select(ctx, id)
        if err != nil {
            // An error happened while calling the external service.
            // We log it here to enforce the correlation between log and trace
            // at this level, which gives us an easy way to jump between trace <> log.
            c.log.Error(
                ctx,
                "Fetch: fail fetching data from database XYZ",
                log.Err(err),
                log.Int("user.id", id),
            )

            // Business-defined error
            if errors.Is(err, sql.ErrNoRows) {
                return nil, myapp.ErrNotFound
            }

            // Undefined error
            return nil, errors.Wrap(err, "sql.client.fetch")
        }
        return convert(d), nil
    }
-
Hello 👋
I really enjoy referencing this application and am learning a lot from how you define layers (handler, domain, dependency).
One question remained and I would love to get your opinion about logging.
I saw there are simple logs in the wtf application (HTTP package and other places), but I don't see much definition or guideline regarding the actual business logic (root + dependencies layers).
Do you have any recommendations, or maybe a blog post that I have missed on https://www.gobeyond.dev?
Thank you.