[KYUUBI apache#6861] Configuration guide of structured logging for Kyuubi server

### Why are the changes needed?

Users commonly want to send service logs in a structured format to Kafka and then collect them into a centralized log service for further analysis. Fortunately, Log4j2, the logging framework used by Kyuubi, has a built-in [KafkaAppender](https://logging.apache.org/log4j/2.x/manual/appenders/message-queue.html#KafkaAppender) and [JSON Template Layout](https://logging.apache.org/log4j/2.x/manual/json-template-layout.html), so this can be achieved with just a few configuration changes.

To simplify the user setup steps, this PR adds `log4j-layout-template-json-<version>.jar` to the Kyuubi binary tarball.

PS: I also plan to support sending the logs of the engine bootstrap process (e.g. `spark-submit`) to Kafka with specific labels in follow-up PRs.

### How was this patch tested?

Manually tested.

Configuration in `$KYUUBI_HOME/conf/log4j2.xml`
```xml
<Configuration status="INFO">
  <Appenders>
    <Kafka name="kafka" topic="ecs-json-logs" syncSend="false">
      <JsonTemplateLayout>
        <EventTemplateAdditionalField key="app" value="kyuubi"/>
        <EventTemplateAdditionalField key="cluster" value="hadoop-testing"/>
        <EventTemplateAdditionalField key="host" value="${hostName}"/>
      </JsonTemplateLayout>
      <Property name="bootstrap.servers" value="kafka-1:9092,kafka-2:9092,kafka-3:9092"/>
      <Property name="compression.type" value="gzip"/>
    </Kafka>
  </Appenders>
  <Loggers>
    <Root level="INFO">
      <AppenderRef ref="kafka"/>
    </Root>
  </Loggers>
</Configuration>
```

Check that Kafka receives the expected structured log messages in the Elastic Common Schema (ECS) layout.
![Xnip2024-12-25_03-18-52](https://github.com/user-attachments/assets/e1b5853a-3800-4363-8ce4-7e78d0928c6a)
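
When reproducing this test, a variant along the following lines may be easier to debug. It is only a sketch: the `stdout` console appender, its pattern, and the reduced set of additional fields are assumptions rather than part of this patch. It keeps a local copy of the logs, names `classpath:EcsLayout.json` (the template that JSON Template Layout uses by default) explicitly, and routes the Kafka client's own `org.apache.kafka` loggers away from the Kafka appender, which the Log4j2 manual suggests as a guard against recursive logging.

```xml
<Configuration status="INFO">
  <Appenders>
    <!-- Assumed local appender: keeps logs readable on the host if Kafka is unreachable -->
    <Console name="stdout" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{HH:mm:ss.SSS} %p %c: %m%n"/>
    </Console>
    <Kafka name="kafka" topic="ecs-json-logs" syncSend="false">
      <!-- EcsLayout.json is the default template; spelled out here only for clarity -->
      <JsonTemplateLayout eventTemplateUri="classpath:EcsLayout.json">
        <EventTemplateAdditionalField key="app" value="kyuubi"/>
      </JsonTemplateLayout>
      <Property name="bootstrap.servers" value="kafka-1:9092,kafka-2:9092,kafka-3:9092"/>
    </Kafka>
  </Appenders>
  <Loggers>
    <!-- Kafka client logs go to the console only, never back into the Kafka appender -->
    <Logger name="org.apache.kafka" level="INFO" additivity="false">
      <AppenderRef ref="stdout"/>
    </Logger>
    <Root level="INFO">
      <AppenderRef ref="stdout"/>
      <AppenderRef ref="kafka"/>
    </Root>
  </Loggers>
</Configuration>
```

With `additivity="false"`, events from `org.apache.kafka` stop at that logger and are not forwarded to the root logger's appenders.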

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#6861 from pan3793/structured-logging.

Closes apache#6861

9556da2 [Cheng Pan] Structured Logs
7dc6dda [Cheng Pan] Add log4j-layout-template-json

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
pan3793 committed Dec 25, 2024
1 parent 117e56c commit 14e12e9
Showing 6 changed files with 73 additions and 1 deletion.
1 change: 1 addition & 0 deletions LICENSE-binary
@@ -262,6 +262,7 @@ org.eclipse.jetty:jetty-proxy
org.apache.logging.log4j:log4j-1.2-api
org.apache.logging.log4j:log4j-api
org.apache.logging.log4j:log4j-core
org.apache.logging.log4j:log4j-layout-template-json
org.apache.logging.log4j:log4j-slf4j-impl
org.yaml:snakeyaml
io.dropwizard.metrics:metrics-core
13 changes: 12 additions & 1 deletion conf/log4j2.xml.template
@@ -16,7 +16,7 @@
~ limitations under the License.
-->

-<!-- Provide log4j2.xml.template to fix `ERROR Filters contains invalid attributes "onMatch", "onMismatch"`, see KYUUBI-2247 -->
+<!-- Provide log4j2.xml.template to fix `ERROR Filters contains invalid attributes "onMatch", "onMismatch"`, see KYUUBI #2247 -->
<!-- Extra logging related to initialization of Log4j.
Set to debug or trace if log4j initialization is failing. -->
<Configuration status="INFO">
@@ -57,6 +57,17 @@
</Policies>
<DefaultRolloverStrategy max="10"/>
</RollingFile>
<!-- Kafka appender with Elastic Common Schema(ECS) JSON template layout
<Kafka name="kafka" topic="ecs-json-logs" syncSend="false">
  <JsonTemplateLayout>
    <EventTemplateAdditionalField key="app" value="kyuubi"/>
    <EventTemplateAdditionalField key="cluster" value="kyuubi-cluster"/>
    <EventTemplateAdditionalField key="host" value="${hostName}"/>
  </JsonTemplateLayout>
  <Property name="bootstrap.servers" value="kafka-1:9092,kafka-2:9092,kafka-3:9092"/>
  <Property name="compression.type" value="gzip"/>
</Kafka>
-->
</Appenders>
<Loggers>
<Root level="INFO">
1 change: 1 addition & 0 deletions dev/dependencyList
@@ -126,6 +126,7 @@ kubernetes-model-storageclass/6.13.1//kubernetes-model-storageclass-6.13.1.jar
log4j-1.2-api/2.24.2//log4j-1.2-api-2.24.2.jar
log4j-api/2.24.2//log4j-api-2.24.2.jar
log4j-core/2.24.2//log4j-core-2.24.2.jar
log4j-layout-template-json/2.24.2//log4j-layout-template-json-2.24.2.jar
log4j-slf4j-impl/2.24.2//log4j-slf4j-impl-2.24.2.jar
logging-interceptor/3.12.12//logging-interceptor-3.12.12.jar
metrics-core/4.2.26//metrics-core-4.2.26.jar
49 changes: 49 additions & 0 deletions docs/monitor/logging.md
@@ -130,6 +130,55 @@ For example, we can disable the console appender and enable the file appender li

Then everything goes to `log/dummy.log`.

#### Sending Structured Logs to Kafka

The Log4j2 has a built-in [KafkaAppender](https://logging.apache.org/log4j/2.x/manual/appenders/message-queue.html#KafkaAppender)
which allows sending log messages to an Apache Kafka topic with a few configurations, and it also provides a built-in
[JSON Template Layout](https://logging.apache.org/log4j/2.x/manual/json-template-layout.html) that supports encoding
`LogEvents` to structured JSON messages according to the structure described by the provided template.

For example, we can configure the Kyuubi server to send the structured logs to Kafka `ecs-json-logs` topic,

```xml
<Configuration status="INFO">
  <Appenders>
    <Kafka name="kafka" topic="ecs-json-logs" syncSend="false">
      <JsonTemplateLayout>
        <EventTemplateAdditionalField key="app" value="kyuubi"/>
        <EventTemplateAdditionalField key="cluster" value="kyuubi-cluster"/>
        <EventTemplateAdditionalField key="host" value="${hostName}"/>
      </JsonTemplateLayout>
      <Property name="bootstrap.servers" value="kafka-1:9092,kafka-2:9092,kafka-3:9092"/>
      <Property name="compression.type" value="gzip"/>
    </Kafka>
  </Appenders>
  <Loggers>
    <Root level="INFO">
      <AppenderRef ref="kafka"/>
    </Root>
  </Loggers>
</Configuration>
```

And each structured log message looks like,

```json
{
  "@timestamp": "2024-12-24T18:53:01.030Z",
  "ecs.version": "1.2.0",
  "log.level": "INFO",
  "message": "Service[KyuubiServer] is started.",
  "process.thread.name": "main",
  "log.logger": "org.apache.kyuubi.server.KyuubiServer",
  "app": "kyuubi",
  "cluster": "kyuubi-cluster",
  "host": "hadoop-master1.orb.local"
}
```

Note: this feature may require additional jars to work. Please read the Log4j2 docs and ensure those jars are
on the Kyuubi server's classpath before enabling it.

## Logs of Spark SQL Engine

Spark SQL Engine is one type of Kyuubi Engines and also a typical Spark application.
5 changes: 5 additions & 0 deletions kyuubi-assembly/pom.xml
@@ -119,5 +119,10 @@
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-1.2-api</artifactId>
</dependency>

<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-layout-template-json</artifactId>
</dependency>
</dependencies>
</project>
5 changes: 5 additions & 0 deletions pom.xml
@@ -663,6 +663,11 @@
<artifactId>log4j-1.2-api</artifactId>
<version>${log4j.version}</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-layout-template-json</artifactId>
<version>${log4j.version}</version>
</dependency>

<dependency>
<groupId>io.dropwizard.metrics</groupId>