[pull] master from apache:master #57

Merged 4 commits on Dec 25, 2024

@pull pull bot commented Dec 25, 2024

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)


pan3793 and others added 4 commits December 25, 2024 17:18
### Why are the changes needed?

Kyuubi uses Log4j2 as its logging framework, but the Hive SQL engine module still unexpectedly pulls Log4j 1.2 onto the classpath. We should exclude it to avoid potential issues.

```
build/mvn dependency:tree -pl :kyuubi-hive-sql-engine_2.12
```
```
...
[INFO] +- org.apache.hive:hive-service:jar:3.1.3:provided
[INFO] |  +- org.apache.hive:hive-exec:jar:3.1.3:provided
[INFO] |  |  +- org.apache.zookeeper:zookeeper:jar:3.4.6:provided
[INFO] |  |  |  +- log4j:log4j:jar:1.2.16:provided
...
```
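
For illustration, a Maven exclusion along the following lines would drop the transitive artifact shown in the tree above. This is a minimal sketch, not the actual diff of this commit: placing the exclusion on `hive-service` is an assumption, and the version is assumed to be managed by a parent POM.

```xml
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-service</artifactId>
  <!-- Assumption: version is managed elsewhere; scope matches the tree output above -->
  <scope>provided</scope>
  <exclusions>
    <!-- Exclusions apply transitively, so this also removes the
         Log4j 1.2 jar that arrives via hive-exec -> zookeeper -->
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```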

### How was this patch tested?

Checked that `build/mvn dependency:tree | grep 'log4j:log4j:jar:1.2'` returns nothing, and that GHA passes.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #6859 from pan3793/exclude-log4j1.

Closes #6859

287cf78 [Cheng Pan] Exclude log4j12 from hive engine module classpath

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
### Why are the changes needed?

Backport #5852 to Spark 3.3 so that MaxScanStrategy also supports DataSource V2 (DSv2) there.

### How was this patch tested?

Added some UTs.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6862 from zhaohehuhu/dev-1225.

Closes #6862

c745eda [zhaohehuhu] MaxScanStrategy supports DSv2 in Spark 3.3

Authored-by: zhaohehuhu <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
…erver

### Why are the changes needed?

A common use case is sending the service logs in a structured format to Kafka and then collecting them into a centralized log service for further analysis. Fortunately, Log4j2, the logging framework Kyuubi uses, has a built-in [KafkaAppender](https://logging.apache.org/log4j/2.x/manual/appenders/message-queue.html#KafkaAppender) and [JSON Template Layout](https://logging.apache.org/log4j/2.x/manual/json-template-layout.html), so this can be achieved with just a few configurations.

To simplify user setup, this PR adds `log4j-layout-template-json-<version>.jar` to the Kyuubi binary tarball.

PS: I also plan to support sending engine bootstrap process (e.g. `spark-submit`) logs to Kafka with specific labels in follow-up PRs.

### How was this patch tested?

Manually tested.

Configuration in `$KYUUBI_HOME/conf/log4j2.xml`:
```xml
<Configuration status="INFO">
  <Appenders>
    <Kafka name="kafka" topic="ecs-json-logs" syncSend="false">
      <JsonTemplateLayout>
        <EventTemplateAdditionalField key="app" value="kyuubi"/>
        <EventTemplateAdditionalField key="cluster" value="hadoop-testing"/>
        <EventTemplateAdditionalField key="host" value="${hostName}"/>
      </JsonTemplateLayout>
      <Property name="bootstrap.servers" value="kafka-1:9092,kafka-2:9092,kafka-3:9092"/>
      <Property name="compression.type" value="gzip"/>
    </Kafka>
  </Appenders>
  <Loggers>
    <Root level="INFO">
      <AppenderRef ref="kafka"/>
    </Root>
  </Loggers>
</Configuration>
```

Checked that Kafka receives the expected structured log messages in the Elastic Common Schema (ECS) layout.
![Xnip2024-12-25_03-18-52](https://github.com/user-attachments/assets/e1b5853a-3800-4363-8ce4-7e78d0928c6a)
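
The configuration above produces ECS output because `JsonTemplateLayout` falls back to its bundled ECS template when no template is specified. As a sketch of the same layout with that choice made explicit (this fragment was not part of the tested configuration; it only assumes the `eventTemplateUri` attribute documented for JSON Template Layout):

```xml
<!-- Same layout as above, with the bundled ECS template named explicitly.
     Pointing eventTemplateUri at another template would change the output schema. -->
<JsonTemplateLayout eventTemplateUri="classpath:EcsLayout.json">
  <EventTemplateAdditionalField key="app" value="kyuubi"/>
  <EventTemplateAdditionalField key="cluster" value="hadoop-testing"/>
  <EventTemplateAdditionalField key="host" value="${hostName}"/>
</JsonTemplateLayout>
```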

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #6861 from pan3793/structured-logging.

Closes #6861

9556da2 [Cheng Pan] Structured Logs
7dc6dda [Cheng Pan] Add log4j-layout-template-json

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
### Why are the changes needed?

For my use case, the instances are not human readable, so I prefer to return the FQDN.
<img width="1483" alt="image" src="https://github.com/user-attachments/assets/92045517-456f-4087-8a36-9e3e4bea2f1d" />

### How was this patch tested?

Integration testing.
```
(base) ➜  dist git:(prometheus_label_2) cat conf/kyuubi-defaults.conf
kyuubi.metrics.prometheus.metrics.instance.enabled=true
kyuubi.zookeeper.embedded.client.port.address=localhost
kyuubi.frontend.bind.host=localhost
```

<img width="1692" alt="image" src="https://github.com/user-attachments/assets/0b60d504-62ec-418d-880b-f8a2f00d5550" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #6864 from turboFei/prometheus_label_2.

Closes #6864

d24571c [Wang, Fei] match
6a6a511 [Wang, Fei] comments
c3046d4 [Wang, Fei] save
fb2021a [Wang, Fei] revert
4239594 [Wang, Fei] compatible
17b7007 [Wang, Fei] add instance label

Authored-by: Wang, Fei <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>