diff --git a/docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md b/docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
index 23b1e8ec24c7c..d8861641a25dc 100644
--- a/docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
+++ b/docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
@@ -393,7 +393,7 @@ If you observe unexpected behavior, manually specify the return type using the `
 #### Serialization of POJO types
 
 The `PojoTypeInfo` is creating serializers for all the fields inside the POJO. Standard types such as
-int, long, String etc. are handled by serializers we ship with Flink.
+int, long, String, etc. are handled by serializers we ship with Flink.
 For all other types, we fall back to [Kryo](https://github.com/EsotericSoftware/kryo).
 
 If Kryo is not able to handle the type, you can ask the `PojoTypeInfo` to serialize the POJO using [Avro](https://avro.apache.org).
@@ -429,6 +429,15 @@ or via user-defined custom serializers. To do that, set:
 pipeline.generic-types: false
 ```
 
+{{< hint warning >}}
+Note that Kryo will deserialize any class on the classpath
+* when `pipeline.generic-types: true`
+* the Flink job has a type definition such as `DataStream`, or `DataStream>`.
+
+A malicious actor who knows the Flink job and controls the data input to Kryo will be able to instantiate classes which are
+not intended for instantiation.
+{{< /hint >}}
+
 An exception will be raised whenever a data type is encountered that would go through Kryo.
 
 ## Defining Type Information using a Factory
diff --git a/docs/content.zh/docs/ops/production_ready.md b/docs/content.zh/docs/ops/production_ready.md
index bb07d506bb987..de3bd6bbdc83f 100644
--- a/docs/content.zh/docs/ops/production_ready.md
+++ b/docs/content.zh/docs/ops/production_ready.md
@@ -81,6 +81,10 @@ It is a single point of failure within the cluster, and if it crashes, no new jo
 
 Configuring [High Availability]({{< ref "docs/deployment/ha/overview" >}}), in conjunction with Apache Zookeeper or Flinks Kubernetes based service, allows for a swift recovery and is highly recommended for production setups.
 
+### Harden Kryo Serialization
+
+[Disable support for generic Kryo types]({{< ref "docs/dev/datastream/fault-tolerance/serialization/types_serialization" >}}#disabling-kryo-fallback), as this is a security and performance concern.
+
 ### Secure Flink Cluster Access
 
 Flink is intentionally designed to support remote code execution.
diff --git a/docs/content/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md b/docs/content/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
index 532e07ab3c286..d8b439aec384e 100644
--- a/docs/content/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
+++ b/docs/content/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
@@ -394,7 +394,7 @@ If you observe unexpected behavior, manually specify the return type using the `
 #### Serialization of POJO types
 
 The `PojoTypeInfo` is creating serializers for all the fields inside the POJO. Standard types such as
-int, long, String etc. are handled by serializers we ship with Flink.
+int, long, String, etc. are handled by serializers we ship with Flink.
 For all other types, we fall back to [Kryo](https://github.com/EsotericSoftware/kryo).
 
 If Kryo is not able to handle the type, you can ask the `PojoTypeInfo` to serialize the POJO using [Avro](https://avro.apache.org).
@@ -430,6 +430,15 @@ or via user-defined custom serializers. To do that, set:
 pipeline.generic-types: false
 ```
 
+{{< hint warning >}}
+Note that Kryo will deserialize any class on the classpath
+ * when `pipeline.generic-types: true`
+ * the Flink job has a type definition such as `DataStream`, or `DataStream>`.
+
+A malicious actor who knows the Flink job and controls the data input to Kryo will be able to instantiate classes which are
+not intended for instantiation.
+{{< /hint >}}
+
 An exception will be raised whenever a data type is encountered that would go through Kryo.
 
 ## Defining Type Information using a Factory
diff --git a/docs/content/docs/ops/production_ready.md b/docs/content/docs/ops/production_ready.md
index c20cf6a5aed2f..24522402e5983 100644
--- a/docs/content/docs/ops/production_ready.md
+++ b/docs/content/docs/ops/production_ready.md
@@ -81,6 +81,10 @@ It is a single point of failure within the cluster, and if it crashes, no new jo
 
 Configuring [High Availability]({{< ref "docs/deployment/ha/overview" >}}), in conjunction with Apache Zookeeper or Flinks Kubernetes based service, allows for a swift recovery and is highly recommended for production setups.
 
+### Harden Kryo Serialization
+
+[Disable support for generic Kryo types]({{< ref "docs/dev/datastream/fault-tolerance/serialization/types_serialization" >}}#disabling-kryo-fallback), as this is a security and performance concern.
+
 ### Secure Flink Cluster Access
 
 Flink is intentionally designed to support remote code execution.