Enable serialization output / input stream customization #168

Open
timklge opened this issue Dec 2, 2024 · 8 comments · May be fixed by #169

Comments

@timklge

timklge commented Dec 2, 2024

We have recently considered using Kubernetes Kit for session replication in our project. To minimize changes to our Kotlin Spring Boot codebase, we would like to customize the TransientInjectableObjectOutputStream to exclude all referenced Spring beans from serialization without needing to annotate them with @Transient.

Currently, Kubernetes Kit does not support this because the SessionSerializer directly instantiates the output stream responsible for inspecting transient fields within a private method, preventing us from injecting a custom implementation.

We propose introducing a SerializationStreamFactory bean to obtain output and input stream instances, allowing us to inject our custom implementation. This custom implementation will replace all Spring beans in replaceObject. In our prototype using a forked version of Kubernetes Kit, this approach works well.

If you are open to this change, I will open a PR with the proposed modifications:
feature/serialization-stream-factory
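
To illustrate the replaceObject idea described above, here is a minimal sketch that uses only JDK serialization, not the Kubernetes Kit classes. GreetingService, ViewState, BeanSkippingOutputStream and the isBean predicate are hypothetical names invented for this example; a real implementation would extend the Kit's output stream and detect beans through the Spring context.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.ObjectInputStream
import java.io.ObjectOutputStream
import java.io.OutputStream
import java.io.Serializable

// Hypothetical stand-in for a Spring bean: deliberately not Serializable.
class GreetingService {
    fun greet(name: String) = "Hello, $name"
}

// Session state that references the bean without marking the field @Transient.
class ViewState(val userName: String, val service: GreetingService?) : Serializable

// Object stream that writes null in place of every object matched by isBean.
class BeanSkippingOutputStream(
    out: OutputStream,
    private val isBean: (Any) -> Boolean
) : ObjectOutputStream(out) {
    init {
        // replaceObject is only consulted after it has been enabled.
        enableReplaceObject(true)
    }

    override fun replaceObject(obj: Any?): Any? =
        if (obj != null && isBean(obj)) null else obj
}

// Serializes the state with bean skipping, then reads it back with a plain ObjectInputStream.
fun roundTrip(state: ViewState): ViewState {
    val bytes = ByteArrayOutputStream()
    BeanSkippingOutputStream(bytes) { it is GreetingService }.use { it.writeObject(state) }
    return ObjectInputStream(ByteArrayInputStream(bytes.toByteArray())).use {
        it.readObject() as ViewState
    }
}
```

In this sketch the restored ViewState has a null service, so something still has to re-inject the bean after deserialization.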

@mcollovati
Contributor

May I ask you why a custom implementation of TransientHandler is not enough for your needs?

@timklge
Author

timklge commented Dec 2, 2024

With a custom TransientHandler, lambdas that capture references to unserializable beans still fail to serialize: a SerializedLambda stores its captured arguments in an array field, so the handler cannot inspect and exclude them individually. But maybe I'm missing something there?

@mcollovati
Contributor

I see.
From a Vaadin point of view, lambdas referenced by serializable components should be serializable themselves.
For example, if the lambda is defined in a UI component, you can usually inject the bean into the component, store it in a transient field, and reference that field from the lambda.
See https://vaadin.com/docs/latest/tools/kubernetes/session-replication#unserializable-objects-in-lambdas for an example.

Anyway, perhaps there could be other good use cases for a custom TransientInjectableObjectOutputStream.

timklge linked a pull request Dec 3, 2024 that will close this issue

@timklge
Author

timklge commented Dec 3, 2024

Yes, I think referencing transient fields from lambdas is also recommended in the Vaadin documentation. If the custom serialization approach doesn't work, we'll try that instead.

Additionally, a custom stream factory can be useful for passing another output stream to the ObjectOutputStream, such as a zstd compression stream. I believe Redis lacks built-in compression, and Hazelcast also does not support compression unless Hazelcast handles the serialization itself.

@jorgheymans

The compression part could be interesting for us, as some of our pages add up to 4 MB of data being serialized to the session. Is the compression overhead measurable, or is it nothing to worry about?

@timklge
Author

timklge commented Dec 3, 2024

I believe that the additional time spent compressing sessions with zstd is well spent. In our project, session sizes range from 1 to 2 MB, and serialization itself takes about 50 to 100 ms on my system. I tested two approaches:

  • Using the gzip implementation included in the JDK, which takes approximately 50 ms and compresses sessions to about 350 KB.
  • Using the zstd-jni library, which takes less than 10 ms and compresses sessions to about 300 KB, but introduces an additional third-party dependency and includes native code via JNI.
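
For reference, the gzip variant above can be sketched with JDK classes alone by wrapping the target stream before handing it to the ObjectOutputStream. The function names are made up for this example, and it says nothing about timings or compression ratios, which depend on the session content.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.ObjectInputStream
import java.io.ObjectOutputStream
import java.io.Serializable
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Serializes without compression, for size comparison.
fun serializePlain(value: Serializable): ByteArray {
    val bytes = ByteArrayOutputStream()
    ObjectOutputStream(bytes).use { it.writeObject(value) }
    return bytes.toByteArray()
}

// Serializes through a gzip stream; closing the object stream also
// finishes the gzip trailer on the wrapped stream.
fun serializeCompressed(value: Serializable): ByteArray {
    val bytes = ByteArrayOutputStream()
    ObjectOutputStream(GZIPOutputStream(bytes)).use { it.writeObject(value) }
    return bytes.toByteArray()
}

// Reverses serializeCompressed: decompress first, then deserialize.
fun deserializeCompressed(data: ByteArray): Any? =
    ObjectInputStream(GZIPInputStream(ByteArrayInputStream(data))).use { it.readObject() }
```

Swapping GZIPOutputStream and GZIPInputStream for the zstd-jni stream classes should give the second variant, at the cost of the extra dependency.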

@timklge
Author

timklge commented Dec 3, 2024

With the PR merged, the following would suffice to add session compression to your project:

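```kotlin
import com.github.luben.zstd.ZstdInputStream
import com.github.luben.zstd.ZstdOutputStream
import org.springframework.stereotype.Component
import java.io.InputStream
import java.io.OutputStream
import java.util.function.Predicate
// Kubernetes Kit imports are omitted here: SerializationStreamFactory and the
// Serialization*Stream types come from the proposed PR.

@Component
class CustomStreamFactory : SerializationStreamFactory {
    override fun createOutputStream(
        baseOutputStream: OutputStream,
        transientHandler: TransientHandler,
        injectableFilter: Predicate<Class<*>>
    ): SerializationOutputStream {
        // Compress everything the object stream writes before it reaches the session store.
        val zstdOutputStream = ZstdOutputStream(baseOutputStream)

        return TransientInjectableObjectOutputStream.newInstance(zstdOutputStream, transientHandler, injectableFilter)
    }

    override fun createInputStream(`in`: InputStream, transientHandler: TransientHandler): SerializationInputStream {
        // Decompress on the way back in, mirroring createOutputStream.
        val zstdInputStream = ZstdInputStream(`in`)

        return TransientInjectableObjectInputStream(zstdInputStream, transientHandler)
    }
}
```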

@timklge
Author

timklge commented Dec 4, 2024

Another use case for us is to add a TimeMeasuringStream that sits between the zstdOutputStream and the target base array output stream. It measures the time between its instantiation and closure and records that duration to a Micrometer timer. There are likely better ways to monitor the serialization runtime, but with the PR this is a simple approach.
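
A rough sketch of such a stream, assuming a plain callback in place of the Micrometer timer so that the example stays dependency-free. The class name and the onClosed parameter are hypothetical and not taken from the PR.

```kotlin
import java.io.FilterOutputStream
import java.io.OutputStream
import java.time.Duration

// Pass-through stream that reports how long it was open, measured from
// construction until the first close() call.
class TimeMeasuringOutputStream(
    out: OutputStream,
    private val onClosed: (Duration) -> Unit
) : FilterOutputStream(out) {
    private val startedAt = System.nanoTime()
    private var reported = false

    // FilterOutputStream forwards bulk writes one byte at a time by default;
    // delegate the whole range to the wrapped stream instead.
    override fun write(b: ByteArray, off: Int, len: Int) {
        out.write(b, off, len)
    }

    override fun close() {
        try {
            super.close()
        } finally {
            // Report only once, even if close() is called repeatedly.
            if (!reported) {
                reported = true
                onClosed(Duration.ofNanos(System.nanoTime() - startedAt))
            }
        }
    }
}
```

With Micrometer on the classpath, the callback could hand the duration to a timer, for example via Timer.record(Duration).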
