Add gzip compression on reaper report #242

Open
wants to merge 1 commit into
base: main
@@ -23,12 +23,14 @@ import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import okhttp3.Response
import okhttp3.internal.closeQuietly
+import java.io.ByteArrayOutputStream
import java.io.DataInputStream
import java.io.EOFException
import java.io.File
import java.io.FileInputStream
import java.io.FileOutputStream
import java.io.IOException
+import java.util.zip.GZIPOutputStream
import kotlin.coroutines.resumeWithException

internal class ReaperReportUploadWorker(
Contributor

I thought sorting the records might help with compression, and it does, but not as much as I had hoped:

$ # Without sorting:
$ python3 -c 'import json; import random; import base64; print(json.dumps([base64.b64encode(random.randbytes(8)).decode("utf8") for i in range(100000)]))' | gzip --verbose >/dev/null
 41.7%
$ # With sorting:
$ python3 -c 'import json; import random; import base64; print(json.dumps(sorted([base64.b64encode(random.randbytes(8)).decode("utf8") for i in range(100000)])))' | gzip --verbose >/dev/null
 46.3%

(The number is the percentage size reduction reported by gzip --verbose.)

So maybe not worth it.
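
For reference, the same experiment can be reproduced on the JVM, closer to what the worker would run. This is only a rough sketch: `randomRecords` and `gzipSize` are hypothetical helpers that are not part of this PR, and the hand-built JSON array of Base64 strings stands in for the real report format.

```kotlin
import java.io.ByteArrayOutputStream
import java.util.Base64
import java.util.zip.GZIPOutputStream
import kotlin.random.Random

// Hypothetical helper: `count` random 8-byte values, Base64-encoded,
// mirroring the Python one-liner above.
fun randomRecords(count: Int, random: Random = Random.Default): List<String> =
    List(count) { Base64.getEncoder().encodeToString(random.nextBytes(8)) }

// Hypothetical helper: serialize the records as a JSON array of strings and
// return the gzip-compressed size in bytes. Base64 text needs no JSON escaping.
fun gzipSize(records: List<String>): Int {
    val json = records.joinToString(prefix = "[", postfix = "]") { "\"$it\"" }
    val buffer = ByteArrayOutputStream()
    // Closing the GZIP stream via use() flushes the trailer into the buffer.
    GZIPOutputStream(buffer).use { it.write(json.toByteArray(Charsets.UTF_8)) }
    return buffer.size()
}
```

Calling `gzipSize(records)` and `gzipSize(records.sorted())` on the same list gives the two sizes to compare. The exact numbers vary from run to run because the input is random.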

@@ -175,11 +177,14 @@ internal class ReaperReportUploadWorker(
stream.write(reportString.encodeToByteArray())
}

+val compressedReport = gzipCompressReport(reportString)

val url = "$baseUrl/report"
val request = Request.Builder().apply {
header("Authorization", "Bearer $apiKey")
+header("Content-Encoding", "gzip")
url(url)
-post(reportString.toRequestBody("application/json; charset=utf-8".toMediaTypeOrNull()))
+post(compressedReport.toRequestBody("application/json; charset=utf-8".toMediaTypeOrNull()))
Contributor

I think okhttp has a method for this: https://square.github.io/okhttp/5.x/okhttp/okhttp3/-request-body/-companion/gzip.html

Something like this, I think:

val body = reportString.toRequestBody("application/json; charset=utf-8".toMediaTypeOrNull())
val request = Request.Builder().apply {
  header("Authorization", "Bearer $apiKey")
  header("Content-Encoding", "gzip")
  url(url)
  post(body.gzip())
}.build()

Contributor Author

It appears to be only in the 5.0 snapshot, and we're on 4.11. I'll see if 4.x has an equivalent, as I'd prefer not to depend on a snapshot release.

Contributor Author

Follow-up: I did some reading, and despite its snapshot status, 5.x sounds quite stable, with no breaking changes.

We need to land backend support for this, but I can follow up by updating OkHttp to 5.x and using the helper function.
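
On the receiving side, backend support amounts to decompressing the body when the Content-Encoding header is gzip. A minimal JVM sketch of that idea follows; `decodeReportBody` is a hypothetical helper, and the real backend, its language, and its framework are not shown in this PR. `gzipCompress` repeats the approach of `gzipCompressReport` only so the sketch is self-contained.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Client side: same approach as gzipCompressReport in this PR.
fun gzipCompress(text: String): ByteArray {
    val buffer = ByteArrayOutputStream()
    GZIPOutputStream(buffer).use { it.write(text.toByteArray(Charsets.UTF_8)) }
    return buffer.toByteArray()
}

// Hypothetical server-side helper: returns the request body as text,
// decompressing it only when the Content-Encoding header value is gzip.
fun decodeReportBody(body: ByteArray, contentEncoding: String?): String =
    if (contentEncoding.equals("gzip", ignoreCase = true)) {
        GZIPInputStream(ByteArrayInputStream(body)).use {
            it.readBytes().toString(Charsets.UTF_8)
        }
    } else {
        // No (or another) encoding: treat the body as plain UTF-8 JSON.
        body.toString(Charsets.UTF_8)
    }
```

Falling back to the plain body when the header is absent would let clients that have not picked up this change keep uploading uncompressed reports, assuming the backend wants to support both.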

}.build()

client.newCall(request).executeAsync().use { response ->
@@ -196,6 +201,14 @@ internal class ReaperReportUploadWorker(
}
}

+private fun gzipCompressReport(reportString: String): ByteArray {
+    val byteArrayOutputStream = ByteArrayOutputStream()
+    GZIPOutputStream(byteArrayOutputStream).use { gzip ->
+        gzip.write(reportString.toByteArray())
+    }
+    return byteArrayOutputStream.toByteArray()
+}

@OptIn(ExperimentalCoroutinesApi::class)
private suspend fun Call.executeAsync(): Response = suspendCancellableCoroutine { continuation ->
continuation.invokeOnCancellation {