Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removes apache http libraries and replace with azure core httpclient #335

Merged
merged 55 commits into from
Mar 24, 2024

Conversation

The-Funk
Copy link
Contributor

@The-Funk The-Funk commented Dec 10, 2023

This PR is ready for review. Updated Notes 3/2/2024.

Notes:

  • The removal of the entity stuff was difficult as expected. There was a lot of use of Entity and a lot of use of the URIBuilder from Apache core. I simplified this to some extent and I'd like to open a new PR in the future to remove the usage of URIBuilder completely.

  • The old Apache Client comes with an exception called ParseException that was being used to indicate parsing failures. The version of said exception from the V4 client library, extends RuntimeException. To replace it I created a KustoParseException that extends RuntimeException for the sake of compatibility. This should work as a drop in replacement. After refactoring, this class is only used in TESTS now.

  • Rather than using the Netty HttpClient implementation from core, I used the HttpClient interface itself. This way, if someone wants to replace the Netty client they can do so easily by adding a different classname to their GlobalConfig or excluding the Netty client jar from their dependencies and replacing it with their own jar. For instance, I have a personal interest in utilizing the azure-core-http-vertx client once it comes out of Beta. The vertx client better supports native image building with GraalVM. The Netty client implementation is the default implementation included with azure core. We should avoid using anything with Netty specifically in the name in order to avoid becoming locked to the Netty client.

  • Moved all request building logic (some from HttpPostUtils and some from ClientImpl) to a request builder class in order to provide better testability and separation of concerns and reduce the number of parameters that need passed between methods.

  • Moved all HTTP Request Tracing parameters to a specific builder class for better testability, separation of concerns, etc, same as above.

  • Moved remainder of HttpPostUtils into a BaseClient class that the ClientImpl extends, removing the need to pass the HTTP client to a separate class and promoting long lived HTTP Clients.

  • I deleted the HttpResponseWrapper. Since the Apache client is removed, this class should no longer be needed.

BREAKING CHANGES:

  • In the ClientRequestProperties class the proxy option used HttpHost from the Apache libraries directly. This is exposed directly to the users of the library, so in order to remove the Apache dependencies I needed to remove HttpHost. I replaced this with the ProxyOptions from Azure core but this would be a breaking change for users of the library that specify a custom proxy. They would need to rework a few lines of code, no more than 3 or 4 max.
  • HttpClientFactory now accepts any client implementing azure-core's HttpClient interface instead of an Apache CloseableHttpClient.

In Tests:

  • The UtilitiesTest in the data module made use of BasicHttpResponse from the Apache libraries. Because the HTTPResponse class from Azure core is an abstract class, creating new instances of it manually is not simple. This is probably by design. But in the event that we want to create a fake HTTPResponse, for instance in tests, we need a way to do so. There is a utility class in Tests now that will allow for the creation of fake HttpResponses. I created a simple implementation for it including only what had been used previously, and a builder to make it easier to instantiate new instances.

  • CloudInfo tests now use the RequestBuilder

@ohadbitt
Copy link
Collaborator

ohadbitt commented Dec 13, 2023

Hey
Sounds like a very well thought process really like the usage of HttpClient - looks like a great fit.

  • Reactor is pretty much the same as mutiny (not familiar with the latter but it looks pretty similar approach ) but i could help with overcoming any problems you got trying to use it - you intend that we overcome the Todos in this PR ? As my intentions were that the Async part will be handled in the next PR and now only have simple sync impl - please tell me which part would you like me to take in this PR i would do it first thing this week or the next.
  • ProxyOptions is a very good implementation that we want anyway.
  • Regarding closeable - IngestClient has to be closeable, and only the STreamingIngestClient will have it as a noOp - implemented this part now while looking at the code

@The-Funk
Copy link
Contributor Author

@ohadbitt

You're right about the closeable.

I like your plan of a separate PR to work on the async implementation. That's a lot more changes.

There's also one or two other design things that I think might make sense while we're performing surgery around the HTTP Clients. I think it would be nice to get rid of HttpPostUtils and move the post methods onto an abstract class that all of the KustoClients should extend. That way the enclosed HTTP client can be a protected property of the parent abstract class, as well as the post methods. That would mean less passing around of the HTTP client, meaning shorter method signatures. I saw there was also a comment about making the base URL string a class property that can be set one time upon KustoClient creation and then reused. I liked that idea as well.

@ohadbitt
Copy link
Collaborator

@The-Funk Sounds good - maybe better @AsafMah will look as well at all the planning

@AsafMah
Copy link
Contributor

AsafMah commented Dec 14, 2023

I'm not sure about KustoParseException extending RuntimeException, don't we want our exceptions to be checked?

@ohadbitt
Copy link
Collaborator

I too am against RuntimeExceptions - @The-Funk , sounds like you should go ahead with the other changes, when ready hit the "ready for review"

@The-Funk
Copy link
Contributor Author

The-Funk commented Dec 14, 2023

Speaking of unchecked exceptions, I had extra throws for that exception that I didn't need, I just pushed that.

In this case, I think KustoParseException made sense because the original ParseException that needed replaced was unchecked, so to get the same behavior without changing the method signatures, I replaced it with a new unchecked exception.

If you would like I could replace it with the new Apache HTTP core's ParseException which is a checked exception in V5.

Setting ready for review now.

@The-Funk The-Funk marked this pull request as ready for review December 14, 2023 13:49
Copy link
Collaborator

@ohadbitt ohadbitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through it all. Main issue is testing this properly, we need some long query, big data, multi threaded stress tests to really know we are ok here as we never had such a change. Will you take this as well ?

if (errorFromResponse.isEmpty()) {
errorFromResponse = response.getStatusLine().getReasonPhrase();
// Fixme: Missing reason phrase to add to exception. Potentially want to use an enum.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be part of the response - if its not i don't think we have to much to add

@@ -67,7 +65,7 @@ public class ResourceManager implements Closeable, IngestionResourceManager {
private IngestionResource<QueueWithSas> successfulIngestionsQueues;
private IngestionResource<QueueWithSas> failedIngestionsQueues;

ResourceManager(Client client, long defaultRefreshTime, long refreshTimeOnFailure, @Nullable CloseableHttpClient httpClient) {
ResourceManager(Client client, long defaultRefreshTime, long refreshTimeOnFailure, @Nullable HttpClient httpClient) {
this.defaultRefreshTime = defaultRefreshTime;
this.refreshTimeOnFailure = refreshTimeOnFailure;
this.client = client;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need the HttpClientWrapper(httpClient) here :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ohadbitt you want me to use the HttpClientWrapper here?

-Adds custom HTTP status codes
-Integrates status codes
-Uses Context.NONE static factory
@The-Funk
Copy link
Contributor Author

The-Funk commented Dec 22, 2023

@ohadbitt For the const values, I created a specific class called HttpStatus in the http package and put const values in it.

The reason is, the other status codes constants come from the Apache library, and if we want to remove even the core library eventually, we'll need to migrate away from the Apache codes.

Those same Apache status codes are provided as a transitive dependency of another dependency in the project (nimbus) but it probably wouldn't be a good idea to rely on them being there since we wouldn't be including the codes directly anymore if we get rid of Apache HttpCore.

…ects in QueuedIngestClientImpl per review comments
@The-Funk
Copy link
Contributor Author

The-Funk commented Dec 22, 2023

Went through it all. Main issue is testing this properly, we need some long query, big data, multi threaded stress tests to really know we are ok here as we never had such a change. Will you take this as well ?

@ohadbitt

I have a development cluster that I can run some long running queries against. I can also try some test ingestions as well. I usually ingest data as JSON though. The part that worries me is ingestions involving file sources and streams.

For scale I can use JMeter and compare/contrast my service RAM/CPU with the current version of the library.

I don't know if I can give you really good in-depth insights though. That might be beyond me.

removes Entity references
@The-Funk
Copy link
Contributor Author

The-Funk commented Mar 2, 2024

@AsafMah I think I addressed most points. All tests that I can run appear to be passing. I have not loaded in OkHttp but if you would like me to remove Netty and add OkHttp now, I can.

@The-Funk The-Funk requested a review from AsafMah March 2, 2024 02:08
Copy link
Collaborator

@ohadbitt ohadbitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets go :)
@The-Funk ill make them pass?

Copy link
Contributor

@AsafMah AsafMah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's resolve these and finally close this.

@The-Funk The-Funk requested review from AsafMah and ohadbitt March 14, 2024 00:43
@The-Funk
Copy link
Contributor Author

FYI @AsafMah @ohadbitt I fully expect that with this big of a change there will come challenges and potentially regressions. I am willing to help if issues arise.

Copy link
Contributor

@AsafMah AsafMah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patience, approved!

@AsafMah AsafMah merged commit 0057e8f into Azure:master Mar 24, 2024
3 of 7 checks passed
@AsafMah
Copy link
Contributor

AsafMah commented Mar 24, 2024

@The-Funk

What's next from here?

Are you going to work on the async api?

@ohadbitt
Copy link
Collaborator

Thank you @The-Funk for your contribution,
As Asaf mentioned- you did want to impl the async api as well, if its not relevant anymore do tell :)

@The-Funk
Copy link
Contributor Author

@The-Funk

What's next from here?

Are you going to work on the async api?

Indeed!

AsafMah added a commit that referenced this pull request Jun 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants