Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use mime type from config field in Elasticsearch indexer while posting HTTP request #41

Merged
merged 10 commits into from
May 8, 2018

Conversation

VamanKulkarni
Copy link

## Problem:
Due to this bug in Logstash itself, build logs which are sent to Logstash over HTTP(logstash-input-http) using Elasticsearch indexer gets indexed as plain text instead of JSON.

This plugin uses ContentType.APPLICATION_JSON which sets the Content-Type header to "application/json; charset=UTF-8" and that is not recognized by Logstash and treats the json string as plain text.

## What has been done in this PR
Introduced (as per @jakub-bochenski suggestion) a new configuration field for Mime type which gives ability for users to override Mime type. It defaults to application/json. This mime type is then set as request header inElasticSeachDAO

Copy link

@jakub-bochenski jakub-bochenski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add at least one unit test covering this code, other than that looks good on first look

@VamanKulkarni
Copy link
Author

Thanks @jakub-bochenski .
I came across a problem while trying to write unit test.
My commit introduces
Descriptor logstashPluginConfig = (Descriptor) Jenkins.getInstance().getDescriptor(LogstashInstallation.class);
to the ElasticSearchDao.getHttpPost() method.
Mocking Jenkins.getInstance() is not possible with mockito. This page suggests to use PowerMock for mocking static methods. However, powermock has issues with mockito > 2.0. (powermock/powermock#681).

You may be already aware of this problem. Do you have a way out?

@jakub-bochenski
Copy link

Apparently mockito can do it: #37 (review)

On second thoughts maybe it would be better to add this setting to the publisher/wrapper (per job) instead of doing this globally for the whole jenkins instance

@VamanKulkarni
Copy link
Author

@jakub-bochenski I am working on your suggestion to add mimeType setting to wrapper.
Need some inputs from you as I am pretty new to plugin development.
I declared instance variable mimeType in LogstashBuildWrapper class and able to populate it with the value specified from UI. However, I am yet to figure out how to pass on/access this variable in ElasticSearchDao so that I can set it as content type to the post request. Would it be possible for you to provide some pointers around this? Meanwhile, I will continue looking. Thanks!

@jakub-bochenski
Copy link

jakub-bochenski commented Dec 20, 2017

However, I am yet to figure out how to pass on/access this variable in ElasticSearchDao

In general we don't have a mechanism to pass backend-specific options from the job config to the dao layer.
For the record we don't have a facility to pass this from the global config either, hence e.g. you need to put the scheme and host into the host field for Elastisearch backend type.

I think for this we need to go through the LogstashWriter and extend IndexerDaoFactory.getInstance to accept backend-specific configuration settings.

I'd refactor LogstashWriter to not call IndexerDaoFactory.getInstance directly and instead accept a ready instance of the IndexerDao.

Given the scope of refactoring needed I'm OK with doing this in global config as you did originally

@VamanKulkarni
Copy link
Author

Apparently mockito can do it: #37 (review)

@jakub-bochenski May be I am missing something bigtime but, I checked #37 and it is used to mock a final method. I didn't find any examples of MockMaker for mocking static methods. If you have any pointer to such example handy let me know. :)

If mocking static method is not too straight forward then, I am thinking to overload getHttpPost with getHttpPost(String data, String mimeType) and use this overloaded method for unit testing. Feel free to suggest.

@jakub-bochenski
Copy link

@VamanKulkarni my bad.
There is another PR from @mwinter #38 and it adds powermock. Maybe he can comment in detail here, but I assume it means there is a workable solution for static mocks.

Still, if we can avoid mocking static methods by doing some refactoring I'd prefer that (if the effort to refactor is not big).

@VamanKulkarni
Copy link
Author

@jakub-bochenski I've refactored code to make it little more unit test'ble and avoided mocking static method within unit tests. Take a look at the changes when you get a chance and let me know if you have any comments.

@@ -22,6 +22,10 @@
checkUrl="'${resURL}/descriptorByName/LogstashInstallation/checkString?value='+escape(this.value)" />
</f:entry>
<f:advanced>
<f:entry title="${%MIME type}" field="mimeType">
<f:textbox value="${descriptor.mimeType}" default="application/json"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if the existing installations will get null or the default value.
Can we set the default to match the previous value?

Copy link
Author

@VamanKulkarni VamanKulkarni Dec 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I understand that correctly. Even in case of null, Content-Type will be set to ContentType.APPLICATION_JSON. Default value is no different then the previous one it is just that it is set in different way :) i.e. the default value is set to application/json and the String entity is initialized with charset=UTF-8.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I get it now.
Can we make it more explicit in the description that the mimeType input doesn't accept any parameters (incl. charset)?

return FormValidation.error(Messages.ValueIsRequired());
}
// Mime type validation as per RFC-4288.
if (!value.matches("^[\\w#&+_$.\\-\\^]+/[\\w#&+_$.\\-\\^]+$")) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did think of this earlier. But I did not end up using it in my code because I realized it does not actually parse Mime-Type. for e.g. ContentType.parse("application/json%%") (which is clearly not a valid mime-type) does not complain instead it returns ContentType object with this invalid mimetype.
For this reason I did not use it in the code and ended up validating by my own as per RFC-4288. This would potentially eliminate unwanted surprises that could cause with invalid mime-type.

Copy link

@jakub-bochenski jakub-bochenski Dec 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I assumed

    ParseException - if the given text does not represent a valid Content-Type value.

means it does validate it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just validates Content-Type as a whole and does not look for mime-type validity.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use javax.activation.MimeType
This does validate the mimetype. Though it seems & is a valid character in a mimetype. Mimetype fails when you use §
see RFC 2045

@jakub-bochenski
Copy link

@VamanKulkarni can you check if it would be possible to integrate this changes into the #43 refactoring?

@VamanKulkarni
Copy link
Author

@jakub-bochenski Okay. I shall take a look at integrating my changes with #43 . I am not sure about the process though. Do I take #43 changes on to my fork, then integrate and update this PR or do I update the PR #43 itself with my changes or is there any other way?

@jakub-bochenski
Copy link

@VamanKulkarni I'd suggest a new PR forking #43
It seems like a simple addition of a new property to https://github.com/jenkinsci/logstash-plugin/pull/43/files#diff-ebc0a0ba452d2cfc22ad4a3b6644508c and the related jelly.

@VamanKulkarni
Copy link
Author

@jakub-bochenski : I shall wait until changes in PR #43 are finalized and merged just to avoid any possibility of rework that could arise out of reviewing #43. :)

@jakub-bochenski
Copy link

@VamanKulkarni #43 is now merged

@VamanKulkarni
Copy link
Author

@jakub-bochenski I've updated PR with the changes as per new code refactoring. Please take a look.

@jakub-bochenski
Copy link

LGTM
@mwinter69 would you mind taking a look at those changes?

BTW sorry about not responding lately but I have some health issues. Hopefully it's nothing serious, still I won't be able to put in much work for now

@mwinter69
Copy link

@jakub-bochenski get well soon

Does it make sense to allow any mimetype other than application/json?
Couldn't we just stick to making sure that ContentType is application/json and set the Charset to UTF-8?

StringEntity input = new StringEntity(data, StandardCharsets.UTF_8); input.setContentType(ContentType.APPLICATION_JSON.getMimeType());

@VamanKulkarni
Copy link
Author

@jakub-bochenski Hope you are getting better :)
@mwinter69 My initial proposal was exactly the same. @jakub-bochenski and I had a discussion over this and decided to go with making it configurable. Please take a look at the comments section https://wiki.jenkins.io/display/JENKINS/Logstash+Plugin

@jakub-bochenski
Copy link

jakub-bochenski commented Feb 26, 2018

Thanks for your consideration :)

I admit this might violate YAGNI, however my 2 top arguments for adding this option are:

  1. this change should be opt it, preserving the previous behavior by default,
  2. if we need at least a on/off toggle because of 1. why not add an input text to make the solution more robust, it doesn't complicate the code much

String mimeType = this.getMimeType();
// char encoding is set to UTF_8 since this request posts a JSON string
StringEntity input = new StringEntity(data, StandardCharsets.UTF_8);
mimeType = (mimeType != null) ? mimeType : ContentType.APPLICATION_JSON.toString();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mimeType will never be null. If you save the global configuration with no content for the mimeType you will get an empty string. Only when you update the plugin without ever saving the configuration it will be null.

@@ -163,5 +178,17 @@ public FormValidation doCheckUrl(@QueryParameter("value") String value)
}
return FormValidation.ok();
}
public FormValidation doCheckMimeType(@QueryParameter("value") String value) {
if (StringUtils.isBlank(value)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a blank mimeType should be valid. In that case it should be set to null in the DAO so you have the old behaviour. This should be mentioned in the help.
You can also save with a blank mimeType which would result in no mimeType being configured in the StringEntity. Not sure if this will break anything,

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwinter69 IMHO, I think we should not allow user to save config with blank mimeType and it would not be of much help/value addition if we do not configure mimeType on StringEntity considering the nature of elasticsearch and logstash. I think we should set application/json by default which will make both ES and logstash happy.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also fail when you try to save with an empty string? If not then you can run into problems at runtime as you will pass an empty string. Maybe use Util.fixEmptyAndTrim in the DAO to avoid this or do not allow to save with an illegal mimetype.

Copy link
Author

@VamanKulkarni VamanKulkarni Mar 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwinter69 Apologies for delays in response. Recovering from bad health. Correct me if I mis-understand understand your question. As per current logic, user is not allowed to save an empty string for mimeType. It doesn't displays an error if user tries to save empty/invalid mimeType.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go to the configuration page and make the mimetype field empty. When you leave the field it will tell you that a value is required. Now click on apply. Congratulations you have set the mimetype to an empty string which will most likely fail.
You should throw an exception in the setter method if no valid mimetype is given.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking into this it seems that Stapler is swallowing the exception and only logging when you try to use a setter method,
By initializing the mimetype in the constructor, and throwing the exception there you can avoid that it gets saved.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the inputs. I am hoping that I understood your suggestion correctly. Here is what I did. I initialized mimeType in ElasticSearchDao() constructor and if it is invalid, thrown an exception just like how its done for uri.toURL(); currently. However, I realized that ElasticSeachDao is not instantiated every time we click on "Save" button on jenkins configuration page. Hence throwing exception in side ElasticSearchDao() did not work for me meaning UI does not show any exception. It only throws exception when you run a job.

In an alternative approach and also you mentioned this in one of your earlier comments that we can allow blank mimeType. If the mimeType is blank we can set it to null in ElasticSearch and with current implementation if mimeType is null it will be set to default application/json value. Does this sound acceptable?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dao is the wrong place to do the checks. You have to do it in the constructor of configuration/ElasticSearch. Alternatively one could extend the base class LogstashIndexer with a validate method and call the validate method in the LogstashConfiguration.configure before save()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwinter69 Pushed updated code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakub-bochenski Updated code with your review comments

@@ -171,6 +172,10 @@ public void migrateData()
@Override
public boolean configure(StaplerRequest staplerRequest, JSONObject json) throws FormException
{

// Keep a reference to the current configuration to revert back in case of an exception.
LogstashIndexer<?> currentIndexer = logstashIndexer;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe previousIndexer would be a better name? "Current" might be understood as referring to the one being processed now


try {
// Validate mimetype and throw exception in case found invalid.
if (logstashIndexer instanceof ElasticSearch) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just declare a noop validate() method on LogstashIndexer and override it in ElasticSearch implementation?

@@ -180,6 +185,19 @@ public boolean configure(StaplerRequest staplerRequest, JSONObject json) throws
{
activeIndexer = logstashIndexer;
}

try {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check need to be done before assigning it to activeIndexer as activeIndexer is the one that is really used.

*
* @throws Exception
*/
public void validate() throws Exception {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] it would be better to have something more specific than Exception in the signature (e.g. ConstraintViolationException); OTOH it would force us to catch and wrap MimeTypeParseException, so I'm not sure if it's worth it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH it would force us to catch and wrap MimeTypeParseException, so I'm not sure if it's worth it

Thats the reason I didn't use specific exception there. In this case I feel we should not bother about the specific exception. If validate() throws any exception we know it is due to invalid indexer config so simply displaying wrapped exception would be sufficient . Let me know if you feel there is a value in that I can make a change.

@VamanKulkarni
Copy link
Author

@jakub-bochenski @mwinter69 Any further comments ?

@jakub-bochenski
Copy link

LGTM

@jakub-bochenski
Copy link

@mwinter69 please let me know if you approve so that I can merge it

Copy link

@mwinter69 mwinter69 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except this minor thing it looks good

@@ -171,15 +172,33 @@ public void migrateData()
@Override
public boolean configure(StaplerRequest staplerRequest, JSONObject json) throws FormException
{

// Keep a reference to the current configuration to revert back in case of an exception.
LogstashIndexer<?> previousIndexer = logstashIndexer;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need this variable, We already have activeIndexer which holds the active configuration and can revert to this.

// Exception will be thrown here so that it gets displayed on UI.
// But before that revert back to original configuration (in-memory)
// so that when user refreshes the configuration page, last saved settings will be displayed again.
logstashIndexer = previousIndexer;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert to activeIndexer

@VamanKulkarni
Copy link
Author

@mwinter69 Updated code with review comments

@jakub-bochenski
Copy link

@VamanKulkarni looks like the tests are failing with NPE now

@VamanKulkarni
Copy link
Author

@jakub-bochenski You are right.. I missed to add a file in the commit. Updated the PR now.

@jakub-bochenski jakub-bochenski merged commit d08c486 into jenkinsci:master May 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants