Skip to content
This repository has been archived by the owner on Jul 22, 2021. It is now read-only.

NIFIREG-415 Add Support for Unicode in X-ProxiedEntitiesChain #302

Closed

Conversation

kevdoran
Copy link
Contributor

@kevdoran kevdoran commented Sep 11, 2020

  • Adds detection and encoding of non-ascii characters to creation of chain
  • Adds unit tests and integration tests that use proxied entities with Unicode
  • Refactors ProxiedEntityRequestConfig to compute header value in constructor

Overview of these changes.

This changes the creation the X-ProxiedEntitiesChain header values to:

  1. Detect non-ascii characters in an input identity/DN
  2. If non-ascii characters are present, base64 encode the value and wrap it in additional <angle braces>

For example, a proxy chain of

Алйс, CN=nifi.apache.org

Will result in:

X-ProxiedEntitiesChain: <<0JDQu9C50YE=>><CN=nifi.apache.org>

Additionally, logic for tokenizing these header values was updated to:

  1. Detect if a token is encoded (by the presence of additional <angle braces>)
  2. If encoded, decode it as part of the tokenization process.

These changes should be completely backwards compatible, as entities containing only ascii characters are encoded exactly the same as they previously were. (Previously, entities containing non-ascii characters would not survive a round trip hop between nodes, as per NIFI-7744, so I am not worried about that case.)

A number of implementation approaches were considered, including ones based on RFC-2231 and RFC-8187. Ultimately, approaches that relied on percent encoding were rejected as percent-encoding outside of URL encoding are not widespread in standard libraries and difficult to implement correctly given complexity of unicode and things such as surragote characters. Given that in theory, third-party components such as proxies may need to implement this logic, base64 was chosen as is it widely available in all languages and frameworks (particularly in reverse proxies, where base64 is used as part of basic auth header formation).

The use of double <<angle braces>> to indicate encoding was chosen as it allows for an easy way for a decoder to determine if the value is encoded (necessary because encoding is optional for purely ascii entities), and because the reserved characters < and > are already protected/escaped in our sanitization process.

I considered but decided against always base64 encoding the header value, because (1) this maintains backwards compatibility between nifi and registry on different versions and (2) plaintext, non-encoded values, which are still the majority of use-cases, are easier for users to troubleshoot.

If this approach is satisfactory, let me know and I will make corresponding changes in NiFi. As it stands, I will have to re-implement most of this in the NiFi. Another option would be extracting most of this logic into a more sharable module in NiFi registry (or move to nifi-registry-utils perhaps), so that it could be imported and used by NiFi. I decided against this for now, as ultimately this is a good candidate for nifi-standard-libs and refactoring short-term would be more messy/risky (the current implementation in nifi-registry-security-utils requires apache commons which nifi-registry-utils does not have).


Thank you for submitting a contribution to Apache NiFi Registry.

Please provide a short description of the PR here:

Description of PR

Enables X functionality; fixes bug NIFIREG-YYYY.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with NIFIREG-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not squash or use --force when pushing to allow for clean monitoring of changes.

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi-registry folder?
  • Have you written or updated unit tests to verify your changes?
  • Have you verified that the full build is successful on JDK 8?
  • Have you verified that the full build is successful on JDK 11?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-registry-assembly?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-registry-assembly?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.

- Adds detection and encoding of non-ascii characters to creation of chain
- Adds unit tests and integration tests that use proxied entities with Unicode
- Refactors ProxiedEntityRequestConfig to compute header value in constructor
@kevdoran kevdoran force-pushed the NIFIREG-415-unicode-proxy-identities branch from 8bbb417 to 0afa3a1 Compare September 14, 2020 18:05
@kevdoran
Copy link
Contributor Author

Rebased on main to (hopefully) fix GHA CI builds

@kevdoran
Copy link
Contributor Author

Thanks for the review, @alopresto. I've updated my PR to incorporate your feedback. Turns out the weird handling of leading/trailing </> was to properly tokenize the anonymous proxied entity <>, so I had to account for that in order to apply your simplification, but I think that is more readable and explicit this way. I also added some test cases that specifically cover that.

@kevdoran kevdoran force-pushed the NIFIREG-415-unicode-proxy-identities branch from 0b68d02 to fd000b4 Compare September 24, 2020 14:03
@kevdoran
Copy link
Contributor Author

Whoops, missed some changes in my last commit. Updated and should be good now.

@kevdoran
Copy link
Contributor Author

@alopresto I know you are busy with other items, but if you have a chance in the next couple weeks, would be great to get a sign-off on this

@bbende
Copy link
Contributor

bbende commented Oct 13, 2020

Updates look good, all integration tests passing locally, going to merge, thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants