bug: delegation token parameter should be renamed when using WebHDFS as storage #16742

Open
1 of 2 tasks
jason-heo opened this issue Oct 31, 2024 · 9 comments
Labels
C-bug Category: something isn't working C-want-help Category: want help good first issue Category: good first issue

Comments

@jason-heo

Search before asking

  • I had searched in the issues and found no similar issues.

Version

v1.2.615-a8da519a63(rust-1.81.0-nightly-2024-08-19T03:29:16.412442905Z)

What's Wrong?

Hello, I'm new to Databend.

I'm trying to test Databend with my data on a secure Hadoop cluster.

I've set up a Query node as described in "Deploying with HDFS using WebHDFS".

My current settings are:

[storage.webhdfs]
endpoint_url = "https://<hostname>:<port>"
root = "/path/to"
# if your webhdfs needs authentication, uncomment and set with your value
delegation = "<token>"

When I run INSERT INTO, I get an error:

error: APIError: ResponseError with 3002: PermissionDenied (persistent) at Writer::close, context: {
  uri: https://<hostname>:<port>/webhdfs/v1/path/to/xxx.parquet?op=CREATE&...&delegation_token=<token>,
  response: Parts {
    status: 401,
    version: HTTP/1.1,
    headers: {
      ...
    }
  }
}

I think the delegation_token query parameter should be renamed to delegation.

The Authentication section of the WebHDFS REST API documentation uses delegation, not delegation_token.

I tested with curl and confirmed that the following URL works:

$ curl -i "https://<hostname>:<port>/webhdfs/v1/path/to?op=LISTSTATUS&delegation=<token>"
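
The difference between the failing request and the working curl call is only the name of the query parameter. A minimal sketch of a URL builder that emits `delegation=` follows. The function name and signature are hypothetical and this is not OpenDAL's actual code. It also assumes the token is already URL-safe.

```rust
/// Build a WebHDFS REST URL (illustrative helper, not OpenDAL's implementation).
/// Assumes `token` needs no percent-encoding.
fn webhdfs_url(endpoint: &str, path: &str, op: &str, delegation: Option<&str>) -> String {
    let mut url = format!(
        "{}/webhdfs/v1/{}?op={}",
        endpoint.trim_end_matches('/'),
        path.trim_start_matches('/'),
        op
    );
    // WebHDFS reads the token from the `delegation` query parameter,
    // so an empty or missing token adds nothing to the URL.
    if let Some(token) = delegation.filter(|t| !t.is_empty()) {
        url.push_str("&delegation=");
        url.push_str(token);
    }
    url
}
```

Calling `webhdfs_url("https://<hostname>:<port>", "/path/to", "LISTSTATUS", Some("<token>"))` would produce the same URL shape as the curl command above.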

How to Reproduce?

Described above.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@jason-heo jason-heo added the C-bug Category: something isn't working label Oct 31, 2024
@jason-heo
Author

It seems that this issue is related to OpenDAL.

Once I have checked whether it is reproducible through the OpenDAL API, I'll report it to OpenDAL.

@Xuanwo
Member

Xuanwo commented Nov 1, 2024

Thank you @jason-heo for this. Maybe we can also accept delegation while parsing the URL.

@Xuanwo
Member

Xuanwo commented Nov 1, 2024

This change should be simple: parse delegation as delegation_token.
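
One way to read the suggestion of parsing delegation as delegation_token is to accept either key when reading connection options. The sketch below assumes the options arrive as a string map. The helper name and the preference order are hypothetical, not taken from Databend or OpenDAL.

```rust
use std::collections::HashMap;

/// Hypothetical helper: return the first non-empty token found under
/// `delegation` (preferred) or `delegation_token` (accepted as an alias).
fn delegation_from(conn: &HashMap<String, String>) -> Option<String> {
    ["delegation", "delegation_token"]
        .iter()
        .filter_map(|key| conn.get(*key))
        .find(|value| !value.is_empty())
        .cloned()
}
```

Preferring `delegation` keeps the WebHDFS spelling as the primary one, while the alias keeps configurations that use the other spelling working.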

@Xuanwo Xuanwo added C-want-help Category: want help good first issue Category: good first issue labels Nov 1, 2024
@jason-heo
Author

jason-heo commented Nov 6, 2024

Hello, @Xuanwo

If my understanding is correct, resolving this issue means replacing delegation_token with delegation in the URL parameter somewhere in the code.

Could you point out which source file should be modified for this issue?

I searched for delegation_token, but it seems that only test code contains it.

git grep -i delegation_token
src/meta/proto-conv/tests/it/user_proto_conv.rs:                delegation: "<delegation_token>".to_string(),
src/meta/proto-conv/tests/it/user_stage.rs:                delegation: "<delegation_token>".to_string(),
src/meta/proto-conv/tests/it/v030_user_stage.rs:                delegation: "<delegation_token>".to_string(),
src/meta/proto-conv/tests/it/v031_copy_max_file.rs:                delegation: "<delegation_token>".to_string(),

Although my background is Java, Scala and Python, I'll try to resolve this if it is simple.

Sorry, my team's task priorities have changed.

Thanks.

@notauserx
Contributor

notauserx commented Nov 18, 2024

I reviewed the issue, and it seems the solution lies in the WebhdfsBuilder implementation in OpenDAL.

https://github.com/apache/opendal/blob/0f45b5d34a95d21760a7bb7339a9103ae3311a53/core/src/services/webhdfs/backend.rs#L177

I have a proposed fix in the PR: https://github.com/apache/opendal/pull/5342/files

@Xuanwo
Member

Xuanwo commented Nov 18, 2024

> I reviewed the issue, and it seems the solution lies in the WebhdfsBuilder implementation in OpenDAL.

Hi, please fix the builder on the Databend side instead.

@notauserx
Contributor

notauserx commented Nov 18, 2024

Hi, thanks for pointing that out! Could you please guide me on where exactly the builder is located on the Databend side? A pointer to the relevant file or module would be helpful.

Also what is the rationale behind keeping the auth value as delegation_token on the OpenDAL side?
https://github.com/apache/opendal/blob/0f45b5d34a95d21760a7bb7339a9103ae3311a53/core/src/services/webhdfs/backend.rs#L240

@Xuanwo
Member

Xuanwo commented Nov 19, 2024

> Hi, thanks for pointing that out! Could you please guide me on where exactly the builder is located on the Databend side? A pointer to the relevant file or module would be helpful.

let delegation = l.connection.get("delegation").cloned().unwrap_or_default();


Oh, I see. Databend does use delegation; the real issue is that OpenDAL doesn't build the correct request.
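
For readers unfamiliar with the idiom in the quoted Databend line, `get(..).cloned().unwrap_or_default()` yields an empty string when the key is absent. The sketch below reproduces that lookup on a plain `HashMap`, which is an assumption about the type of `l.connection`, and adds a hypothetical follow-up step that treats the empty string as "no token".

```rust
use std::collections::HashMap;

/// Mirrors the quoted lookup: a missing `delegation` key becomes "".
fn delegation_or_empty(connection: &HashMap<String, String>) -> String {
    connection.get("delegation").cloned().unwrap_or_default()
}

/// Hypothetical follow-up: map the empty string to `None` so callers
/// can skip the auth parameter entirely when no token was configured.
fn as_optional_token(raw: String) -> Option<String> {
    if raw.is_empty() {
        None
    } else {
        Some(raw)
    }
}
```

With this shape, the Databend side only passes the value along, and naming the query parameter stays the responsibility of the storage layer, which matches the conclusion reached in the thread.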

@Xuanwo
Member

Xuanwo commented Nov 19, 2024

This issue will be fixed by apache/opendal#5342
