ec2: Support double encoded userdata #4276

Merged
merged 2 commits into from
Aug 14, 2023

Conversation

nmeyerhans
Contributor

Proposed Commit Message

ec2: Support double encoded userdata

The Amazon APIs expect userdata to be base64 encoded when passed as
input to e.g. RunInstances.  A number of tools, including the AWS CLI,
perform this base64 encoding implicitly, but it's common for users to
base64 encode the data prior to passing it to them. This results in two
layers of base64 encoding and effectively results in a failed EC2
launch.  This change adds the ability to decode the redundant layer of 
encoding.

Fixes https://github.com/amazonlinux/amazon-linux-2023/issues/401
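A minimal sketch of how the two layers arise, assuming the client tool encodes its input once and the service strips exactly one layer before the instance reads it. The variable names are illustrative and not taken from cloud-init or the AWS CLI:

```python
import base64

# The same cloud-config used in the test steps of this PR.
cloud_config = (
    b"#cloud-config\n"
    b"write_files:\n"
    b"  - content: |\n"
    b"      This is some content\n"
    b"    path: /test-file\n"
)

# Layer 1: the user encodes the file by hand (like `base64 -w0`).
user_encoded = base64.b64encode(cloud_config)

# Layer 2: the client tool encodes whatever it is given (assumed behavior).
api_payload = base64.b64encode(user_encoded)

# One layer is removed before the instance sees the data (assumed behavior).
seen_by_instance = base64.b64decode(api_payload)

# The instance still holds base64 text rather than a cloud-config document.
print(seen_by_instance.startswith(b"#cloud-config"))
# One more decode recovers the original document.
print(base64.b64decode(seen_by_instance) == cloud_config)
```

The instance ends up with the user's hand-encoded text, which does not start with a recognized header such as `#cloud-config`.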

Additional Context

Amazon Linux has long carried a patch to cloud-init to handle this case of double base64 encoding. This patch broke when it was forward-ported to Python 3 and a more recent cloud-init release during Amazon Linux 2023 development. By submitting the fixed version here, we hope to remove this custom patch and also make this functionality work for users of other distros.

Test Steps

Start with any valid cloud-config. This can be passed to the AWS CLI, which will take care of base64 encoding. This works as expected.

Base64 encode the userdata, and pass the encoded results to the AWS CLI, which will re-encode it with a second layer of base64 encoding. Example:

$ cat double-base64-test.yaml
#cloud-config
write_files:
  - content: |
      This is some content
    path: /test-file

$ base64 -w0 double-base64-test.yaml | tee double-base64-test.yaml.b64
I2Nsb3VkLWNvbmZpZwp3cml0ZV9maWxlczoKICAtIGNvbnRlbnQ6IHwKICAgICAgVGhpcyBpcyBzb21lIGNvbnRlbnQKICAgIHBhdGg6IC90ZXN0LWZpbG

$ aws ec2 run-instances --user-data file://double-base64-test.yaml.b64 ...

Launch an instance with the base64 encoded data. Without this change, cloud-init fails to process userdata and records a message similar to the following in cloud-init.log:

2023-07-21 21:09:00,336 - __init__.py[WARNING]: Unhandled non-multipart (text/x-not-multipart) userdata: 'b'I2Nsb3VkLWNvbmZpZwp3cml0'...'

With this change, the userdata is decoded and executed as intended.
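The decoding step can be sketched roughly as follows. This is an illustration of the general technique (try a strict base64 decode, fall back to the input unchanged), not the code merged in this PR; `decode_if_base64` is a hypothetical name:

```python
import base64
import binascii


def decode_if_base64(data: bytes) -> bytes:
    """Return decoded bytes if data is strictly valid base64, else data as-is."""
    try:
        # validate=True rejects any character outside the base64 alphabet,
        # so ordinary cloud-config text (which contains '#', ':' and
        # newlines) raises binascii.Error instead of being mangled.
        return base64.b64decode(data, validate=True)
    except binascii.Error:
        return data


plain = b"#cloud-config\nwrite_files: []\n"
encoded = base64.b64encode(plain)

# Redundantly encoded userdata is unwrapped; plain userdata passes through.
print(decode_if_base64(encoded) == plain)
print(decode_if_base64(plain) == plain)
```

One caveat of this approach: a payload that happens to consist only of base64 alphabet characters with valid padding would also be decoded, so the check is a heuristic rather than a guarantee.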

Tested on Amazon Linux 2023, Debian 12, and Debian unstable.

Checklist:

Note that AWS has already signed the CLA on behalf of employees, which should cover this contribution. I haven't added my name to tools/.github-cla-signers, but can do so if you need it.

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

Member

@TheRealFalcon TheRealFalcon left a comment


Changes here LGTM! Thanks for the contribution.

I haven't added my name to tools/.github-cla-signers, but can do so if you need it

Yes please. We'll need that along with fixes to the failing lint job in order for this PR to pass CI. Once those two things happen, we can merge this.

@TheRealFalcon TheRealFalcon self-assigned this Aug 1, 2023
Noah Meyerhans added 2 commits August 4, 2023 07:13
This function can be used elsewhere and is not specific to Hetzner

Signed-off-by: Noah Meyerhans <[email protected]>
Experience has shown that there are a number of tools and common
patterns that effectively doubly base64 encode userdata.  This change
replaces a patch previously carried by Amazon Linux that was
incorrectly ported to cloud-init 22.2.2 in Amazon Linux 2023 and
addresses amazonlinux/amazon-linux-2023#401

Signed-off-by: Noah Meyerhans <[email protected]>
@nmeyerhans
Contributor Author

Changes here LGTM! Thanks for the contribution.

I haven't added my name to tools/.github-cla-signers, but can do so if you need it

Yes please. We'll need that along with fixes to the failing lint job in order for this PR to pass CI. Once those two things happen, we can merge this.

Thanks, updated.

Member

@TheRealFalcon TheRealFalcon left a comment


Thanks!

@TheRealFalcon TheRealFalcon merged commit 6588373 into canonical:main Aug 14, 2023
25 checks passed
nmeyerhans pushed a commit to nmeyerhans/cloud-init that referenced this pull request Aug 24, 2023
canonical#4276 uncovered an issue
with the initialization of the return value for
get_instance_userdata().  The return value was initialized with

user_data = ""

which is a str class.  It then calls url_helper.read_file_or_url(),
which attempts to retrieve user-data content from IMDS.
read_file_or_url() returns its results as a bytes object, which is
then passed directly up to the caller.  In the event that
read_file_or_url() does not successfully retrieve
content (e.g. because it was given a file:// path to a nonexistent
file or an http:// path that generates a 404 code), an exception is
raised and get_instance_userdata returns the string object initially
stored in user_data.

Rather than make the caller cope with return data potentially encoded
as either bytes or str, this commit changes the initialization of
user_data to an empty bytes object, ensuring type consistency in
get_instance_userdata()'s return value.

Fixes canonical#4386

Signed-off-by: Noah Meyerhans <[email protected]>
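
The type mismatch described in this commit message can be reproduced with a small stand-alone sketch. The functions below are hypothetical stand-ins, not cloud-init's get_instance_userdata() or read_file_or_url(); the `fetch` callable and the choice of OSError are assumptions made for illustration:

```python
from typing import Callable, Union


def get_userdata_buggy(fetch: Callable[[], bytes]) -> Union[bytes, str]:
    user_data = ""  # str default: leaks out when the fetch fails
    try:
        user_data = fetch()  # bytes on success
    except OSError:
        pass
    return user_data


def get_userdata_fixed(fetch: Callable[[], bytes]) -> bytes:
    user_data = b""  # bytes default: same type on success and failure
    try:
        user_data = fetch()
    except OSError:
        pass
    return user_data


def failing_fetch() -> bytes:
    # Stands in for a missing file or an HTTP 404 from the metadata service.
    raise OSError("user-data not available")


print(type(get_userdata_buggy(failing_fetch)).__name__)
print(type(get_userdata_fixed(failing_fetch)).__name__)
```

With the bytes default, a caller that passes the result to a bytes-only helper (such as a base64 decoder that type-checks its input) behaves the same whether or not user-data was retrieved.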
holmanb pushed a commit that referenced this pull request Aug 24, 2023
#4276 uncovered an issue
with the initialization of the return value for
get_instance_userdata().  The return value was initialized with

user_data = ""

which is a str class.  It then calls url_helper.read_file_or_url(),
which attempts to retrieve user-data content from IMDS.
read_file_or_url() returns its results as a bytes object, which is
then passed directly up to the caller.  In the event that
read_file_or_url() does not successfully retrieve
content (e.g. because it was given a file:// path to a nonexistent
file or an http:// path that generates a 404 code), an exception is
raised and get_instance_userdata returns the string object initially
stored in user_data.

Rather than make the caller cope with return data potentially encoded
as either bytes or str, this commit changes the initialization of
user_data to an empty bytes object, ensuring type consistency in
get_instance_userdata()'s return value.

Fixes GH-4386

Signed-off-by: Noah Meyerhans <[email protected]>
@nmeyerhans nmeyerhans deleted the ec2-double-encoded-userdata branch September 22, 2023 17:03