Support GITHUB_TOKEN for HTTP Requests to github.com #871

Closed
wants to merge 8 commits into from

Conversation

Listener430
Collaborator

@Listener430 Listener430 commented Dec 19, 2024

what

This PR adds a check for the GITHUB_TOKEN env variable; if it is set, requests to github.com are made with the provided token.

The following functionality was added:

  • vendoring - done
  • check for the latest atmos version - done
  • read atmos.yaml from remote sources - done, mdx docs are not updated yet (subject to the scope sign off)
  • import stack configs from remote sources - done

why

Bypass GitHub rate limits and allow access to files in private repos.

Testing

atmos.yaml local vs. remote

This test case is based on the green update box, which is shown depending on the settings in the atmos.yaml file.
The local atmos.yaml has this defaulted to 1 h (see section A below), i.e. if the version command is run again within an hour, the update box isn't shown (sections B and C in the screenshot below).
atmos_yaml_test_no_box_as_per_local_file_setting

Then a slightly adjusted version of atmos.yaml is used (see section D below). Its timeout is set to a smaller value than the 1000 ms in the original local file above, so if it is applied correctly the green box should be shown every time the version command is run.
new_atmos_yaml_in_remove_private_repo

Since GITHUB_TOKEN isn't yet set, the private repo isn't available.

bad_credentials

Then the ATMOS_REMOTE_CONFIG_URL and GITHUB_TOKEN env variables are set (see section E below),
and in section F the version command is run twice and the green box is shown every time (i.e. the updated timeout setting from the remote repo is respected).

atmos_read_config_from_remote_after_token_seetting

NB. This test case also covers the version command, which honors the GITHUB_TOKEN value too.

remote vendoring

Make sure that the GITHUB_TOKEN variable is empty before running this test.
In the vendoring example, a new component is added. The component is located in the private repo; it is a standard vendor.yaml from the demo-vendoring example with the following extra section (shown in section A in the screenshot below)

private_repo_isn't_accebile_without_a_token

Then the vendor pull command is run, and it fails because the repo is private (see section B in the screenshot above).
Also, no files were added to the components folder (see section C in the screenshot above).

Then the token variable is assigned as per section D in the screenshot below

successfully_retrieved_a_file_from_github_repo

The command completes successfully (see section E) and the file from the private repo is downloaded into the specified location (F).

local stacks vs remote stacks

The test is based on the quick-start-simple example, i.e. first it is run with local stacks (shows the weather for Stockholm), then it is run with a remote stack (Boston). A new flag was added for remote stacks to specify the folder where the stack file should land once downloaded. E.g.

./../../atmos terraform apply station -s https://github.com/analitikasi/CoonectorDescribed/blob/main/quick-start-simple/stacks/deploy/dev.yaml --folder ./stacks/deploy

Here is the local stack file from the sample
local_stack_view

Here is the output when run with local stack

local_stack_sample

After that, a remote stack is checked. It is located in a private repo and looks like this (note that it says Boston, unlike the value in the local stack, so it is possible to differentiate local vs remote stacks)

remote_stack_boston

and since GITHUB_TOKEN variable isn't set it results in 404 error:

token_not_set_404_error

If the token is set, the output looks like this

remote_stack_successfully_loaded

Please note that the new flag was introduced to specify the location where the remote stacks have to be downloaded to.

references

Link to ticket - https://linear.app/cloudposse/issue/DEV-2778/atmos-should-support-github-token-for-http-requests-to-githubcom

Summary by CodeRabbit


  • New Features
    • Introduced functionality to download files directly from GitHub repositories.
    • Added a new method to parse GitHub URLs for better handling of repository details.
    • Added a --folder flag to specify the download location for stack files.
    • Enhanced configuration management to support loading remote configurations from GitHub URLs.
  • Bug Fixes
    • Improved error handling for GitHub source detection and file downloads.
    • Enhanced feedback for missing required fields in vendor configurations.
  • Tests
    • Added a new test case to validate vendor execution with a GitHub token, enhancing test coverage.

@Listener430 Listener430 requested a review from aknysh December 19, 2024 15:08
@Listener430 Listener430 self-assigned this Dec 19, 2024
@Listener430 Listener430 requested a review from a team as a code owner December 19, 2024 15:08
@mergify mergify bot added the triage Needs triage label Dec 19, 2024
Contributor

coderabbitai bot commented Dec 19, 2024

📝 Walkthrough


The pull request introduces enhanced support for GitHub source handling in Atmos, focusing on improving the vendor command's ability to download files directly from GitHub repositories. New utility functions for parsing GitHub URLs and downloading files are added, enabling more flexible configuration management. The modifications include updating the source type determination logic, introducing temporary directory handling, and improving error management for GitHub-based sources.

Changes

File / Change Summary

  • internal/exec/vendor_utils.go
    • Updated determineSourceType method signature
    • Added tempDir variable for temporary file handling
    • Enhanced GitHub source detection and processing logic
  • pkg/utils/file_utils.go
    • Added ParseGitHubURL function to extract GitHub URL components
  • pkg/utils/github_utils.go
    • Added DownloadFileFromGitHub function to retrieve files from GitHub
    • Added newGitHubClient function for creating GitHub clients
    • Updated GetLatestGitHubRepoRelease for better error handling
  • pkg/vender/vendor_config_test.go
    • Added TestExecuteAtmosVendorInternalWithToken test case
  • cmd/terraform.go
    • Added --folder flag for downloading stack files from GitHub
  • internal/exec/utils.go
    • Updated commonFlags to include --folder
  • pkg/config/config.go
    • Introduced processRemoteConfigFile for loading remote configurations

Assessment against linked issues

Objective Addressed Explanation
Support GITHUB_TOKEN for GitHub requests [DEV-2778]

Possibly related PRs

Suggested reviewers

  • osterman
  • aknysh
  • nitrocode
  • johncblandii


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (6)
internal/exec/vendor_utils.go (3)

268-269: Consider limiting the scope of the global variable.

Declaring "tempDir" at the package level or near global scope might be avoided if you can define it strictly where it’s used. Reducing scope simplifies testing and prevents accidental misuse.


371-388: Handle edge cases in GitHub file downloading.

This logic for saving downloaded content to a temp directory looks fine. Consider validating that the file path from GitHub won't collide with local paths. Retries might also help if the download fails.


526-551: Refine the multiple-return boolean approach.

Returning four booleans can be confusing. Consider introducing a small struct or an enum-like pattern to improve clarity and reduce the likelihood of confusion at the call site.
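One way to picture the enum-like pattern mentioned here is the sketch below. It is a simplification under stated assumptions: the type `sourceType`, its constants, and `classifySource` are hypothetical names, and the sketch skips the path resolution and file-existence checks that the real determineSourceType performs.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sourceType is a hypothetical enum replacing the four boolean results.
type sourceType int

const (
	sourceLocal sourceType = iota
	sourceOCI
	sourceGitHub
	sourceOtherRemote
)

func (s sourceType) String() string {
	switch s {
	case sourceLocal:
		return "local"
	case sourceOCI:
		return "oci"
	case sourceGitHub:
		return "github"
	default:
		return "other-remote"
	}
}

// classifySource maps a vendor source URI to exactly one kind, so callers
// switch on a single value instead of juggling four booleans.
func classifySource(uri string) sourceType {
	if strings.HasPrefix(uri, "oci://") {
		return sourceOCI
	}
	parsed, err := url.Parse(uri)
	if err != nil || parsed.Scheme == "" || parsed.Host == "" {
		return sourceLocal
	}
	if parsed.Scheme == "https" && parsed.Host == "github.com" {
		return sourceGitHub
	}
	return sourceOtherRemote
}

func main() {
	for _, uri := range []string{
		"oci://registry.example.com/component",
		"./components/terraform/station",
		"https://github.com/org/repo/blob/main/file.yaml",
	} {
		fmt.Printf("%s -> %s\n", uri, classifySource(uri))
	}
}
```

A single return value also makes it impossible to return contradictory combinations, such as both "OCI" and "local" being true.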

pkg/utils/github_utils.go (1)

37-69: Gracefully handle connection or rate limit errors.

The download function is straightforward. For better resilience, consider handling transient network failures with retries or more descriptive messages for rate limit errors.

pkg/vender/vendor_config_test.go (1)

146-195: Add negative test scenarios.

This test successfully ensures the flow works with a valid token. Adding scenarios to verify behavior when the token is missing or invalid might strengthen coverage.

pkg/utils/file_utils.go (1)

246-268: Consider handling GitHub Enterprise URLs.

The function parses GitHub URLs in standard blob format. For broader use, consider supporting GitHub Enterprise domains or “raw” URLs. This might improve generality.
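To make the suggestion concrete, the sketch below parses both "blob" and "raw" style URLs and accepts extra hosts for GitHub Enterprise. It is not the PR's ParseGitHubURL; `gitHubFileRef`, `parseGitHubFileURL`, and the `allowedHosts` parameter are assumptions for illustration, and refs that contain slashes are not handled.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// gitHubFileRef is a hypothetical result type for a parsed file URL.
type gitHubFileRef struct {
	Host, Owner, Repo, Ref, Path string
}

// parseGitHubFileURL accepts github.com plus any extra hosts passed by the
// caller (for example a GitHub Enterprise domain). Expected path layout:
// /owner/repo/(blob|raw)/ref/path/to/file
func parseGitHubFileURL(rawURL string, allowedHosts ...string) (gitHubFileRef, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return gitHubFileRef{}, fmt.Errorf("invalid URL %q: %w", rawURL, err)
	}
	hostOK := u.Host == "github.com"
	for _, h := range allowedHosts {
		if u.Host == h {
			hostOK = true
		}
	}
	if !hostOK {
		return gitHubFileRef{}, fmt.Errorf("unsupported host %q", u.Host)
	}
	parts := strings.SplitN(strings.Trim(u.Path, "/"), "/", 5)
	if len(parts) < 5 || (parts[2] != "blob" && parts[2] != "raw") {
		return gitHubFileRef{}, fmt.Errorf("invalid GitHub URL format: %s", rawURL)
	}
	return gitHubFileRef{
		Host:  u.Host,
		Owner: parts[0],
		Repo:  parts[1],
		Ref:   parts[3],
		Path:  parts[4],
	}, nil
}

func main() {
	ref, err := parseGitHubFileURL("https://github.com/org/repo/blob/main/stacks/deploy/dev.yaml")
	fmt.Printf("%+v %v\n", ref, err)
}
```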

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 80058a8 and fb72316.

📒 Files selected for processing (4)
  • internal/exec/vendor_utils.go (4 hunks)
  • pkg/utils/file_utils.go (1 hunks)
  • pkg/utils/github_utils.go (2 hunks)
  • pkg/vender/vendor_config_test.go (1 hunks)
🔇 Additional comments (1)
internal/exec/vendor_utils.go (1)

319-328: Creation of a temp directory is correct.

Good job creating a temporary directory and deferring its cleanup. This approach helps avoid leaving junk files behind. No further changes needed.

coderabbitai[bot]
coderabbitai bot previously approved these changes Dec 19, 2024
@mergify mergify bot removed the triage Needs triage label Dec 19, 2024
@osterman osterman changed the title Support GITHUB_TOKEN for HTTP Requests to github.com Support GITHUB_TOKEN for HTTP Requests to github.com Dec 19, 2024
pkg/utils/github_utils.go Outdated
pkg/utils/github_utils.go Outdated
@osterman osterman marked this pull request as draft December 19, 2024 15:47
@osterman osterman added the minor New features that do not break anything label Dec 19, 2024
coderabbitai[bot]
coderabbitai bot previously approved these changes Dec 20, 2024
@Listener430 Listener430 marked this pull request as ready for review December 27, 2024 21:22
@Listener430 Listener430 requested a review from osterman December 27, 2024 21:22
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (9)
pkg/config/config.go (2)

19-19: Consolidate duplicate imports for utils.

You have imported the same package twice, once as utils and once aliased as u. If both references are needed, remove one import statement to keep it consistent.

19c19
-       "github.com/cloudposse/atmos/pkg/utils"
-       u "github.com/cloudposse/atmos/pkg/utils"
+       u "github.com/cloudposse/atmos/pkg/utils"

403-432: Consider extending support beyond GitHub.

processRemoteConfigFile currently only handles github.com. If you foresee using GitHub Enterprise or other Git servers, consider refactoring to support different domains.

cmd/terraform.go (1)

49-70: Warn about potential file overwrite.

When writing the downloaded file to disk, consider checking whether the file already exists to avoid unexpected overwrites.

+if _, err := os.Stat(localPath); err == nil {
+  // Optionally log or prompt before overwriting
+}
err = os.WriteFile(localPath, data, 0o644)
pkg/utils/github_utils.go (3)

32-46: The 2-second timeout might be too strict.

For slower networks or large release data, consider an environment-based or configurable timeout.


52-86: Five-second timeout may be short for bigger files.

Downloading large files from GitHub might need a slightly longer (or configurable) timeout. Otherwise, this function is clear.


88-95: Support for GitHub Enterprise is currently omitted.

IsGithubURL only matches github.com. If GitHub Enterprise usage is likely, consider generalizing the URL check.

pkg/vender/vendor_config_test.go (1)

147-196: Test coverage for token usage is a welcome addition.

You could add a negative test (e.g. invalid token) to confirm proper error handling.

pkg/utils/file_utils.go (1)

246-265: LGTM! Consider enhancing error messages.

The function correctly parses GitHub URLs and handles both 'blob' and 'raw' formats. The implementation is clean and efficient.

Consider enhancing error messages to include the actual URL for better debugging:

-		return "", "", "", "", fmt.Errorf("invalid GitHub URL format")
+		return "", "", "", "", fmt.Errorf("invalid GitHub URL format: %s", rawURL)
internal/exec/vendor_utils.go (1)

533-561: Consider breaking down the function for better maintainability.

While the implementation is correct, the function is handling multiple responsibilities.

Consider splitting into smaller, focused functions:

func determineSourceType(uri *string, vendorConfigFilePath string) (bool, bool, bool, bool) {
+	useOciScheme := isOciScheme(uri)
+	useLocalFileSystem, sourceIsLocalFile := checkLocalFileSystem(uri, vendorConfigFilePath)
+	isGitHubSource := checkGitHubSource(uri)
+	return useOciScheme, useLocalFileSystem, sourceIsLocalFile, isGitHubSource
+}
+
+func isOciScheme(uri *string) bool {
+	useOciScheme := strings.HasPrefix(*uri, "oci://")
+	if useOciScheme {
+		*uri = strings.TrimPrefix(*uri, "oci://")
+	}
+	return useOciScheme
+}
+
+func checkLocalFileSystem(uri *string, vendorConfigFilePath string) (bool, bool) {
+	parsedURL, err := url.Parse(*uri)
+	if err != nil || parsedURL.Scheme == "" || parsedURL.Host == "" {
+		if absPath, err := u.JoinAbsolutePathWithPath(vendorConfigFilePath, *uri); err == nil {
+			*uri = absPath
+			return true, u.FileExists(*uri)
+		}
+	}
+	return false, false
+}
+
+func checkGitHubSource(uri *string) bool {
+	parsedURL, err := url.Parse(*uri)
+	return err == nil && parsedURL.Host == "github.com" && parsedURL.Scheme == "https"
}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fb72316 and 450a6f2.

📒 Files selected for processing (8)
  • cmd/terraform.go (3 hunks)
  • internal/exec/stack_processor_utils.go (1 hunks)
  • internal/exec/utils.go (1 hunks)
  • internal/exec/vendor_utils.go (4 hunks)
  • pkg/config/config.go (5 hunks)
  • pkg/utils/file_utils.go (1 hunks)
  • pkg/utils/github_utils.go (1 hunks)
  • pkg/vender/vendor_config_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • internal/exec/stack_processor_utils.go
🔇 Additional comments (11)
pkg/config/config.go (3)

7-7: Good addition of the net/url import.

The usage for parsing remote configurations seems straightforward.


105-106: Thanks for documenting environment variables.

The comments clarify the order of configuration resolution using ENV vars.


207-218: Remote config loading looks robust.

The logic correctly checks for the remote config variable, attempts retrieval, and returns errors appropriately.

cmd/terraform.go (2)

4-6: Neat addition of imports for file handling and string processing.

No issues found with these imports.


83-87: Documentation for --folder is concise.

This new flag is self-explanatory and aligns well with the new GitHub download feature.

internal/exec/utils.go (1)

24-24: The --folder flag inclusion looks consistent.

This aligns with the logic in cmd/terraform.go.

pkg/utils/github_utils.go (3)

5-8: Import updates for GitHub support look appropriate.

These libraries are well-chosen for encoding, URL parsing, etc.


12-12: OAuth2 import is essential for authenticated requests.

No changes are necessary.


15-30: Authentication fallback is a helpful approach.

This gracefully handles both token and non-token scenarios.

internal/exec/vendor_utils.go (2)

271-272: LGTM! Proper temporary directory handling.

The implementation follows best practices for temporary file management with proper cleanup and error handling.

Also applies to: 322-331


376-395: LGTM! Secure and well-structured GitHub source handling.

The implementation properly handles GitHub sources with appropriate error handling and logging. The code follows security best practices by validating URLs before using tokens.

Comment on lines +267 to +274
// ParseFilenameFromURL extracts the file name from a URL
func ParseFilenameFromURL(url string) string {
	parts := strings.Split(url, "/")
	if len(parts) == 0 {
		return ""
	}
	return parts[len(parts)-1] // e.g. "dev.yaml"
}
Contributor

🛠️ Refactor suggestion

Add validation and error handling for robustness.

The function should validate input and handle edge cases properly.

Consider this more robust implementation:

-func ParseFilenameFromURL(url string) string {
+func ParseFilenameFromURL(rawURL string) (string, error) {
+	if rawURL == "" {
+		return "", fmt.Errorf("empty URL provided")
+	}
+
+	parsedURL, err := url.Parse(rawURL)
+	if err != nil {
+		return "", fmt.Errorf("invalid URL: %w", err)
+	}
+
	parts := strings.Split(parsedURL.Path, "/")
	if len(parts) == 0 {
-		return ""
+		return "", fmt.Errorf("URL has no path components: %s", rawURL)
	}
-	return parts[len(parts)-1] // e.g. "dev.yaml"
+	filename := parts[len(parts)-1]
+	if filename == "" {
+		return "", fmt.Errorf("URL ends with a slash: %s", rawURL)
+	}
+	return filename, nil
}
📝 Committable suggestion


Suggested change
// ParseFilenameFromURL extracts the file name from a URL
func ParseFilenameFromURL(url string) string {
	parts := strings.Split(url, "/")
	if len(parts) == 0 {
		return ""
	}
	return parts[len(parts)-1] // e.g. "dev.yaml"
}

// ParseFilenameFromURL extracts the file name from a URL
func ParseFilenameFromURL(rawURL string) (string, error) {
	if rawURL == "" {
		return "", fmt.Errorf("empty URL provided")
	}
	parsedURL, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("invalid URL: %w", err)
	}
	parts := strings.Split(parsedURL.Path, "/")
	if len(parts) == 0 {
		return "", fmt.Errorf("URL has no path components: %s", rawURL)
	}
	filename := parts[len(parts)-1]
	if filename == "" {
		return "", fmt.Errorf("URL ends with a slash: %s", rawURL)
	}
	return filename, nil
}
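A side note on the suggestion above: strings.Split with a non-empty separator always returns at least one element, so the `len(parts) == 0` branch can never trigger. A shorter variant, offered only as a sketch, could rely on path.Base instead. The name `fileNameFromURL` is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// fileNameFromURL returns the last path segment of rawURL. Query strings and
// fragments are ignored because only the parsed Path is inspected.
func fileNameFromURL(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("invalid URL: %w", err)
	}
	if u.Path == "" || strings.HasSuffix(u.Path, "/") {
		return "", fmt.Errorf("URL has no file name: %q", rawURL)
	}
	return path.Base(u.Path), nil
}

func main() {
	name, err := fileNameFromURL("https://github.com/org/repo/blob/main/stacks/deploy/dev.yaml")
	fmt.Println(name, err)
}
```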

@@ -100,6 +102,8 @@ func InitCliConfig(configAndStacksInfo schema.ConfigAndStacksInfo, processStacks
// system dir (`/usr/local/etc/atmos` on Linux, `%LOCALAPPDATA%/atmos` on Windows)
// home dir (~/.atmos)
// current directory
// ENV var ATMOS_CLI_CONFIG_PATH
// ENV var ATMOS_REMOTE_CONFIG_URL from GITHUB
Member

This task should not be implementing a new ENV for fetching remote config.

Member

@Listener430 let's reduce the scope of this PR. Delete anything that introduces new functionality (like remote atmos configs, new environment variables, etc.).

Comment on lines +1932 to +1948
//Check if its Github remote URL to single file
// Shoudl be decommised as the direct hook into map does not work
/*parsedURL, err := url.Parse(filePath) // Parse the URL
if err != nil {
u.LogInfo(schema.AtmosConfiguration{}, fmt.Sprintf("Filepath is local: %s", filePath))
} else {
if parsedURL.Host == "github.com" && parsedURL.Scheme == "https" {
u.LogDebug(schema.AtmosConfiguration{}, fmt.Sprintf("Fetching GitHub source: %s", filePath))
fileContents, err := u.DownloadFileFromGitHub(filePath)
if err != nil {
return "", fmt.Errorf("failed to download GitHub file: %w", err)
}
getFileContentSyncMap.Store(filePath, fileContents)
return string(fileContents), nil
}
}
*/
Member

Delete commented code

Comment on lines +403 to +404
// processRemoteConfigFile attempts to download and merge a remote atmos.yaml
// from a URL. It currently only supports GitHub URLs.
Member

Let's remove from this PR. We need more requirements before implementing this.

Comment on lines +52 to +53
// DownloadFileFromGitHub downloads a file from a GitHub repository using the GitHub API.
func DownloadFileFromGitHub(rawURL string) ([]byte, error) {
Member

This might be good, but it's changing the implementation in a risky way. The original request was simply to add a bearer token header, not change the implementation to use the GitHub API.

Collaborator Author

@Listener430 Listener430 Dec 30, 2024

@osterman func DownloadFileFromGitHub(rawURL string) ([]byte, error) is a whole new function that I added. It is used in the new functionality I proposed for vendoring, atmos.yaml with the env var, and stacks with the folder flag, i.e. all 3 new parts I added in this PR. It's not used anywhere else. The existing function GetLatestGitHubRepoRelease used the GitHub API before my change, and I just added a func to configure a client with an auth token for it (native functionality). So the one above is entirely new; if we don't need it right now for vendoring, stacks and atmos.yaml, I can take the func out. Let me know what's best.

@Listener430
Collaborator Author

@osterman as agreed, closing this one due to descoping. The remaining version part is in the new PR.

Labels
minor New features that do not break anything
Projects
Status: Done

2 participants