feat(stackable-operator): Add git-sync support #1024

Open: 4 commits to be merged into base branch main

@siegfriedweber siegfriedweber commented May 7, 2025

Description

Add git-sync support

Currently used in:

Definition of Done Checklist

  • Not all of these items are applicable to all PRs; the author should update this template to leave in only the boxes that are relevant
  • Please make sure all these things are done and tick the boxes
# Author
- [x] Changes are OpenShift compatible
- [x] Integration tests passed (for non-trivial changes)
# Reviewer
- [ ] Code contains useful comments
- [ ] (Integration-)Test cases added
- [ ] Documentation added or updated
- [ ] Changelog updated
- [ ] Cargo.toml only contains references to git tags (not specific commits or branches)
# Acceptance
- [ ] Feature Tracker has been updated
- [ ] Proper release label has been added

@siegfriedweber siegfriedweber marked this pull request as ready for review May 8, 2025 14:31
@siegfriedweber siegfriedweber requested a review from a team May 8, 2025 14:31
@siegfriedweber siegfriedweber moved this to Development: Waiting for Review in Stackable Engineering May 8, 2025
@Techassi Techassi self-requested a review May 8, 2025 14:31
@Techassi Techassi moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering May 9, 2025
@Techassi Techassi changed the title Add git-sync support feat(stackable-operator): Add git-sync support May 9, 2025
@Techassi Techassi left a comment

Initial, partial review: I haven't looked at the unit tests yet.


mod v1alpha1_impl;

#[versioned(version(name = "v1alpha1"))]

praise: Nice job versioning this right from the start!

Comment on lines +21 to +22
/// The git repository URL that will be cloned, for example: `https://github.com/stackabletech/airflow-operator`.
pub repo: String,

note: Is there any particular reason why this field is named `repo`? I think we should name it `repository`.

note: Additionally, the type of this field should be `Url` instead of a plain `String`.

Comment on lines +30 to +35
/// Location in the Git repository containing the resource.
///
/// It can optionally start with `/`, however, no trailing slash is recommended.
/// An empty string (``) or slash (`/`) corresponds to the root folder in Git.
#[serde(default = "GitSync::default_git_folder")]
pub git_folder: PathBuf,

note: All other fields document the default value. This one should then also do that.

Comment on lines +26 to +28
/// Since git-sync v4.x.x this field is mapped to the flag `--ref`.
#[serde(default = "GitSync::default_branch")]
pub branch: String,

note: This is a perfect case for a future `v1alpha2` version of this struct, to rename the field to `ref` instead.
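
One detail to keep in mind for such a rename: `ref` is a reserved keyword in Rust, so a field cannot be declared under that name directly. Below is a minimal sketch of one way around this, using a raw identifier. The struct name `GitSyncV1Alpha2` and the helper `ref_flag` are hypothetical and not part of this PR. Another option would be a differently named Rust field combined with a serde rename attribute.

```rust
/// Hypothetical future version of the struct; only the renamed field is shown.
#[derive(Debug)]
struct GitSyncV1Alpha2 {
    /// `ref` is a keyword, so the raw identifier syntax `r#ref` is required.
    r#ref: String,
}

/// Hypothetical helper that renders the field as a git-sync `--ref` argument.
fn ref_flag(git_sync: &GitSyncV1Alpha2) -> String {
    format!("--ref={}", git_sync.r#ref)
}

fn main() {
    let git_sync = GitSyncV1Alpha2 {
        r#ref: "main".to_owned(),
    };
    println!("{}", ref_flag(&git_sync));
}
```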

Comment on lines +43 to +45
/// Since git-sync v4.x.x this field is mapped to the flag `--period`.
#[serde(default = "GitSync::default_wait")]
pub wait: Duration,

note: Another candidate for version `v1alpha2` to rename the field to `period`.

Comment on lines +207 to +221
let internal_args = [
    Some(("--repo".to_string(), git_sync.repo.to_owned())),
    Some(("--ref".to_string(), git_sync.branch.to_owned())),
    Some(("--depth".to_string(), git_sync.depth.to_string())),
    Some((
        "--period".to_string(),
        format!("{}s", git_sync.wait.as_secs()),
    )),
    Some(("--link".to_string(), GIT_SYNC_LINK.to_string())),
    Some(("--root".to_string(), GIT_SYNC_ROOT_DIR.to_string())),
    one_time.then_some(("--one-time".to_string(), "true".to_string())),
]
.into_iter()
.flatten()
.collect::<BTreeMap<_, _>>();

suggestion: Remove the `Some`s (and the no longer required `flatten` call) and also remove the explicit `into_iter` call.

Suggested change
let mut internal_args = BTreeMap::from([
    ("--repo".to_string(), git_sync.repo.to_owned()),
    ("--ref".to_string(), git_sync.branch.to_owned()),
    ("--depth".to_string(), git_sync.depth.to_string()),
    (
        "--period".to_string(),
        format!("{}s", git_sync.wait.as_secs()),
    ),
    ("--link".to_string(), GIT_SYNC_LINK.to_string()),
    ("--root".to_string(), GIT_SYNC_ROOT_DIR.to_string()),
]);
if one_time {
    internal_args.insert("--one-time".into(), "true".into());
}
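
For anyone who wants to try the suggested shape in isolation, here is a self-contained sketch. It is not the PR's code: the function `build_internal_args` and its plain parameters are hypothetical stand-ins for the fields of `git_sync`, and the two constant values are placeholders, since the real ones are defined elsewhere in the PR.

```rust
use std::collections::BTreeMap;

// Placeholder values: the real constants are defined elsewhere in the PR.
const GIT_SYNC_LINK: &str = "current";
const GIT_SYNC_ROOT_DIR: &str = "/tmp/git";

/// Builds the argument map in the suggested style.
/// The parameters are hypothetical stand-ins for the fields of `git_sync`.
fn build_internal_args(
    repo: &str,
    branch: &str,
    wait_secs: u64,
    one_time: bool,
) -> BTreeMap<String, String> {
    // All unconditional arguments go into the map in one step.
    let mut internal_args = BTreeMap::from([
        ("--repo".to_string(), repo.to_owned()),
        ("--ref".to_string(), branch.to_owned()),
        ("--period".to_string(), format!("{wait_secs}s")),
        ("--link".to_string(), GIT_SYNC_LINK.to_string()),
        ("--root".to_string(), GIT_SYNC_ROOT_DIR.to_string()),
    ]);
    // The only conditional argument is added afterwards.
    if one_time {
        internal_args.insert("--one-time".into(), "true".into());
    }
    internal_args
}

fn main() {
    let args = build_internal_args("https://example.com/repo.git", "main", 20, true);
    for (flag, value) in &args {
        println!("{flag}={value}");
    }
}
```

The trade-off is that the map has to be `mut`, in exchange for dropping the `Option` wrapping around every entry.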

Comment on lines +223 to +228
let internal_git_config = [(
    GIT_SYNC_SAFE_DIR_OPTION.to_string(),
    GIT_SYNC_ROOT_DIR.to_string(),
)]
.into_iter()
.collect::<BTreeMap<_, _>>();

@Techassi Techassi May 9, 2025

suggestion: Simplify this.

Suggested change
let internal_git_config = BTreeMap::from([(
    GIT_SYNC_SAFE_DIR_OPTION.to_owned(),
    GIT_SYNC_ROOT_DIR.to_owned(),
)]);

// (https://github.com/stackabletech/airflow-operator/pull/381)
// used this condition to find Git configs. It is also used here
// for backwards-compatibility:
if key.to_lowercase().ends_with("-git-config") {

note: This condition seems a little weird. As far as I can see, the only way to provide custom git configs in git-sync is to use `--git-config`. Why don't we instead use the following expression:

if key.to_lowercase() == "--git-config" {}

Independent of which approach we finally go with, the value we compare against should live in a constant.
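
To make the difference between the two conditions concrete, here is a small sketch with both variants reading their comparison value from a constant. The constant and function names are hypothetical, and the sample keys are illustrative only; they are not meant to imply that git-sync accepts all of them.

```rust
/// Hypothetical constant for the exact-match variant.
const GIT_CONFIG_FLAG: &str = "--git-config";
/// Hypothetical constant for the backwards-compatible suffix variant.
const GIT_CONFIG_SUFFIX: &str = "-git-config";

/// Condition currently used in the PR: case-insensitive suffix match.
fn matches_by_suffix(key: &str) -> bool {
    key.to_lowercase().ends_with(GIT_CONFIG_SUFFIX)
}

/// Condition proposed in the review: case-insensitive exact match.
fn matches_exactly(key: &str) -> bool {
    key.to_lowercase() == GIT_CONFIG_FLAG
}

fn main() {
    for key in ["--git-config", "--GIT-CONFIG", "-git-config", "--extra-git-config"] {
        println!(
            "{key}: suffix={}, exact={}",
            matches_by_suffix(key),
            matches_exactly(key)
        );
    }
}
```

Both variants accept `--git-config` in any casing, but only the suffix variant also accepts keys such as `-git-config` or `--extra-git-config`, which is where the two approaches differ in backwards-compatibility.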

Comment on lines +244 to +248
if internal_git_config.keys().any(|key| value.contains(key)) {
tracing::warn!("Config option {value:?} contains a value for {GIT_SYNC_SAFE_DIR_OPTION} that overrides
the value of this operator. Git-sync functionality will probably not work as expected!");
}
user_defined_git_configs.push(value.to_owned());

question: Do we want to allow that? This can potentially break what the operator wants to do. Basically user freedom vs opinionated approach.
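
For comparison, the opinionated end of that spectrum could look roughly like the sketch below, which rejects such a value instead of logging a warning. Everything here is an assumption made for illustration: the value of `GIT_SYNC_SAFE_DIR_OPTION` is a guess (the real constant is defined elsewhere in the PR), the function `validate_user_git_config` is hypothetical, and the sample config strings are made up.

```rust
// Assumed value; the real constant is defined elsewhere in the PR.
const GIT_SYNC_SAFE_DIR_OPTION: &str = "safe.directory";

/// Hypothetical strict variant: return an error instead of warning when a
/// user-defined git config touches the option managed by the operator.
fn validate_user_git_config(value: &str) -> Result<&str, String> {
    if value.contains(GIT_SYNC_SAFE_DIR_OPTION) {
        Err(format!(
            "Config option {value:?} must not set {GIT_SYNC_SAFE_DIR_OPTION}, \
             because it is managed by the operator"
        ))
    } else {
        Ok(value)
    }
}

fn main() {
    for value in ["http.sslVerify:false", "safe.directory:/somewhere/else"] {
        match validate_user_git_config(value) {
            Ok(accepted) => println!("accepted: {accepted}"),
            Err(error) => println!("rejected: {error}"),
        }
    }
}
```

Like the condition in the PR, this uses a substring check, so it would also flag values that merely mention the option name.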

Comment on lines +268 to +270
let mut args = internal_args;
args.extend(user_defined_args);
args.insert("--git-config".to_string(), format!("'{git_config}'"));

question: Any particular reason why we don't operate on the `internal_args` directly? One of the above suggestions even makes them `mut`.
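
A sketch of what operating on the map directly could look like, and of the precedence that results from the order of operations. The function `merge_args` is hypothetical, and the type of `user_defined_args` is assumed to be a `BTreeMap<String, String>`; the PR may use something else.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper that extends the internal arguments in place.
/// `user_defined_args` is assumed to be a map of flag to value.
fn merge_args(
    mut internal_args: BTreeMap<String, String>,
    user_defined_args: BTreeMap<String, String>,
    git_config: &str,
) -> BTreeMap<String, String> {
    // `extend` overwrites existing keys, so user-defined values take
    // precedence over internal ones with the same flag.
    internal_args.extend(user_defined_args);
    // Inserted last, so this entry always replaces any earlier `--git-config`.
    internal_args.insert("--git-config".to_string(), format!("'{git_config}'"));
    internal_args
}

fn main() {
    let internal_args = BTreeMap::from([
        ("--depth".to_string(), "1".to_string()),
        ("--repo".to_string(), "https://example.com/repo.git".to_string()),
    ]);
    let user_defined_args = BTreeMap::from([
        ("--depth".to_string(), "5".to_string()),
        ("--git-config".to_string(), "ignored".to_string()),
    ]);
    let args = merge_args(internal_args, user_defined_args, "safe.directory:/tmp/git");
    for (flag, value) in &args {
        println!("{flag}={value}");
    }
}
```

Either way the resulting map is the same; renaming to `args` mostly documents that the map no longer holds only internal arguments.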

Labels
None yet
Projects
Status: Development: In Review

3 participants