-
Notifications
You must be signed in to change notification settings - Fork 129
Add filter and foreach sub commands for bulk operations #3027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add filter and foreach sub commands for bulk operations #3027
Conversation
…n manifest and dir_name in integrations repo.
cmd/foreach.go
Outdated
|
|
||
| func executeCommand(args []string, path string) error { | ||
| // Look up the elastic-package binary in PATH | ||
| execPath, err := exec.LookPath("elastic-package") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as this is a command run on top of other commands of this binary, could we target directly the commands by its functions, instead of looking for the path binary?
what if the binary is not updated or needs to run specific version? just having some thoughts here on how would be a better aproach without depending of having the binary installed as its own dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be possible to do something like this:
cmd := cmd.RootCmd()
cmd.SetArgs(args)
return cmd.Execute()We could also create our own command that is like cmd.RootCmd(), but where we only add the commands with AddCommand that we allow.
internal/packages/packages.go
Outdated
|
|
||
| // MustFindIntegrationRoot finds and returns the path to the root folder of a package from the working directory. | ||
| // It fails with an error if the package root can't be found. | ||
| func MustFindIntegrationRoot() (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all this code related to detect the repository root can be reused from https://github.com/vinit-chauhan/elastic-package/blob/caaff9abcc9abb60ec525e40eab441ffd99b7828/internal/files/repository.go#L16 wdyt?
There the "integrations root" is considered the git repository of the integrations repository.
internal/packages/packages.go
Outdated
| (m.Type == dataStreamTypeLogs || m.Type == dataStreamTypeMetrics || m.Type == dataStreamTypeSynthetics || m.Type == dataStreamTypeTraces), | ||
| nil | ||
| } | ||
| func isIntegrationRepo(path string) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, i think we can use the reference of the git repository instead of the go.mod file. @jsoriano wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
jsoriano
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! This will be helpful. Added some questions and suggestions.
internal/cobraext/flags.go
Outdated
| FilterIntegrationTypeFlagName = "integration-type" | ||
| FilterIntegrationTypeFlagDescription = "integration types to filter by (comma-separated values)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integration is a type of package, this should be probably called package type.
| FilterIntegrationTypeFlagName = "integration-type" | |
| FilterIntegrationTypeFlagDescription = "integration types to filter by (comma-separated values)" | |
| FilterPackageTypeFlagName = "package-type" | |
| FilterPackageTypeFlagDescription = "package types to filter by (comma-separated values)" |
internal/cobraext/flags.go
Outdated
| FilterPackagesFlagName = "package-name" | ||
| FilterPackagesFlagDescription = "package names to filter by (comma-separated values)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can call this flag directly packages, or package-path to avoid confusion between package paths and their names.
internal/cobraext/flags.go
Outdated
| FilterSpecVersionFlagDescription = "Package spec version to filter by (semver)" | ||
|
|
||
| ForeachPoolSizeFlagName = "parallel" | ||
| ForeachPoolSizeFlagDescription = "number of packages to execute in parallel (defaults to serial execution)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ForeachPoolSizeFlagDescription = "number of packages to execute in parallel (defaults to serial execution)" | |
| ForeachPoolSizeFlagDescription = "Number of subcommands to execute in parallel (defaults to serial execution)" |
internal/filter/packagename.go
Outdated
|
|
||
| func (f *PackageNameFlag) Matches(dirName string, manifest *packages.PackageManifest) bool { | ||
| for _, pattern := range f.patterns { | ||
| if pattern.Match(dirName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, here it is confusing to have a filter package name, that is actually matching paths.
internal/packages/packages.go
Outdated
|
|
||
| // ReadAllPackageManifests reads all the package manifests in the given root directory. | ||
| func ReadAllPackageManifests(root string) ([]PackageDirNameAndManifest, error) { | ||
| files, err := filepath.Glob(filepath.Join(root, "packages", "*", PackageManifestFile)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The packages directory is a convention in the integrations repository, but is not formalized. Actually it is intended that developers can place their packages wherever they want. For example in package-spec and elastic-package repositories there are packages under ./test/packages, and it would be nice to be able to use this tool there too 🙂
The idea of the --packages flag proposed in #2327 was to allow to customize this, even if the default is still ./packages.
In #2327 I also proposed an --auto flag, that was intended to look for packages in a repository, by looking for manifest files and so on. There is no need to implement it, just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - That makes sense. I overlooked the fact that we allow the integrations in places other than integration repo.
I'll update the code to allow different paths in --packages.
for the --auto; if no filter is provided, it will work return all packages and foreach would perform action on all of those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. Underscores in Go file or package names should be avoided when possible. Specially on file names where some suffixes may have meaning, such as _test.go, or _linux.go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, when I added _flag it felt weird to me as well. but I did it to make the flag files distinct. But I'll remove that.
cmd/foreach.go
Outdated
| Path: execPath, | ||
| Args: append([]string{execPath}, args...), | ||
| Dir: path, | ||
| Stdout: io.Discard, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why discarding stdout?
internal/filter/type.go
Outdated
| func (f *FilterFlagBase) Name() string { | ||
| return f.name | ||
| } | ||
|
|
||
| func (f *FilterFlagBase) Description() string { | ||
| return f.description | ||
| } | ||
|
|
||
| func (f *FilterFlagBase) Shorthand() string { | ||
| return f.shorthand | ||
| } | ||
|
|
||
| func (f *FilterFlagBase) DefaultValue() string { | ||
| return f.defaultValue | ||
| } | ||
|
|
||
| func (f *FilterFlagBase) Register(cmd *cobra.Command) { | ||
| cmd.Flags().StringP(f.Name(), f.Shorthand(), f.DefaultValue(), f.Description()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. No need to define all these functions as this is only used here, right?
| func (f *FilterFlagBase) Name() string { | |
| return f.name | |
| } | |
| func (f *FilterFlagBase) Description() string { | |
| return f.description | |
| } | |
| func (f *FilterFlagBase) Shorthand() string { | |
| return f.shorthand | |
| } | |
| func (f *FilterFlagBase) DefaultValue() string { | |
| return f.defaultValue | |
| } | |
| func (f *FilterFlagBase) Register(cmd *cobra.Command) { | |
| cmd.Flags().StringP(f.Name(), f.Shorthand(), f.DefaultValue(), f.Description()) | |
| } | |
| func (f *FilterFlagBase) Register(cmd *cobra.Command) { | |
| cmd.Flags().StringP(f.name, f.shorthand, f.defaultValue, f.description) | |
| } |
| // FilterFlag defines the basic interface for filter flags. | ||
| type FilterFlag interface { | ||
| Name() string | ||
| Description() string | ||
| Shorthand() string | ||
| DefaultValue() string | ||
|
|
||
| Register(cmd *cobra.Command) | ||
| IsApplied() bool | ||
| } | ||
|
|
||
| // Filter extends FilterFlag with filtering capabilities. | ||
| // It defines the interface for filtering packages based on specific criteria. | ||
| type Filter interface { | ||
| FilterFlag | ||
|
|
||
| Parse(cmd *cobra.Command) error | ||
| Validate() error | ||
| ApplyTo(pkgs []packages.PackageDirNameAndManifest) ([]packages.PackageDirNameAndManifest, error) | ||
| // Matches checks if a package matches the filter criteria. | ||
| // dirName is the directory name of the package in package root. | ||
| Matches(dirName string, manifest *packages.PackageManifest) bool | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. Is FilterFlag interface ever used? I think these interfaces can be reduced to one, specially if we remove the methods to access attributes.
| // FilterFlag defines the basic interface for filter flags. | |
| type FilterFlag interface { | |
| Name() string | |
| Description() string | |
| Shorthand() string | |
| DefaultValue() string | |
| Register(cmd *cobra.Command) | |
| IsApplied() bool | |
| } | |
| // Filter extends FilterFlag with filtering capabilities. | |
| // It defines the interface for filtering packages based on specific criteria. | |
| type Filter interface { | |
| FilterFlag | |
| Parse(cmd *cobra.Command) error | |
| Validate() error | |
| ApplyTo(pkgs []packages.PackageDirNameAndManifest) ([]packages.PackageDirNameAndManifest, error) | |
| // Matches checks if a package matches the filter criteria. | |
| // dirName is the directory name of the package in package root. | |
| Matches(dirName string, manifest *packages.PackageManifest) bool | |
| } | |
| // Filter defines the interface for filtering packages based on specific criteria. | |
| type Filter interface { | |
| Register(cmd *cobra.Command) | |
| IsApplied() bool | |
| Parse(cmd *cobra.Command) error | |
| Validate() error | |
| ApplyTo(pkgs []packages.PackageDirNameAndManifest) ([]packages.PackageDirNameAndManifest, error) | |
| // Matches checks if a package matches the filter criteria. | |
| // dirName is the directory name of the package in package root. | |
| Matches(dirName string, manifest *packages.PackageManifest) bool | |
| } |
- update --packages to use only package name; added another filter to filter by package dirs. - Added auto discovery of package for configurable depth. - Added flag to exclude dirs from filter process.
|
are you going to pull this out of draft? |
61bba87 to
689eb72
Compare
|
Removed code for parallel execution from the PR for now. |
|
It could be nice to add some additional output formats to the filter command. For example, one result per line output would make working with shell scripts easier. But, it's also easy to convert the json output to do this with jq, so adding multiple formats might not be critical to add. |
internal/packages/packages.go
Outdated
| // Validate it's a package manifest | ||
| ok, err := isPackageManifest(path) | ||
| if err != nil { | ||
| // Log the error but continue searching |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't actually log the error here. You should either remove the comment or add logging
jrmolin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a startling lack of testing, but it looks reasonable otherwise
| FilterOutputFlagShorthand = "o" | ||
|
|
||
| FilterPackageDirNameFlagName = "package-dirs" | ||
| FilterPackageDirNameFlagDescription = "package directories to filter by (comma-separated values)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this flag expected to work? If I use --package-dirs ./test/packages/parallel in the elastic-package repository it doesn't find any package.
$ elastic-package filter --package-dirs ./test/packages/parallel
2025/11/12 21:20:31 INFO Found 0 matching package(s)
null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
package-dirs would filter packages based on the name of the package's directory. This is used to search without needing to know the package name in the manifest file.
example:
go run main.go -C ../integrations filter --package-dirs sql_input -o pkgname
2025/11/13 09:29:56 INFO Found 1 matching package(s)
["sql"]
if you want to filter all packages in one directory you can use the following command.
❯ go run main.go -C test/packages/parallel filter
2025/11/13 09:11:51 INFO Found 19 matching package(s)
["apache","apache_basic_license","auditd_manager","auth0_logsdb","aws","awsfirehose","custom_entrypoint","httpcheck","mongodb","nginx","nginx_multiple_services","oracle","otel_http_server","sql_input","system","terraform_local","ti_anomali","ti_anomali_logsdb","ti_anomali_template"]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok. So it is not possible to search in multiple directories. I guess this is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to add the example to the description so it goes to the docs?
| errors := multierror.Error{} | ||
|
|
||
| for _, pkg := range filtered { | ||
| rootCmd := cmd.Root() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could define another root command with the specific commands that are supported and use it here, then we could have different sets of subcommands, and maybe we could even plug it directly as subcommands of foreach.
This could be left for a future refactor too.
Something like this:
import (
"github.com/spf13/cobra"
"github.com/elastic/elastic-package/internal/cobraext"
)
var forEachCommands = []*cobraext.Command{
setupBuildCommand(),
setupCheckCommand(),
setupCleanCommand(),
setupFormatCommand(),
setupInstallCommand(),
setupLintCommand(),
setupTestCommand(),
setupUninstallCommand(),
}
// ForEachRootCmd creates and returns root cmd for elastic-package
func ForEachRootCmd() *cobra.Command {
forEachCmd := &cobra.Command{
SilenceUsage: true,
PersistentPreRunE: func(cmd *cobra.Command, args []string) error {
return cobraext.ComposeCommandActions(cmd, args,
processPersistentFlags,
checkVersionUpdate,
)
},
}
forEachCmd.PersistentFlags().CountP(cobraext.VerboseFlagName, cobraext.VerboseFlagShorthand, cobraext.VerboseFlagDescription)
forEachCmd.PersistentFlags().StringP(cobraext.ChangeDirectoryFlagName, cobraext.ChangeDirectoryFlagShorthand, "", cobraext.ChangeDirectoryFlagDescription)
for _, cmd := range forEachCommands {
forEachCmd.AddCommand(cmd.Command)
}
return forEachCmd
}That could be used here like this:
| rootCmd := cmd.Root() | |
| rootCmd := ForEachRootCmd() | |
| rootCmd.SetContext(cmd.Context()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something that we need to do to add parallel execution. That's why I left it out of this PR, and would like to revisit it while I work on the other PR.
With that said, we do have a list of allowed sub-commands. Here
Let me know if I missed any command. - I'll also add changelog command in allow list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can leave this for a future refactor.
I'll also add
changelogcommand in allow list.
Yes please.
| FilterDepthFlagName = "depth" | ||
| FilterDepthFlagDescription = "maximum depth to search for packages" | ||
| FilterDepthFlagDefault = 2 | ||
| FilterDepthFlagShorthand = "d" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need to filter out built packages by default, otherwise they are discovered, and it is confusing to see them listed as their versions:
$ elastic-package filter -d 10 --packages apache
2025/11/12 21:31:32 INFO Found 3 matching package(s)
["999.999.999","apache","apache_basic_license"]
999.999.999 there is a built package under ./build/packages/apache/999.999.999/.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I came across the same issue, so I've added a --exclude-dirs flag that can be use to explicitly exclude directories.
Also, .git is excluded by default.
❯ go run main.go filter -d 10 --packages apache --exclude-dirs build
2025/11/13 09:24:18 INFO Found 2 matching package(s)
["apache","apache_basic_license"]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe build could be also excluded by default as I don't think it is ever intended to search into it.
I would also prefer to have this format by default, and make json an optional alternative format. |
- add build to exclude list - default output to nsv. also allow json and yaml
|
Updated the code to support multiple output formats. - Newline separated list ( default )
- JSON
- YAML |
💚 Build Succeeded
History
|
Elastic-package has been extensively used for integrations related tasks. However, most of the commands now are targeted to be run on one package at a time. We don’t have an option if we want to repeat a certain operation across multiple integrations.
This pull request adds two subcommands to the
elastic-packageto allow bulk operations from the elastic-package.filterforeachNote: Both commands are expected to be run from the integration repository.
Filter subcommand
Filter command adds the ability to filter and return a list of integrations based on specified criteria..
Available Filters:
integration,input)You can chain multiple filters and each filter can have multiple comma-separated values
Matching:
All filters must match.
At least one of the values match
The current elastic-package and spec does not enforce the same package_name and directory name in the repo. Which leads to some integrations having different package_name and directory name.
The filter command by default returns the name of the directory in the integration repo. However, it also provides a flag
--output-package-name/-pto make it return the package name.elastic-package filter --input tcp,udp --code-owner elastic/integration-experience --package-name cisco_*Foreach subcommand
The foreach command leverages the filter registry. Therefore all the flags available in the filter are directly available to foreach commands without any code changes.
Additionally, foreach has 1 flag
--parallelwhich allows the user to run commands parallelly using worker pool.The
elastic-packagecommand you want to run goes after -- with all of its flags.elastic-package foreach --input tcp,udp --code-owner elastic/integration-experience --parallel 5 -- test pipeline –generateFile changes:
internal/packages/packages.go: Added function to find the integrations repo root dir and read all manifests.cmd/filter.go: Filter command implementationcmd/foreach.go: Foreach command implementationinternal/filter/*: Filter interface and implementation for each filter flag.Related Issues
AI Tools used
CursorWithClaude-4.5-Sonnet