Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI: Support kafka trigger type #795

Merged
merged 7 commits into from
Oct 16, 2024
Merged

Conversation

midigofrank
Copy link
Contributor

@midigofrank midigofrank commented Oct 15, 2024

Short Description

This pull request adds kafka trigger type.

Fixes #796

Implementation Details

  • The structure maintains the structure as it is in lightning
  • There is a validation that uses regex to ensure that the host is in the form hostname:number

QA Notes

List any considerations/cases/advice for testing/QA here.

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

  • Code generation (copilot but not intellisense)
  • Learning or fact checking
  • Strategy / design
  • Optimisation / refactoring
  • Translation / spellchecking / doc gen
  • Other
  • I have not used AI

You can read more details in our Responsible AI Policy

@midigofrank midigofrank self-assigned this Oct 15, 2024
@midigofrank midigofrank marked this pull request as ready for review October 15, 2024 09:34
Copy link
Collaborator

@josephjclark josephjclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @midigofrank, I'm recommending a couple of refactors because some of the code is getting stretched very thin here.

Is there anyway we can add a test on this? We should be able to at least add unit tests on transformSpecKafkaHost

@@ -130,6 +149,16 @@ function mergeTriggers(
...(specTrigger.type === 'cron'
? { cron_expression: specTrigger.cron_expression }
: {}),
...(specTrigger.type === 'kafka'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting way to complicated and I'm going to insist on a refactor here :(

I've just scribbled this and haven't tested it, but something along these lines would be a lot more readable and scalable:

 if (specTrigger && !stateTrigger) {
    const trigger = {
      id: crypto.randomUUID(),
      ...pickKeys(specTrigger, ['type', 'enabled']),
    };
    
    if (specTrigger.type === 'cron') {
      trigger.cron_expression = specTrigger.cron_expression
    } else if (specTrigger.type === 'kafka') {
      trigger.kafka_configuration = {
        ...specTrigger.kafka_configuration,
        hosts: transformSpecKafkaHost(
          specTrigger.kafka_configuration?.hosts ?? []
        ),
      },
    }

    return [triggerKey, trigger]
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah I agree.

@@ -149,6 +178,16 @@ function mergeTriggers(
...(specTrigger!.type === 'cron'
? { cron_expression: specTrigger!.cron_expression }
: {}),
...(specTrigger!.type === 'kafka'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same again - this block needs to be refactored for readability

kafka_configuration: {
...specTrigger!.kafka_configuration,
hosts: transformSpecKafkaHost(
specTrigger!.kafka_configuration?.hosts ?? []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would default the argument of transformSpecKafkaHost inside the function declaration, rather than passing an empty array here:

function transformSpecKafkaHost(hosts: string[] = []): StateKafkaHost[] { ... }

@@ -30,6 +31,24 @@ function stringifyJobBody(body: SpecJobBody): string {
}
}

function transformSpecKafkaHost(hosts: string[]): StateKafkaHost[] {
function isValidHost(value: string): boolean {
const regex = /^([a-zA-Z0-9.-]+):(\d+)$/;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by . and - at the end of the host regex here?

. means any character, so it kind of means the first pad means "match any character" and renders the rest obsolete.

I'm not sure why we need to support -

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey was this an AI generated regex by any chance?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another pattern I see for this sort of thing is to use the URL object to parse a url string, and then test for host and port key.

I'm not convinced this approach is better than regex, but it's definitely better than a bad regex 😅

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh and maybe a dumb question - I'm nitpicking and anyway I don't know much about kafka... but does it definitely need a port in the URL? It can't run off a default port?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it is AI generated 😅 . I did a couple of tests and it was able to capture all the bad cases I could think of.

I tried using the URL object but it proved troublesome because the protocol is missing from the required input. new URL('localhost:9092') parses localhost as the protocol. And yes, you MUST specify the port when connecting to a kafka cluster

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see if I can simplify this. Maybe we don't even need a regex, the intention is to make sure that there is a hostname:port_number in the string. AI decided to take it a notch higher by trying to ensure that the hostname is a valid ip address

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, then probably the AI wanted to escape the .

Good practice to turn those tests into unit tests - if only against the regex itself - so that readers can see what sort of patterns work (and maybe don't work!)

I think a simple regex or new URL is totally fine, so long as there are a few unit tests. Don't need to spend too long on it!

@josephjclark josephjclark merged commit 591bcc8 into main Oct 16, 2024
6 checks passed
@josephjclark josephjclark deleted the cli-support-kafka-triggers branch October 16, 2024 11:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

CLI: Support kafka trigger type
2 participants