-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add commit message linter #12
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, will be interesting to try this out. Few thoughts/suggestions but otherwise looks good to me.
Role: openai.ChatMessageRoleSystem, | ||
Content: "Here is the commit message to lint:\n" + commitMessage, | ||
}) | ||
return resp, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we include past commit titles here as well, like in BuildPrompt
, for reference of the style? Ideally a solid set of rules is probably better than what was written in the past, so this is just a thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. Theoretically, the style guide rules should be sufficient. I skipped them to significantly reduce the number of $ tokens in the prompt.
Do you think we should experiment with previous commit messages?
Looking at the example output you posted, I flagged few things:
|
What you're saying is accurate, but it seems to be a matter of visualization. For instance:
The most accurate answer from Chat would be "N/A", but currently we default to ✅
Ha! Do you suggest we should introduce |
I agree that N/A is better, but would also like to point out that as per conventional commits:
It's not correct to say either capitalize or don't without having a reference to other commits. Not sure if or how we should address this though but may in fact be worth making it N/A as the output is just confusing.
Heh, no let's not. Perhaps we can just add to the prompt for linting:
|
@mafredri I rephrased prompts to verify if linting rules are applicable. This is the result running for ../aicommit/bin/aicommit --lint "fix: make GetWorkspacesEligibleForTransition return even less false positives" -m gpt-4o
❌ Limit the subject line to 50 characters.
✅ Use the imperative mood in the subject line.
✅ Capitalize the subject line such as "Fix Issue 886" and don't end it with a period.
✅ The subject line should summarize the main change concisely.
🤫 Only include a body if absolutely necessary for complex changes.
🤫 If a body is needed, separate it from the subject with a blank line.
🤫 Wrap the body at 72 characters.
🤫 In the body, explain the why, not the what (the diff shows the what).
🤫 Use bullet points in the body only for truly distinct changes.
🤫 Be extremely concise. Assume the reader can understand the diff.
🤫 Never repeat information between the subject and body.
🤫 Do not repeat commit messages from previous commits.
✅ Prioritize clarity and brevity over completeness.
✅ Adhere to the repository's commit style if it exists.
suggestion: fix: optimize GetWorkspacesEligibleForTransition accuracy The biggest concern is still response stability, but I'm afraid this is something we have to deal with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice improvement 👍🏻. I second your worry about stable output, and I also worry about the amount of rules we're applying here. I think the only way this can work to a satisfactory degree is to either include commit context/history when linting or to explicitly mark which rules apply to linting and which to auto commits. Wdyt?
prompt.go
Outdated
Content: strings.Join([]string{ | ||
"You are `aicommit`, a tool designed to lint commit messages and generate a detailed linting report.", | ||
"You are operating in pull request (PR) title linting mode.", | ||
"In this mode, linting rules for commit subjects, bodies, or bullet points are not applicable.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it makes too many assumptions about the actual style-guide. It's based on this repository style guide but others may have completely different rules. Maybe we need a different way to signify what rules can be used for title linting?
Also, this rule is still applicable but disabled by this:
If a body is needed, separate it from the subject with a blank line.
If people need to tweak rules based on mode, we could allow a syntax like:
* Every commit title must contain an emoji [commit, lint]
* Title must reference changes [commit]
* EVERYTHING MUST BE IN CAPS (this rule applies to all modes)
* everything must be in lower case (this rule also applies to all modes?) [all]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
either include commit context/history when linting or to explicitly mark which rules apply to linting and which to auto commits.
Speaking of the first option (commit history), how will it affect the lining process? Should the tool use these commits to inform the linter that these messages are actually compliant with repo style guide?
Maybe we need a different way to signify what rules can be used for title linting?
I can try implementing this approach as well. Honestly, I can't predict what will be the outcome, it is a magical black box :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we're including Git history in the prompt now. It is hard to say whether it improved the overall linting process. Suggestions are usually fine, but linting is not super stable.
Can we prepare COMMITS.md
for coder/websocket and validate aicommits against it? We will need the style guide anyway. I can arrange this but I'm curious what you think, @mafredri.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks for trying that out. The stability seems like the biggest issue, and it means the linter can't be used for enforcement, unfortunately.
I tried copying in the conventional commits specification into coder/coder COMMITS.md and linting the sample title, but it keeps modifying the rules between invocations and also enabling/disabling rules arbitrarily as non-applicable. :/
It's even failing the first rule occasionally even though it starts with fix: ...
😔.
Here's also an example of a rule it decided to simplify on a whim (it did this for multiple rules):
✅ Rule 4: A scope MAY be provided after a type. A scope MUST consist of a noun describing a section of the codebase surrounded by parenthesis, e.g., fix(parser):
vs
✅ Rule 4: A scope MAY be provided after a type.
This is an interesting experiment, but I wonder if it would be most sensible to lint coder/websocket PR titles with a more traditional approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting experiment, but I wonder if it would be most sensible to lint coder/websocket PR titles with a more traditional approach.
I agree, it eliminates aicommit
from linting competition, at least today. Maybe there is a secret prompt than can ensure response stability, but so far I haven't discovered it yet. I think I'm going to pause the experiment now.
Fixes: #11
For reviewers: If you want to play with the tool you need to setup OpenAI key, and if you need one please let me know.
This PR introduces a commit message (or PR titles) linter according to repository style guide.
The biggest challenge is ensuring that the results are consistent. AI often draws incorrect conclusions and tends to struggle with maintaining precise line lengths. We might need to revise our COMMITS.md guidelines: instead of saying limit line length to 50 characters, we could try keep commit messages concise.
In a follow-up PR, we can introduce a custom output format, e.g. JSON struct.
Troubleshooting: