schema: Support more checkout data in v4.5 #87

Merged
merged 2 commits into from
Aug 29, 2024

Conversation

spbnick
Collaborator

@spbnick spbnick commented Aug 21, 2024

Add support for three more checkout fields: `git_commit_tags`, `git_commit_message`, and `git_repository_branch_tip`.

`git_commit_tags` is an array of strings representing the annotated tags pointing directly at the commit being checked out, as seen in the source repository, i.e. the output of `git tag --points-at <commit>`. Set it to an empty array if the commit has no tags.
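A minimal sketch of how a submitter might fill in `git_commit_tags`, assuming a local clone of the source repository is available. The helper names `parse_tag_list` and `git_commit_tags`, and the `repo_dir` parameter, are hypothetical and not part of this PR.

```python
import subprocess


def parse_tag_list(output):
    """Turn the output of `git tag --points-at` into a list of tag names.

    Empty output (no tags) yields an empty list, matching the
    "empty array" convention described above.
    """
    return [line.strip() for line in output.splitlines() if line.strip()]


def git_commit_tags(commit, repo_dir="."):
    """Hypothetical helper: list the tags pointing at `commit` in `repo_dir`."""
    result = subprocess.run(
        ["git", "tag", "--points-at", commit],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_tag_list(result.stdout)
```

The parsing is kept separate from the `git` call so it can be checked without a repository at hand.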

`git_commit_message` holds the complete message of the commit being checked out, both subject and body, i.e. the output of `git show -s --format=%B`. The subject and the body are kept together because the subject is easy to extract in SQL, while full-text search is easier and more efficient over a single column.
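To illustrate why a single field is enough, here is a rough sketch of recovering the subject from a stored `git_commit_message`. It assumes the subject is simply the first line of the message, which is a simplification, and the function name `commit_subject` is hypothetical.

```python
def commit_subject(message):
    """Return the first line of a full commit message, or "" if it is empty.

    Assumption: the subject is the first line only; multi-line subjects
    are not handled here.
    """
    lines = message.splitlines()
    return lines[0] if lines else ""


# Example message in the shape produced by `git show -s --format=%B`.
message = (
    "schema: Support more checkout data in v4.5\n"
    "\n"
    "Add support for three more checkout fields.\n"
)
subject = commit_subject(message)
```

The same first-line split can be done with string functions on the database side, so consumers do not need a separate subject field.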

Finally, `git_repository_branch_tip` is a boolean flag that should be set to `true` when the commit being checked out is at the tip of the branch at the moment of the checkout (as specified in `start_time`). In practice, if you only ever test the tip of the branch, you can set it to `true` unconditionally. The flag lets us select the checkouts that represent the branch state over time and produce a rough history of branch changes, which we can then use for (regression) analysis and graphs, in lieu of actual commit graph walking.
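The rough branch history described above could be derived along these lines. This is only a sketch under stated assumptions: checkouts are plain dictionaries, `start_time` is an ISO 8601 string with a UTC offset, and the `git_repository_branch` and `git_commit_hash` field names are assumed to be present alongside the new flag. The function name `branch_tip_history` and the sample data are made up for illustration.

```python
from datetime import datetime


def branch_tip_history(checkouts, branch):
    """Return commit hashes seen at the tip of `branch`, oldest first.

    Only checkouts flagged with git_repository_branch_tip are used, and
    consecutive repeats of the same commit are collapsed into one entry.
    """
    tips = [
        c for c in checkouts
        if c.get("git_repository_branch_tip") is True
        and c.get("git_repository_branch") == branch
        and "start_time" in c
    ]
    # Order by checkout time, not by commit ancestry (no graph walking).
    tips.sort(key=lambda c: datetime.fromisoformat(c["start_time"]))
    history = []
    for c in tips:
        commit = c["git_commit_hash"]
        if not history or history[-1] != commit:
            history.append(commit)
    return history


# Made-up sample data, for illustration only.
sample = [
    {"git_repository_branch": "master", "git_repository_branch_tip": True,
     "start_time": "2024-08-21T10:00:00+00:00", "git_commit_hash": "aaa"},
    {"git_repository_branch": "master", "git_repository_branch_tip": False,
     "start_time": "2024-08-21T11:00:00+00:00", "git_commit_hash": "old"},
    {"git_repository_branch": "master", "git_repository_branch_tip": True,
     "start_time": "2024-08-21T12:00:00+00:00", "git_commit_hash": "bbb"},
    {"git_repository_branch": "master", "git_repository_branch_tip": True,
     "start_time": "2024-08-21T09:00:00+00:00", "git_commit_hash": "aaa"},
]
history = branch_tip_history(sample, "master")
```

Checkouts without the flag, or with it set to `false`, are left out, so bisection runs and tests of older commits do not distort the history.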

@spbnick
Collaborator Author

spbnick commented Aug 21, 2024

Do not merge until the schema changes are ratified by submitters.

@spbnick
Collaborator Author

spbnick commented Aug 29, 2024

No objections received for this schema change.

@spbnick spbnick merged commit 21ddf85 into main Aug 29, 2024
5 checks passed
@spbnick spbnick deleted the support_more_checkout_data branch August 29, 2024 14:46