chore: Improve prompt #179
Conversation
@gentlementlegen here is the new fix.
In my opinion, the system message should contain only the instruction, and the comments should be sent as the query. Right now there is no query being sent.
Please check out how Ground Truths passes in the repo deps as a stringified JSON object as the query, with the system message being the instruction and example. We should probably do the same here with comments.
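As a rough illustration of that split, here is a minimal sketch assuming the plugin uses the OpenAI chat completions client (the function name and model are placeholders, not the plugin's actual code):

```typescript
import OpenAI from "openai";

const openai = new OpenAI();

// Sketch only: the system message carries just the instruction, while the
// comments travel as a stringified JSON object in the user message (the query).
async function evaluateComments(instruction: string, comments: Record<string, string>) {
  const response = await openai.chat.completions.create({
    model: "gpt-4o", // placeholder model name
    messages: [
      { role: "system", content: instruction },
      { role: "user", content: JSON.stringify(comments) },
    ],
  });
  return response.choices[0].message.content;
}
```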
@jaykayudo, this task has been idle for a while. Please provide an update.
Prompt
Issue: In the v1 of the Ubiquibot, when a result gets evaluated, a recap is posted to the related issue with a summary and details of the rewards, as well as a link to the rewards themselves. Example: ubiquity/cloudflare-deploy-action#9 (comment). The same logic should be applied in the v2 version by creating a new Module responsible for posting that comment. This module will receive a similar input to the one mentioned here. The module should be:
All comments:
```ts
interface PluginOutput {
  comment: string; // html comment
  rewards: Rewards; // { "whilefoo": "500", "token": "0x0" } etc
}
```"
2033404518 - gentlementlegen: "This needs https://github.com/ubiquibot/conversation-rewards/pull/7 to be merged first. Also probably needs https://github.com/ubiquibot/permit-generation/pull/2 to be able to generate the permits properly."
2033488255 - 0x4007: "> This needs https://github.com/ubiquibot/conversation-rewards/pull/7 to be merged first. Also probably needs https://github.com/ubiquibot/permit-generation/pull/2 to be able to generate the permits properly.
I think you should fork from and overtake that second pull due to us being behind schedule "
2035427134 - whilefoo: "> @whilefoo rfc on how we can deal with comment outputs. Perhaps we can have a standard recognized property on the output interface? Then the kernel can decide whether to pass it around or something.
>
> ```ts
> interface PluginOutput {
> comment: string; // html comment
> rewards: Rewards; // { "whilefoo": "500", "token": "0x0" } etc
> }
> ```
there are a couple of options:
1. we let the conversation-rewards plugin generate and post the comment
2. we put comment as output and then another module is responsible for posting it or let conversation-rewards generate rewards and permit-generation generate permits and a third module that uses output from previous plugins to make a comment and post it
3. we let the conversation-rewards plugin generate the comment and pass it as a standard property like you suggested.
In theory option 2 sounds good to separate concerns, but it's another plugin, which means another call to GitHub Actions and thus more latency, so for the sake of speed I'd go with option 1 or 3; but going with these two options would mean there will be 1 comment for the rewards summary and 1 comment for permits.
I'm not sure if option 3 is any better than option 1 because the plugin already has a token that has permissions to post comments so passing it to the kernel doesn't make much difference.
"
2036174312 - gentlementlegen: "To me 1 is the most straightforward to do for a few reasons:
- the comment reward plugin has all the needed data already
- it can import the https://github.com/ubiquibot/permit-generation to generate permits itself
- if it is done this way it can be used as a complete standalone without the kernel
Option 3 might make more sense in terms of architecture, however. In such a case the kernel should pass down results. It is more of an architecture question. Although, if we ever have other plugins in the flow that influence the total incentives, it would make sense to go through the kernel to aggregate the total result."
2036355445 - 0x4007: ">
> there are a couple of options:
>
> 1. we let the conversation-rewards plugin generate and post the comment
I think the most pure architecture would be that plugins can NOT inherit (write) authentication (only read, if possible) from the kernel. As a consequence, no plugin should be able to post directly to any issue. Ideally only the kernel should have a direct interface to issues. Plugins should just output comment HTML, and the kernel can post it all in a single comment at the end of the webhook event invocation chain.
> 2. we put comment as output and then another module is responsible for posting it or let conversation-rewards generate rewards and permit-generation generate permits and a third module that uses output from previous plugins to make a comment and post it
Could be interesting to have a dedicated plugin to handle commenting, but because this seems like such an essential capability, I would more carefully consider the pros/cons of including this within the kernel.
- On one hand, I really like the idea of making the kernel as pure and lean as possible
- On the other hand there are practical considerations that if every plugin standard output interface includes `comment:` and then we don't have the `comment` capability available (the comment plugin is not included in the config) then why is this a standard output property required on every plugin?
> 3. we let the conversation-rewards plugin generate the comment and pass it as a standard property like you suggested.
>
> In theory option 2 sounds good to separate concerns, but it's another plugin, which means another call to GitHub Actions and thus more latency, so for the sake of speed I'd go with option 1 or 3; but going with these two options would mean there will be 1 comment for the rewards summary and 1 comment for permits. I'm not sure if option 3 is any better than option 1 because the plugin already has a token that has permissions to post comments, so passing it to the kernel doesn't make much difference.
I'm not concerned about latency now. Besides we can port any plugin to Cloudflare Workers (or combine several inside a single Worker) when we are ready to fix latency issues. To be honest though, except for setting labels, I have no problem with latency (like with permit generation) so there's no need to overengineer yet. I also love the GitHub Actions logs being available for easy debugging, and the fact that they are super generous with the compute.
Given the entire conversation and all the considerations, I definitely think we should do option 3.
This separates concerns the best, which will allow our plugin ecosystem to grow the fastest. "
2036367126 - gentlementlegen: "I think each plugin should output JSON, not HTML, as HTML is not reliable to parse or manipulate and requires a `window` instance to be instantiated, which is annoying on `node`-based projects.
Having a plugin handling commenting seems quite weird as commenting is done by calling Octokit and the REST api which is already a library by itself, so no need to encapsulate it within another one to do the same thing.
My view on this is to finalize https://github.com/ubiquibot/permit-generation/issues/5 to import it within the `conversation-reward`, which will post the comment itself as well; otherwise the architecture will be quite convoluted, doing ping pong with everything."
2036370459 - 0x4007: "> I think each plugin should output JSON not html as it is not reliable to parse nor manipulate
I know JSON makes things more complicated than it needs to be with serialization/transport. Then we have problems like this https://github.com/ubiquibot/conversation-rewards/issues/18. If the comment property is ONLY meant to be used for GitHub comments, then transporting HTML only is the most sensible. Besides, there's no other metadata that's necessary within the `comment` property, which is the main point of sending a JSON object (it has multiple properties).
As a somewhat side note: there should also not be any variables inside of the HTML. We could look at it like server-side rendering (or in this case, plugin-side rendering): the plugin handles producing the final HTML output and then emits it as a single string of HTML entities.
> requires window instance to be instantiated which is annoying on node based projects.
We are currently using `mdast` in `@ubiquibot/comment-incentives` to generate virtual DOMs. No `window` needed.
> My view on this, is to finalize https://github.com/ubiquibot/permit-generation/issues/5 to import it within the conversation-reward that will post the comment itself as well, otherwise the architecture will be quite convoluted doing ping pong with everything.
If we agree that this is considered as technical debt, and that we are accruing this so that we can get back on schedule, ok."
2036385985 - gentlementlegen: "If you want to manipulate and convey data, HTML really is not made for this. If you want something formatted similarly but made for data we can use XML format.
The new comment reward actually does instantiate a DOM through [JSDOM](https://github.com/ubiquibot/conversation-rewards/blob/ba434761281446a23566cd02c68bd3b0e79d4eb1/src/parser/formatting-evaluator-module.ts#L80) to make things way simpler instead of using Regex everywhere which is highly unreliable. But there it makes sense because we are parsing comments from an HTML page content.
Biggest advantage from this is to have the comment reward fully standalone, while easy to integrate with the kernel.
If we do something that handles the comment, it means each and every module has to send it there, and that module would have to understand every different content / format we send, which would be much easier if the module itself handled its own comments, formatting-wise."
2036393020 - 0x4007: "> If you want to manipulate and convey data, HTML really is not made for this.
Going back to my "plugin-side-rendering" mention, the data manipulation happens inside of the logic of the plugin. Then when the plugin is finally done with all of its compute, it emits a single string in the `comment` output property. This string is the final, rendered HTML.
The `comment` output is not intended to be consumed by other plugins. In most cases, the kernel will concatenate all the `comment` outputs from every plugin and post a single comment at the end. Plugins will consume the `metadata` property output which will include raw values to work with. I forget where I originally proposed this, but imagine something like:
```typescript
type HtmlString = string;
type RequestedPermits = { username: string; amount: string; token: string; }[]
interface PluginOutput {
metadata: Record<string, any>;
comment: HtmlString;
rewards: RequestedPermits;
}
```"
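For illustration, a hypothetical sketch (assumed names, not taken from the thread) of the kernel-side concatenation described above, where each plugin emits pre-rendered HTML and the kernel posts one combined comment:

```typescript
type HtmlString = string;
type RequestedPermits = { username: string; amount: string; token: string }[];

interface PluginOutput {
  metadata: Record<string, any>;
  comment: HtmlString;
  rewards: RequestedPermits;
}

// The kernel appends each plugin's rendered HTML in invocation order and
// posts the result as a single comment at the end of the event chain.
function renderFinalComment(outputs: PluginOutput[]): HtmlString {
  return outputs.map((output) => output.comment).join("\n");
}
```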
2036411811 - gentlementlegen: "But then how do we consider the formatting of that output?
Practical case: we want to post a comment when a user queries a wallet. That comment is 'user name': 'wallet 0x0'
Kernel calls the comment plugin, saying that it wants a comment to be posted. Should the Kernel send the rendering it wants, should the comment plugin transform the data to HTML?
Then, comment-reward wants to post the results. Should ask the Kernel to call the comment plugin, but then formatting is different. Should the Kernel notify the comment plugin that it wants a different output formatting? Should the Kernel compute beforehand the HTML and send it to the comment plugin?
There might be something I don't grasp here, because I do understand your use case, but it seems to be very deterministic about what the purpose of the plugins is, which kinda defeats the purpose of having plugins; it looks more like a monolithic approach to me."
2036433646 - 0x4007: "> But then how do we consider the formatting of that output?
The proposed `comment` output is intended for ease of composability for MOST situations, basically that the output from one plugin would just be appended to the final rendered comment. For simple plugin configurations (plugins not modifying the output results of others) this seems like straightforward architecture.
However, as we know, there will be situations where a subsequent plugin will consider the results of a previous plugin, which means it would need to change the comment that is rendered.
In this situation the subsequent plugin should clobber the output of the previous plugin. It is now clear to me that this will be a new challenge to express to the kernel that it should ignore the comment output of a previous plugin. I suppose it would be straightforward in the metadata using the plugin ID i.e. `{ "silences": ["ubiquibot/conversation-rewards"] }`.
> Kernel calls the comment plugin,
I think that comments should be handled from within the kernel. There should not be a separate comment plugin. Read my explanation [here](https://github.com/ubiquibot/conversation-rewards/issues/5#issuecomment-2036355445).
### Inputs
For completeness of my [previous comment](https://github.com/ubiquibot/conversation-rewards/issues/5#issuecomment-2036393020):
```typescript
type PluginId = string; // i.e. `ubiquibot/conversation-rewards` `ubiquibot/assistive-pricing`
interface PluginInput {
metadata: Record<PluginId, any>;
context: GitHubEventContext;
}
```
Notice that we should not pass in the HTML of other plugins. Instead, just the metadata ("computed") values from the previous plugins."
Comments to evaluate: I understand your concern and I would need to put more thought into composability. Maybe I need help to think through how any partner in the future can create new plugins that modify
I don't like the idea of having a single monolithic rewards generation plugin that wraps all the ways possible to compute rewards (i.e. time estimation, comment rewards, review rewards etc)" - I see, so you're suggesting that we must annotate each comment as well in order to properly handle the scoring at the end? Off hand I think there's:
I suppose we need a better naming convention for the pull-related ones. They are considered separate entities according to the GitHub API. They require different methods to obtain both types. " -
I think this shall cover all cases." - Also I think you forgot about the "review comments"" - Instructions:
Notes:
Response (after 10 consecutive requests)
@sshivaditya2019 @gentlementlegen please decide. I think your testing module for this could be useful? I think it mostly looks fine at this point. Would be great to get some others to agree because it's a lot to wrangle.
@0x4007 I do not want to merge this without testing; it broke all the generation last time we did so. I want to see a real run first. @jaykayudo Why are the comments not sent in a JSON format anymore? It avoided any ambiguity about the comment content.
@jaykayudo, could you share some results for less obvious examples where the relevance is expected to be high? For instance, this Issue
@0x4007 suggested this new format.
@sshivaditya2019 Here is the result from the issue you provided. The last 7 comments were evaluated.
Prompt
Issue: The bot shows a warning message on tasks that were opened recently, as seen here. It should only display this warning above a certain threshold, which comes from the configuration. A possible cause would be that the value is missing in the current configuration. If that is the case, the default threshold should be set, and definitely above 0 days. Tasks to be carried out:
All comments: Maybe there is a difference on mobile devices if you are using one?" ubq-testing/command-start-stop#5 (comment)
" 1
2
3
---" If its only a problem on mobile then perhaps Yeah it's only mobile that
I agree it looks best but no it is somewhat different although personally I think Sounds like option 3 is preferred and that's how the code is right now so no changes needed in this regard" I also suggest that we change the configuration format to take a string representing a duration and not a number which is hard to understand and error prone."
@gentlementlegen you want to add it into the spec/title here and I'll handle both?" Comments to evaluate: I'm certain that we landed on sticking with Also I should be getting img credit. Seems like there's problems with quantitative analysis as well @gentlementlegen " - After reading the logs, it is possible that the allocated Instructions:
Notes:
Response:
{
"2285387841": 0.5,
"2285389471": 0.6,
"2308672581": 0.1,
"2308673448": 0.3,
"2308752680": 0.2,
"2308755339": 0.4,
"2243997569": 0.7
}
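A response in this shape maps each comment ID to a relevance score. As a minimal, hypothetical sketch (not the plugin's actual code), the result could be validated so that every evaluated comment receives a score:

```typescript
// Parse the "comment ID -> relevance" map and confirm no comment was skipped.
function parseRelevance(raw: string, expectedIds: string[]): Record<string, number> {
  const scores = JSON.parse(raw) as Record<string, number>;
  const missing = expectedIds.filter((id) => !(id in scores));
  if (missing.length > 0) {
    throw new Error(`Missing relevance scores for comments: ${missing.join(", ")}`);
  }
  return scores;
}
```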
@0x4007 Why removing the
@jaykayudo, this task has been idle for a while. Please provide an update.
@gentlementlegen @0x4007 does this PR meet the requirements to resolve the issue, or is there more to be done?
@jaykayudo please resolve the conflict. Also, I would like the reasoning for removing the JSON output @0x4007
@gentlementlegen I have resolved the conflict, and from the previous PR, this was the reason for removing the JSON comments.
${allCommentsMap.join("\n")}

Comments to evaluate:
${commentsMap.join("\n")}
My concern about the comments to evaluate is here. A line break seems like a weak way to separate comments, which can lead to the result not having the same length as the provided comments. Have you never had such an issue?
The line break basically has no effect on separating the comments: the LLM is intelligent enough to tell the start of the comment from the end of the comment. I once thought it might be an issue as well, but surprisingly, after testing a lot of prompts, there was no issue at all. You can look at the prompts posted above for better clarification. If you want it reverted to JSON notation, no problem.
JSON is extremely redundant compared to plaintext/markdown. For humans and LLMs, JSON is not the most efficient way to express (most) ideas.
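For comparison, a small hypothetical example of the two serializations under discussion, newline-joined plaintext versus a JSON map keyed by comment ID (sample bodies are made up):

```typescript
const comments = [
  { id: "2285387841", body: "First example comment" },
  { id: "2285389471", body: "Second example comment" },
];

// Plaintext: compact, but relies on the model to infer comment boundaries.
const plaintextBlock = comments.map((c) => `${c.id} - ${c.body}`).join("\n");

// JSON: more verbose, but boundaries and IDs are unambiguous.
const jsonBlock = JSON.stringify(
  Object.fromEntries(comments.map((c) => [c.id, c.body])),
  null,
  2
);
```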
Then I suppose this can be merged and tested, and reverted like last time if anything goes wrong. @jaykayudo please merge the
Done
Merged 17b0cdf into ubiquity-os-marketplace:development
Resolves #147