Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aichat: code formatting #21342

Merged
merged 1 commit into from
Jan 17, 2024
Merged

aichat: code formatting #21342

merged 1 commit into from
Jan 17, 2024

Conversation

nullhook
Copy link
Contributor

@nullhook nullhook commented Dec 12, 2023

Resolves brave/brave-browser#33668

this PR adds formatted code UI rendering. it uses a lightweight component where syntax highlighted is limited for selected languages only.

production size:

790.62 KB  chat_ui.bundle.js // without highlighter
792.41 KB chat_ui.bundle.js // with lazy loaded highlighter
910.92 KB  chat_ui.bundle.js // with highlighter

dev size:

5.4 MB  chat_ui.bundle.js // without highlighter
5.7 MB  chat_ui.bundle.js // with highlighter
image

Submitter Checklist:

  • I confirm that no security/privacy review is needed and no other type of reviews are needed, or that I have requested them
  • There is a ticket for my issue
  • Used Github auto-closing keywords in the PR description above
  • Wrote a good PR/commit description
  • Squashed any review feedback or "fixup" commits before merge, so that history is a record of what happened in the repo, not your PR
  • Added appropriate labels (QA/Yes or QA/No; release-notes/include or release-notes/exclude; OS/...) to the associated issue
  • Checked the PR locally:
    • npm run test -- brave_browser_tests, npm run test -- brave_unit_tests wiki
    • npm run lint, npm run presubmit wiki, npm run gn_check, npm run tslint
  • Ran git rebase master (if needed)

Reviewer Checklist:

  • A security review is not needed, or a link to one is included in the PR description
  • New files have MPL-2.0 license header
  • Adequate test coverage exists to prevent regressions
  • Major classes, functions and non-trivial code blocks are well-commented
  • Changes in component dependencies are properly reflected in gn
  • Code follows the style guide
  • Test plan is specified in PR before merging

After-merge Checklist:

Test Plan:

@github-actions github-actions bot added the CI/storybook-url Deploy storybook and provide a unique URL for each build label Dec 12, 2023
@brave-builds
Copy link
Collaborator

A Storybook has been deployed to preview UI for the latest push

<div
className={styles.message}
>
<div dangerouslySetInnerHTML={{ __html: escapeHTMLPolicy.createHTML(turn.text
Copy link
Member

@petemill petemill Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get around the need to use innerHTML. One option would be to use your regex here to build an array of matches and then render everything in a loop

  • regular text block in between a match
  • code block text wrapped with special element or class

And for each regular block you could use the same pattern for any inline code block matches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, changed. i'm building the string into parts and wrapping relevant html around it. this works, nicely.

however, i'm curious how expensive using regex is here

@@ -53,6 +53,8 @@ The current time and date is <ph name="DATE">$1</ph>
Your name is Leo, a helpful, respectful and honest AI assistant created by the company Brave. You will be replying to a user of the Brave browser. Always respond in a neutral tone. Be polite and courteous. Answer concisely in no more than 50-80 words.

Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

If you are asked to write code or json, then use this format: ```language```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like "Use unicode symbols for formatting where appropriate. Use backticks (`) to wrap inline coding-related words and triple backticks (```) to wrap blocks of code or data." cc @LorenzoMinto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@brave-builds
Copy link
Collaborator

A Storybook has been deployed to preview UI for the latest push

@nullhook nullhook marked this pull request as ready for review December 19, 2023 20:21
@nullhook nullhook requested a review from a team as a code owner December 19, 2023 20:21
@nullhook nullhook changed the title aichat: code layout aichat: code formatting Dec 19, 2023
@brave-builds
Copy link
Collaborator

A Storybook has been deployed to preview UI for the latest push

Copy link
Collaborator

@mkarolin mkarolin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strings++

@nullhook nullhook requested a review from a team as a code owner December 20, 2023 00:13
Copy link

New dependencies detected. Learn more about Socket for GitHub ↗︎

Packages Version New capabilities Transitives Size Publisher
@types/react-syntax-highlighter 15.5.11 None +0 195 kB types
react-syntax-highlighter 15.5.0 filesystem +20 9.48 MB simmerer

@diracdeltas diracdeltas added the CI/run-audit-deps Check for known npm/cargo vulnerabilities (audit_deps) label Dec 20, 2023
@diracdeltas
Copy link
Member

@brave/devops-team why was audit-deps not automatically tagged on this PR?

@wknapik
Copy link
Collaborator

wknapik commented Dec 20, 2023

@diracdeltas the label is added automatically based on the files touched by the PR. The files modified here didn't match any of the existing patterns.

@brave-builds
Copy link
Collaborator

A Storybook has been deployed to preview UI for the latest push

@diracdeltas
Copy link
Member

@diracdeltas the label is added automatically based on the files touched by the PR. The files modified here didn't match any of the existing patterns.

i don't understand, since this PR does modify package.json and package-lock.json

@wknapik
Copy link
Collaborator

wknapik commented Dec 20, 2023

@diracdeltas the label is added automatically based on the files touched by the PR. The files modified here didn't match any of the existing patterns.

i don't understand, since this PR does modify package.json and package-lock.json

Hmm, I thought the commit that touched package*.json happened after you asked the question, but actually it looks like it happened 7min before.

I think we need to set up this github action differently. It currently doesn't trigger on all the relevant events. I'll look into it.

@wknapik wknapik mentioned this pull request Dec 20, 2023
25 tasks
@wknapik wknapik reopened this Dec 20, 2023
@wknapik
Copy link
Collaborator

wknapik commented Dec 20, 2023

@diracdeltas should be fixed

@wknapik wknapik removed their assignment Dec 20, 2023
@brave-builds
Copy link
Collaborator

A Storybook has been deployed to preview UI for the latest push

@diracdeltas
Copy link
Member

Since this adds new deps I'm tagging @bcaller to do sec review here when he's back.

function buildMessage(str: string) {
const parts: any = []

for (const match of str.matchAll(/```([^\n`]+)?\n?([\s\S]*?)```|`(.*?)`|([^`]+)/gs)) {
Copy link
Contributor

@bcaller bcaller Jan 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this will start/end fenced code blocks even when the backticks don't start at the beginning of a line (not necessarily an issue). Overall, the regex seems OK.

Also note that LLMs are happy writing broken Markdown. We had some issues with Brave Search CodeLLM.

image

@bcaller
Copy link
Contributor

bcaller commented Jan 5, 2024

Since this adds new deps I'm tagging @bcaller to do sec review here when he's back.

No issue with adding the dependency https://github.com/react-syntax-highlighter/react-syntax-highlighter

@brave-builds
Copy link
Collaborator

A Storybook has been deployed to preview UI for the latest push

function buildMessage(str: string) {
const parts: any = []

for (const match of str.matchAll(/```([^\n`]+)?\n?([\s\S]*?)```|`(.*?)`|([^`]+)/gs)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it more optimal to not create this regex every time the function is run? i.e. put it at the module level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

<div
className={styles.message}
>
{buildMessage(turn.text)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the amount of time this gets re-rendered, should we memoize the result of buildMessage, dependent on turn.text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memoized

@brave-builds
Copy link
Collaborator

A Storybook has been deployed to preview UI for the latest push

Copy link
Member

@petemill petemill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you fix the minor issues that are left, then this looks good to go from me!

text: string
}

function FormattedTextRenderer(props: FormattedTextProps): JSX.Element {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Well, now that this is so simple, you could more simply just wrap this function in React.memo

const SUGGESTION_STATUS_SHOW_BUTTON: mojom.SuggestionGenerationStatus[] = [
mojom.SuggestionGenerationStatus.CanGenerate,
mojom.SuggestionGenerationStatus.IsGenerating
]

const regexp = /```([^\n`]+)?\n?([\s\S]*?)```|`(.*?)`|([^`]+)/gs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this name is just the type - how about something more meaningful, like codeFormatRegexp?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also could you add a comment explaining what the regex is attempting to achieve - something about splitting the string in to groups by backtick sequences (if I'm correct about that!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed and added comment

package.json Outdated
@@ -90,6 +90,7 @@
"@types/react": "17.0.2",
"@types/react-dom": "17.0.18",
"@types/react-redux": "7.1.25",
"@types/react-syntax-highlighter": "^15.5.11",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to be exact version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

package.json Outdated
@@ -190,6 +191,7 @@
"react-json-view-lite": "^0.9.5",
"react-router": "^5.2.1",
"react-router-dom": "^5.3.0",
"react-syntax-highlighter": "^15.5.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to be exact version. I know some others here aren't, but it should be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@petemill
Copy link
Member

SonarCloud is wondering about this

Make sure the regex used here, which is vulnerable to super-linear runtime due to backtracking, cannot lead to denial of service.

image

@petemill
Copy link
Member

I restarted android

@brave-builds
Copy link
Collaborator

A Storybook has been deployed to preview UI for the latest push

@nullhook nullhook merged commit b5f4ad0 into master Jan 17, 2024
18 checks passed
@nullhook nullhook deleted the ai-chat-code-formatting branch January 17, 2024 17:38
@github-actions github-actions bot added this to the 1.63.x - Nightly milestone Jan 17, 2024
@bcaller
Copy link
Contributor

bcaller commented Jan 18, 2024

SonarCloud is wondering about this

Make sure the regex used here, which is vulnerable to super-linear runtime due to backtracking, cannot lead to denial of service.

The section should be approximately quadratic:

([^\n`]+)?\n?([\s\S]*?)```

because a long string of 'A's which doesn't end in three backticks will match both the first [^\n`]+ and the [\s\S]* (and the \n? in the middle).

However, causing any slow down would require the AI to produce such string with a length > 10000 I think and then it would only be client-side and for a second or two. Double the length would quadruple the time taken to process but I'd hope the AI rarely goes that crazy.

@stephendonner
Copy link
Contributor

stephendonner commented Jan 19, 2024

Verified PASSED using

Brave | 1.64.3 Chromium: 121.0.6167.57 (Official Build) nightly (64-bit)
-- | --
Revision | 0ecd8a5470d06ce96921cb7585ecbc0f9878f3cf
OS | Windows 10 Version 22H2 (Build 19045.3930)

Steps:

  1. installed 1.64.3
  2. launched Brave using --env-leo=staging --env-ai-chat.bsg=dev --env-ai-chat-premium.bsg=dev --enable-logging=stderr --v=2
  3. loaded cnn.com (not necessary)
  4. clicked Summarize this page
  5. asked a slight variation of "write me a "Hello, world!" program using/in {language}`

Confirmed output was language-appropriate, in a code block, and accompanied by a copy code button

Python

Sample code:

print("Hello, world!")

image

C++

Sample code:

#include <iostream>

int main() {
  std::cout << "Hello, world!" << std::endl;
  return 0;
}

image

.NET

image

JavaScript

image

kjozwiak pushed a commit that referenced this pull request Jan 19, 2024
* aichat: input is growable (#21124)

* aichat: scroll is interruptable (#21235)

* aichat: model maker text shouldnt look like a link (#21220)

* aichat: code formatting (#21342)

* make claude output formatted code (#21599)
kjozwiak pushed a commit that referenced this pull request Jan 20, 2024
* Merge pull request #21134 from brave/cr121

Upgrade from Chromium 120 to Chromium 121.

* Remove the assert for patch_ffmpeg.py (#21184)

* Merge pull request #21539 from brave/ffmpeg-dynamic-alloc

Use dynamic allocation for ffmpeg fft tables on Windows.

* Merge pull request #21585 from brave/issues/35318

Remove dynamic allocation of ffmpeg ff_tx tables.

* Disables a flaky upstream browser test.

* Merge pull request #21584 from brave/fix_new_tab_button_plus_misaligned

Fixed new tab button's plus icon is mis-aligned with horizontal tab

* Merge pull request #21600 from brave/121.0.6167.75_master

Upgrade from Chromium 121.0.6167.57 to Chromium 121.0.6167.75.

* Merge pull request #21628 from brave/maxk-disable-reading-mode

Hides `Open in Reading Mode` context menu item.

* [Uplift 1.62.x] AI chat issues cr121 1.62.x (#21629)

* aichat: input is growable (#21124)

* aichat: scroll is interruptable (#21235)

* aichat: model maker text shouldnt look like a link (#21220)

* aichat: code formatting (#21342)

* make claude output formatted code (#21599)

---------

Co-authored-by: Mikhail <[email protected]>
Co-authored-by: Aleksey Khoroshilov <[email protected]>
Co-authored-by: Simon Hong <[email protected]>
Co-authored-by: taher <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI/run-audit-deps Check for known npm/cargo vulnerabilities (audit_deps) CI/storybook-url Deploy storybook and provide a unique URL for each build needs-security-review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Code formatting for Leo responses
9 participants