Things go wonky when HTML is rendered. Escape < and > #22

Open
expecttheunusual opened this issue Mar 31, 2023 · 2 comments
Comments

@expecttheunusual

Title.

Ask it to do anything in HTML and the thing shuffles in madness. Will you fix it so the output and input are properly escaped?

Thanks

@josephrocca
Owner

Hmm, this may be hard to get around depending on specifically what you're referring to - any chance you could record a video, or give an example prompt? I did just fix a problem with code blocks - so they display more consistently/nicely during streaming. I may be able to do something similar with what you're talking about.

@kickahaota

This seemed interesting, so I had a look.

Normally, when you ask the LLM to write HTML for you, it will enclose the result in a markdown code fence tagged `html`, like this:

[USER]: Provide HTML for a simple web page which displays "Sample Page" as its title and "Hello World" as its body text.

[AI]: Sure! Here's a simple HTML document that displays "Sample Page" as its title and "Hello World" as its body text:

```html
<!DOCTYPE html>
<html>
<head>
  <title>Sample Page</title>
</head>
<body>
  <h1>Hello World</h1>
</body>
</html>
```

That causes the HTML to be shown very nicely in the response:

<!DOCTYPE html>
<html>
<head>
  <title>Sample Page</title>
</head>
<body>
  <h1>Hello World</h1>
</body>
</html>

But if the LLM uses HTML in a response without marking it in this way, then the HTML gets rendered as if it were part of your UI, which naturally messes things up. I had to try very, very hard to get the LLM to make this mistake; but it can also happen if someone imports a conversation that uses the less-than and greater-than characters without markdown.

In most web projects that use arbitrary text, the right thing to do is to sanitize the text by replacing < with &lt;, & with &amp;, and so on. But it's trickier here, because you don't want to escape text that's already wrapped in a markdown code block; that text already displays correctly. So if this were my project, I'd be inclined to won't-fix it.
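For what it's worth, here is a rough sketch of the "escape everything except fenced code" idea, in case it helps weigh the trade-off. It is not taken from this project's code: the function name `escape_outside_fences` and the fence regex are my own assumptions, and it is written in Python only to keep the example short. It handles complete triple-backtick fences only; inline code spans and a fence that is still open mid-stream would need extra handling.

```python
import html
import re

# Hypothetical pattern: a complete fenced block, from an opening ``` line
# (with an optional language tag) through the next closing ```.
FENCE_RE = re.compile(r"(```[^\n]*\n.*?```)", re.DOTALL)


def escape_outside_fences(text: str) -> str:
    """Escape &, < and > everywhere except inside complete fenced code blocks."""
    parts = FENCE_RE.split(text)
    # re.split with one capturing group puts the fenced blocks at odd indices
    # and the surrounding prose at even indices.
    return "".join(
        part if i % 2 == 1 else html.escape(part, quote=False)
        for i, part in enumerate(parts)
    )


if __name__ == "__main__":
    message = "Use <b>bold</b> & more\n```html\n<h1>Hi</h1>\n```\ndone <i>"
    print(escape_outside_fences(message))
```

The fenced block passes through untouched so the markdown renderer can keep displaying it as before, while stray tags in the surrounding prose become inert text. The unclosed-fence case during streaming is exactly where this gets awkward, which is part of why the fix is less simple than it sounds.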
