(WIP) [WE-937] Pasting results in \n's sometimes #3

Draft: wants to merge 2 commits into base: main

@smurflo2 smurflo2 commented May 5, 2022

The Problem

When you copy text from certain websites (e.g. Google Docs, Gmail) and paste it in, the text looks fine in the text editor but shows up with formatting issues after the message is sent.

[image: screenshot of the sent message]

Some weird things going on here:

  1. "Test paste" getting bolded for no reason (it wasn't bolded in the text I copied)
  2. The line breaks appearing as \n instead of as line breaks

Why is this happening

When you copy text from sites like Gmail / Google Docs, both plain text and HTML get copied to your clipboard. Right now, we have this line in MarkdownPaste -> handlePaste: if (text.length === 0 || html) return false; which basically translates to "there's HTML on the clipboard, just let the default ProseMirror HTML parser handle it".
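
To make that branch explicit, here is a minimal sketch of the guard as a standalone function. The name `defersToDefaultParser` and the `ClipboardContents` shape are hypothetical; in the real handlePaste the two strings would presumably come from the clipboard's text/plain and text/html entries.

```typescript
// Hypothetical shape for the two clipboard flavors that matter here.
interface ClipboardContents {
  text: string; // the text/plain flavor
  html: string; // the text/html flavor ("" when absent)
}

// Mirrors the guard quoted above: `if (text.length === 0 || html) return false;`
// A result of true means "MarkdownPaste steps aside and the default
// ProseMirror HTML parser handles the paste".
function defersToDefaultParser({ text, html }: ClipboardContents): boolean {
  return text.length === 0 || html.length > 0;
}

// Copying from Google Docs or Gmail puts both flavors on the clipboard,
// so the HTML path wins even though usable plain text is available.
const fromGoogleDocs: ClipboardContents = {
  text: "Test paste\n\nfrom",
  html: "<p><strong>Test paste</strong></p><p>from</p>",
};

// Copying from a plain-text source leaves the HTML flavor empty.
const fromPlainEditor: ClipboardContents = { text: "Test paste", html: "" };

console.log(defersToDefaultParser(fromGoogleDocs)); // true
console.log(defersToDefaultParser(fromPlainEditor)); // false
```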

In the text box this ends up getting pasted like this:

<p class="">
  <strong>Test paste</strong>
</p>
<p class="">
  <strong><br><br><br><br></strong>
  <br>
</p>
<p>from</p>
<p>
  <strong><br><br></strong>
  <br>
</p>
<p>Google docs</p>
<p>
  <strong><br><br><br><br></strong>
  <br>
</p>
<p>With newlines</p>
<p class="current-element"><br><br></p>

For reference, here's what the HTML looks like when you manually type in something like that:

<p class="">When</p>
<p class=""><br></p>
<p class=""><br></p>
<p class=""><br></p>
<p class=""><br></p>
<p class="">manually</p>
<p class=""><br></p>
<p class=""><br></p>
<p class="current-element">typed</p>

But to get the text from the text box to a message bubble, we need to serialize the content of the text box (serializer.serialize(view.state.doc)), save that serialized content, and then re-render it with markdown.

The problem is at the serialize step, which produces this string:

**Test paste**

** \n  \n  \n  \n **

from

** \n  \n **

Google docs

** \n  \n  \n  \n **

With newlines

 \n

For reference, here's the manually typed version when serialized:

When





manually



typed

So to sum up so far: <p class=""><br></p> gets serialized as a normal newline (as does <p class="">some text</p>). Extraneous <br> tags in <p> tags get translated as \n
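
The pasted case can be reproduced with a toy model. This is only a sketch, not the real serializer: the `Inline` and `Paragraph` types and all function names are made up for illustration, and the single assumption is that a hard break is written as the literal characters ` \n ` (as HardBreak.ts appears to do). The trailing <br> in each pasted paragraph is left out because it does not show up in the serialized string.

```typescript
// Toy inline nodes: either text or a hard break, optionally bold.
type Inline =
  | { kind: "text"; value: string; strong?: boolean }
  | { kind: "hard_break"; strong?: boolean };

type Paragraph = Inline[];

// Assumption: a hard break is written as the literal characters ` \n `.
function serializeInline(node: Inline): string {
  return node.kind === "hard_break" ? " \\n " : node.value;
}

// Serialize a paragraph, wrapping each run of bold nodes in `**`.
function serializeParagraph(paragraph: Paragraph): string {
  let out = "";
  let inStrong = false;
  for (const node of paragraph) {
    const strong = node.strong === true;
    if (strong !== inStrong) {
      out += "**"; // open or close the bold run
      inStrong = strong;
    }
    out += serializeInline(node);
  }
  if (inStrong) out += "**";
  return out;
}

// Paragraphs are joined by a blank line, matching the pasted output above.
// How the real serializer treats empty paragraphs is not modeled here.
function serializeDoc(doc: Paragraph[]): string {
  return doc.map(serializeParagraph).join("\n\n");
}

// First three paragraphs of the pasted Google Docs content.
const pasted: Paragraph[] = [
  [{ kind: "text", value: "Test paste", strong: true }],
  [
    { kind: "hard_break", strong: true },
    { kind: "hard_break", strong: true },
  ],
  [{ kind: "text", value: "from" }],
];

console.log(serializeDoc(pasted));
```

The middle paragraph comes out as `** \n  \n **`: the bold mark wraps nothing but hard breaks, which would explain both the stray bolding and the visible \n's after re-rendering.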

So what do we do?

Well, the best option would be to fix the serialize method... but this is where I get lost. The serialize method is pretty abstracted and the interplay between the different extensions confuses me. Some relevant pieces of code that I've tracked down:

  1. const serializer = extensions.serializer(); editor.current.serializer = serializer; in useEditor.ts (line 348) initializes the serializer that we'll be using
  2. serializer() in ExtensionManager.ts (line 46) initializes the serializer object
  3. serialize(content, options) in serializer.js (line 52). The MarkdownSerializerState class in this file looks like it's doing a lot of heavy lifting here.
  4. state.write(' \\n '); in HardBreak.ts -> toMarkdown (line 39) seems to be the culprit as far as actually writing the \n

Current Fix

My current fix is to use the default ProseMirror HTML parser only if text.length === 0 AND there's HTML on the clipboard. This causes the text to paste as plain text, removing the weird line break stuff. However, this is not without downsides: 1) multiple newlines are turned into a single newline (easily fixable, but a minor annoyance); 2) more significantly, rich text now pastes as plain text. This is good in some situations and bad in others. The bad: if you paste something that has a hyperlink, the hyperlink won't get carried over. The good: pasting HTML can get weird sometimes. For example, if you copy a message bubble and paste it in, it'll get interpreted as a bullet point (because the underlying HTML has message bubbles as <li>'s), which is unexpected.
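
A rough sketch of how the guard changes, taking the description above literally. The function names are hypothetical and the link is a made-up example; the point is only to show which clipboards now skip the HTML parser.

```typescript
// Old guard, as quoted from MarkdownPaste -> handlePaste:
// defer to the default HTML parser when there is no text OR there is HTML.
function oldGuardDefers(text: string, html: string): boolean {
  return text.length === 0 || html.length > 0;
}

// New guard as described: defer only when there is no text AND there is HTML.
function newGuardDefers(text: string, html: string): boolean {
  return text.length === 0 && html.length > 0;
}

// A copied hyperlink: the plain-text flavor carries only the label,
// while the href lives in the HTML flavor.
const linkText = "see the docs";
const linkHtml = '<a href="https://example.com/docs">see the docs</a>';

console.log(oldGuardDefers(linkText, linkHtml)); // true: HTML parsed, link kept
console.log(newGuardDefers(linkText, linkHtml)); // false: pasted as plain text, href lost
```

The same change is what keeps a copied message bubble from being parsed as a list item, so both the upside and the downside come from this one condition.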

There's gotta be a better way. I'll keep working on this. Feel free to poke around if you come across this.
