Using AddBlockExpression for table rows #40

Open
Emins opened this issue May 18, 2020 · 2 comments

Emins commented May 18, 2020

Hello dear Rohland, thank you for this project.

I understand that this project is not ideal for complex HTML, but I'm trying to improve it... :)

So, I found a way to add a block expression that finds differences in table rows:
AddBlockExpression(new Regex(@"<tr(.|\n)*?>(.|\n)*?<\/tr>", RegexOptions.IgnoreCase | RegexOptions.Multiline));

But with this code, several tr rows may be selected together as one diff.

Is there any way to make it handle each tr row separately?

Rohland (Owner) commented May 22, 2020

Hi @Emins

Sorry, I don't have much time to look into this, but I think it comes down to the Regex being used.

You could try this:

<tr[^>]*>.*?</tr>

Let me know if this captures each row separately.
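As a quick way to check how these patterns behave on rows that span several lines, here is a minimal sketch. The class name `RowRegexCheck`, the helper `CountRows`, and the sample HTML are made up for illustration and are not part of the library. It relies on two facts about .NET regular expressions: `.` does not match `\n` unless `RegexOptions.Singleline` is set, and `RegexOptions.Multiline` only changes how `^` and `$` behave.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RowRegexCheck
{
    // Hypothetical helper: counts how many separate matches a pattern
    // produces on two sample rows that each span several lines.
    public static int CountRows(string pattern, RegexOptions options)
    {
        const string html =
            "<tr>\n <td>\n  <p>aaa bbb</p>\n </td>\n</tr>\n" +
            "<tr>\n <td>\n  <p>aaa ccc</p>\n </td>\n</tr>";
        return Regex.Matches(html, pattern, options).Count;
    }

    public static void Main()
    {
        const string suggested = @"<tr[^>]*>.*?</tr>";
        const string original = @"<tr(.|\n)*?>(.|\n)*?<\/tr>";

        // Without Singleline, '.' stops at the first newline inside the row.
        Console.WriteLine(CountRows(suggested, RegexOptions.IgnoreCase)); // 0

        // With Singleline, '.' also matches '\n', so each row is one match.
        Console.WriteLine(CountRows(suggested, RegexOptions.IgnoreCase | RegexOptions.Singleline)); // 2

        // The (.|\n) alternation handles newlines without Singleline.
        Console.WriteLine(CountRows(original, RegexOptions.IgnoreCase | RegexOptions.Multiline)); // 2
    }
}
```

If the suggested pattern is used with AddBlockExpression, passing RegexOptions.Singleline alongside RegexOptions.IgnoreCase would be the likely adjustment.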

Emins (Author) commented May 22, 2020

Thank you for the reply, @Rohland

Your regex doesn't match the rows. But I have no problem with the tr regex. This regex correctly detects each row as a separate "Block":

AddBlockExpression(new Regex(@"<tr(.|\n)*?>(.|\n)*?<\/tr>", RegexOptions.IgnoreCase | RegexOptions.Multiline));

and it correctly detects changes, for example in this HTML:

<tr>
 <td>
  <p style="text-align: left;">aaa bbb</p>
 </td>
</tr>
<tr>
 <td>
  <p style="text-align: left;">aaa ccc</p>
 </td>
</tr>

The issue is with the marking when 2 rows have changed. Currently both rows are marked del first, and then both rows are marked ins. I want 1 del, 1 ins, 1 del, 1 ins.

I think the issue is in the core, in the main concept, because the code searches for matching blocks, and the 2 block expressions are detected as a single list of changed blocks.

I found one workaround. It is not a good solution, but it may be helpful for someone: when the replace tags are inserted, they can be added in alternating del and ins order. In Diff.cs, find the function:

private void ProcessReplaceOperation(Operation operation)
{
    ProcessDeleteOperation(operation, "diffmod");
    ProcessInsertOperation(operation, "diffmod");
}

replace with:

private void ProcessReplaceOperation(Operation operation)
{
    // Test code: emit deletes and inserts pairwise when the operation covers <tr> blocks.
    // Needs "using System;", "using System.Linq;" and "using System.Collections.Generic;".
    List<string> oldSlice = _oldWords
        .Skip(operation.StartInOld)
        .Take(operation.EndInOld - operation.StartInOld)
        .ToList();
    List<string> newSlice = _newWords
        .Skip(operation.StartInNew)
        .Take(operation.EndInNew - operation.StartInNew)
        .ToList();

    // todo, improve: this check is case-sensitive and only looks at the first old word.
    string first = oldSlice.FirstOrDefault();
    if (first != null && !first.Contains("<tr"))
    {
        // Not a table row block: keep the default behavior (all deletes, then all inserts).
        ProcessDeleteOperation(operation, "diffmod");
        ProcessInsertOperation(operation, "diffmod");
        return;
    }

    // Table row blocks: alternate del/ins so that each old row is followed by its new row.
    int maxCount = Math.Max(oldSlice.Count, newSlice.Count);
    for (int i = 0; i < maxCount; i++)
    {
        if (i < oldSlice.Count)
            InsertTag(DeleteTagValue, "diffmod", new List<string> { oldSlice[i] });
        if (i < newSlice.Count)
            InsertTag(InsertTagValue, "diffmod", new List<string> { newSlice[i] });
    }
}
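
To see the ordering this workaround aims for without touching Diff.cs, the pairing step can be sketched on its own. The class `PairwiseInterleave` and the method `Interleave` below are hypothetical names used only for illustration. The sketch assumes the old and new lists hold one row block per element, at matching positions. If other tokens (for example whitespace) sit between the rows, the index-based pairing would still need to account for them.

```csharp
using System;
using System.Collections.Generic;

public static class PairwiseInterleave
{
    // Hypothetical helper: returns (tag, block) pairs in the order
    // del, ins, del, ins, ... instead of all dels followed by all ins.
    public static List<(string Tag, string Block)> Interleave(
        IReadOnlyList<string> oldBlocks, IReadOnlyList<string> newBlocks)
    {
        var result = new List<(string Tag, string Block)>();
        int max = Math.Max(oldBlocks.Count, newBlocks.Count);
        for (int i = 0; i < max; i++)
        {
            // Either list may be shorter; emit only what exists at index i.
            if (i < oldBlocks.Count) result.Add(("del", oldBlocks[i]));
            if (i < newBlocks.Count) result.Add(("ins", newBlocks[i]));
        }
        return result;
    }

    public static void Main()
    {
        var pairs = Interleave(
            new[] { "<tr>old 1</tr>", "<tr>old 2</tr>" },
            new[] { "<tr>new 1</tr>", "<tr>new 2</tr>" });

        foreach (var (tag, block) in pairs)
            Console.WriteLine($"{tag}: {block}");
    }
}
```

When one side has more rows than the other, the extra rows are emitted at the end as plain del or ins entries, which matches what the loop in the workaround does.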
