Weird characters in loaded HTML #187
Comments
It's a bug. There were unwanted null characters after the first packet from the web server if its size was less than 4096 bytes. Here is a slightly messy test that illustrates the bug.
using CsQuery.HtmlParser;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using NUnit.Framework;
using System;
using System.IO;
using System.Text;
using Assert = NUnit.Framework.Assert;

namespace CsQuery.Tests.Issues
{
    [TestFixture, TestClass]
    public class Issue187 : CsQueryTest
    {
        [Test, TestMethod]
        public void Issue187Test()
        {
            using (var mockStream = new Issue187MockStream())
            {
                var factory = new ElementFactory();
                var dom = factory.Parse(mockStream, Encoding.UTF8);
                Assert.AreEqual(Issue187MockStream.HTML, dom.FirstChild.OuterHTML);
            }
        }
    }

    // Mock stream that delivers the HTML in two partial reads, each much
    // smaller than the caller's buffer, like a web server sending small packets.
    public class Issue187MockStream : Stream
    {
        public const string HTML = @"<html><head></head><body><a href=""http://test.example.com"">Test</a></body></html>";

        public override int Read(byte[] buffer, int offset, int count)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(HTML);
            int splitPosition = bytes.Length / 2;
            int length;
            if (Position == 0)
            {
                // First read: return only the first half of the document.
                length = splitPosition;
                Array.Copy(bytes, 0, buffer, offset, length);
            }
            else if (Position == splitPosition)
            {
                // Second read: return the remaining bytes.
                length = bytes.Length - splitPosition;
                Array.Copy(bytes, splitPosition, buffer, offset, length);
            }
            else
            {
                // End of stream.
                length = 0;
            }
            Position += length;
            return length;
        }

        public override bool CanRead { get { return true; } }
        public override bool CanSeek { get { return false; } }
        public override bool CanWrite { get { return false; } }
        public override long Position { get; set; }
        public override void Flush() { return; }
        public override long Length { get { throw new NotImplementedException(); } }
        public override long Seek(long offset, SeekOrigin origin) { throw new NotImplementedException(); }
        public override void SetLength(long value) { throw new NotImplementedException(); }
        public override void Write(byte[] buffer, int offset, int count) { throw new NotImplementedException(); }
    }
}
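For readers wondering how null characters can end up in the parsed text, here is a minimal sketch of one way it can happen. It is an assumption about the general class of bug, not CsQuery's actual code: a reader that decodes its whole fixed-size buffer instead of only the bytes that `Stream.Read` reported will turn the unfilled part of the buffer into `'\0'` characters. The names `PartialReadSketch`, `ChunkedStream`, `DecodeIgnoringCount` and `DecodeUsingCount` are hypothetical, and the 4096 buffer size is taken from the report above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: a stream that never returns more than chunkSize
// bytes per Read call, imitating small packets from a web server.
public sealed class ChunkedStream : MemoryStream
{
    private readonly int chunkSize;

    public ChunkedStream(byte[] data, int chunkSize) : base(data)
    {
        this.chunkSize = chunkSize;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return base.Read(buffer, offset, Math.Min(count, chunkSize));
    }
}

public static class PartialReadSketch
{
    // Assumed buffer size, matching the 4096 mentioned in the report.
    private const int BufferSize = 4096;

    // Problematic pattern: ignores how many bytes Read actually returned.
    public static string DecodeIgnoringCount(Stream stream)
    {
        var buffer = new byte[BufferSize];
        var text = new StringBuilder();
        while (stream.Read(buffer, 0, buffer.Length) > 0)
        {
            // Bytes past the returned count are zeros (or stale data from an
            // earlier read), and they are decoded along with the real content.
            text.Append(Encoding.UTF8.GetString(buffer));
        }
        return text.ToString();
    }

    // Safer pattern: decode only the bytes that were read. A Decoder keeps
    // state, so multi-byte sequences split across reads still decode correctly.
    public static string DecodeUsingCount(Stream stream)
    {
        var buffer = new byte[BufferSize];
        var chars = new char[Encoding.UTF8.GetMaxCharCount(BufferSize)];
        var decoder = Encoding.UTF8.GetDecoder();
        var text = new StringBuilder();
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            int charCount = decoder.GetChars(buffer, 0, read, chars, 0);
            text.Append(chars, 0, charCount);
        }
        return text.ToString();
    }
}
```

Feeding both methods a `ChunkedStream` with a small chunk size shows the difference: the first result contains `'\0'` characters between the pieces of HTML, the second matches the input exactly.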
Hi,
I noticed some weird characters popping up in the HTML when using CQ.CreateFromUrl. Here is an example:
When you execute the above example (in LINQPad, for instance), you'll notice in the output:
I have no idea where the weird characters come from. I don't see them in the HTML source when loading it in the browser or in Sublime Text. If I load the page in C# into a string and then load the string into a CQ object, it works without problems.
Do you have any idea what this could be?
Thanks.
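The workaround mentioned in this post (read the page into a string first, then build the CQ object from that string) could look roughly like the sketch below. The helper name `ReadAllText` is hypothetical. The hand-off to CsQuery appears only as a comment and assumes `CQ.Create(string)` is available in the installed version.

```csharp
using System.IO;
using System.Text;

public static class StringFirstWorkaround
{
    // Reads the entire stream into a string. StreamReader tracks how many
    // bytes each underlying read returned, so short reads add no padding.
    public static string ReadAllText(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}

// Assumed usage with CsQuery (not compiled here):
// var dom = CQ.Create(StringFirstWorkaround.ReadAllText(responseStream, Encoding.UTF8));
```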