Skip to content

Commit

Permalink
Fixed problem with empty strings in Tokenizer.
Browse files Browse the repository at this point in the history
  • Loading branch information
Bystroushaak committed Mar 21, 2022
1 parent 34ae286 commit 6e13b04
Show file tree
Hide file tree
Showing 3 changed files with 18 additions and 1 deletion.
4 changes: 4 additions & 0 deletions CHANGES.rst
Original file line number Diff line number Diff line change
@@ -1,6 +1,10 @@
Changelog
=========

3.0.17
------
- Fixed problem with empty strings in Tokenizer.

3.0.16
------
- Changed behavior of the `.remove_item()` method to compare using identity.
Expand Down
3 changes: 2 additions & 1 deletion src/dhtmlparser3/tokenizer.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,8 @@ def tokenize(self) -> List[Token]:

def tokenize_iter(self) -> Iterator[Token]:
if self.end == 0:
raise StopIteration()
yield TextToken(self.string)
return

token = self._scan_token()

Expand Down
12 changes: 12 additions & 0 deletions tests/test_parser.py
Original file line number Diff line number Diff line change
Expand Up @@ -248,3 +248,15 @@ def test_file_parser():

with open(path) as f:
assert f.read() == "<test><a>b</a></test>"


def test_empty_string():
dom = dhtmlparser3.parse("")

assert dom.content == ['']


def test_blank_string():
dom = dhtmlparser3.parse(" ")

assert dom.content == [" "]

0 comments on commit 6e13b04

Please sign in to comment.