🐛 relax TypeError with a CharsetMatch instance when trying to compare it with anything else than a CharsetMatch instance (#444)
Ousret authored Mar 19, 2024
1 parent 1e3ad85 commit e8d70d3
Showing 3 changed files with 22 additions and 5 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,14 @@
All notable changes to charset-normalizer will be documented in this file. This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

+## [3.3.3](https://github.com/Ousret/charset_normalizer/compare/3.3.2...master) (2024-03-??)
+
+### Fixed
+- Relax the TypeError exception thrown when trying to compare a CharsetMatch with anything else than a CharsetMatch.
+
+### Changed
+- Optional mypyc compilation upgraded to version 1.9.0 for Python >= 3.8
+
## [3.3.2](https://github.com/Ousret/charset_normalizer/compare/3.3.1...3.3.2) (2023-10-31)

### Fixed
8 changes: 3 additions & 5 deletions charset_normalizer/models.py
@@ -35,11 +35,9 @@ def __init__(

     def __eq__(self, other: object) -> bool:
         if not isinstance(other, CharsetMatch):
-            raise TypeError(
-                "__eq__ cannot be invoked on {} and {}.".format(
-                    str(other.__class__), str(self.__class__)
-                )
-            )
+            if isinstance(other, str):
+                return iana_name(other) == self.encoding
+            return False
         return self.encoding == other.encoding and self.fingerprint == other.fingerprint

     def __lt__(self, other: object) -> bool:
11 changes: 11 additions & 0 deletions tests/test_base_detection.py
@@ -123,3 +123,14 @@ def test_doc_example_short_cp1251():
     ).best()

     assert best_guess.encoding == "cp1251"
+
+
+def test_direct_cmp_charset_match():
+    best_guess = from_bytes(
+        "😀 Hello World! How affairs are going? 😀".encode("utf_8")
+    ).best()
+
+    assert best_guess == "utf_8"
+    assert best_guess == "utf-8"
+    assert best_guess != 8
+    assert best_guess != None
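
The comparison behavior this commit introduces can be illustrated without the library installed. The following is a minimal sketch, not the project's code: `Match` and `normalize` are hypothetical stand-ins for `CharsetMatch` and `iana_name`, and the standard library's `codecs.lookup` is swapped in for alias resolution. Returning `False` for an unknown encoding name is an assumption of this sketch and is not something the diff above shows.

```python
import codecs


def normalize(name: str) -> str:
    """Hypothetical stand-in for iana_name: resolve an alias to a canonical codec name."""
    return codecs.lookup(name).name


class Match:
    """Hypothetical, simplified stand-in for CharsetMatch."""

    def __init__(self, encoding: str, fingerprint: str) -> None:
        self.encoding = normalize(encoding)
        self.fingerprint = fingerprint

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Match):
            if isinstance(other, str):
                # Strings are compared by normalized encoding name.
                try:
                    return normalize(other) == self.encoding
                except LookupError:
                    # Assumption of this sketch: unknown names compare unequal.
                    return False
            # Any other type compares unequal instead of raising TypeError.
            return False
        return self.encoding == other.encoding and self.fingerprint == other.fingerprint


guess = Match("utf_8", "abc123")
print(guess == "utf-8", guess == "utf_8", guess != 8, guess != None)
```

Because the class defines only `__eq__`, Python derives `!=` by inverting its result, which is why the `!= 8` and `!= None` assertions in the test pass without a separate `__ne__`.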
