Releases: jawah/charset_normalizer

Charset Normalizer

23 Sep 13:02

Ousret

1.1.1

5abfb83


Changes:

Bugfix: the from_bytes parameters steps and chunk_size were not adapted to the sequence length when the provided values did not fit the content. This could lead to misdetection on small content.
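
A minimal sketch of the kind of adjustment this bugfix describes, assuming a sampler that reads `steps` chunks of `chunk_size` bytes from the payload. The helper names `fit_sampling_params` and `sample_chunks`, and the default values, are hypothetical and not taken from the library's source; the point is only that both values shrink when the content is shorter than they assume.

```python
from typing import List, Tuple


def fit_sampling_params(length: int, steps: int = 10, chunk_size: int = 512) -> Tuple[int, int]:
    """Shrink steps and chunk_size so they fit a payload of `length` bytes.

    Hypothetical helper: the defaults are illustrative, not the library's.
    """
    if length < 0 or steps < 1 or chunk_size < 1:
        raise ValueError("length must be >= 0, steps and chunk_size >= 1")
    if length == 0:
        return 0, 0
    # Never take more samples than there are bytes.
    steps = min(steps, length)
    # Never let a chunk be wider than the stride between two samples.
    chunk_size = min(chunk_size, length // steps)
    return steps, chunk_size


def sample_chunks(data: bytes, steps: int = 10, chunk_size: int = 512) -> List[bytes]:
    """Return evenly spaced chunks of `data` using the fitted parameters."""
    steps, chunk_size = fit_sampling_params(len(data), steps, chunk_size)
    if steps == 0:
        return []
    stride = len(data) // steps
    return [data[i * stride : i * stride + chunk_size] for i in range(steps)]
```

With unadjusted values, a five-byte payload would be sampled as if it were several kilobytes long; with the fitted values, every byte of a short payload is still visited.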


Charset Normalizer

21 Sep 16:17

Ousret

1.1.0

9728ff7


Changes:

Bugfix: sequences shorter than 10 characters were not checked by ProbeChaos at all. (#14)
Bugfix: the legacy detect method, inspired by chardet, did not return the intended result when no match was found. (#14)
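
For context on the second fix: chardet's `detect` returns a dictionary, and its `encoding` entry is `None` when nothing is detected. The sketch below assumes that this convention is the "intended result"; the release note does not say so explicitly. `legacy_detect` and the `guess` callback are hypothetical stand-ins, not the library's API.

```python
from typing import Callable, Dict, Optional, Tuple

# A guesser returns (encoding, confidence), or None when it has no candidate.
Guesser = Callable[[bytes], Optional[Tuple[str, float]]]


def legacy_detect(data: bytes, guess: Guesser) -> Dict[str, object]:
    """Wrap a guesser so it always returns a chardet-style dictionary."""
    result = guess(data)
    if result is None:
        # No candidate: still return a well-formed dictionary so callers
        # can read result["encoding"] without special-casing.
        return {"encoding": None, "confidence": 0.0}
    encoding, confidence = result
    return {"encoding": encoding, "confidence": confidence}
```

Callers written against chardet can then keep testing `result["encoding"] is None` without a separate code path.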


Charset Normalizer

17 Sep 17:17

Ousret

1.0.0

d3996ce


Release 1.0.0 (#11)

* Adjustment in frequencies.json for Chinese

Removed Latin-based characters from it

* Added the possibility to list encoding aliases for a match

An encoding is often known by several names; listing aliases helps when searching for IBM855 when it is listed as CP855.

* Added submatch in match

List of submatches that produce the EXACT same output as a match

* Changes in docs

+ comment unused code.

* Added the giveup_threshold parameter to the ProbeChaos docs

* Doc improvement in unicode.py

* Add static method list_by_range in unicode.py

Sort letters by unicode range in a dict

* ProbeCoherence reliability improved 

Can now probe & sort by alphabet used or unicode range.

* Added coherence_non_latin method in NormalizerMatch

Checks whether a non-Latin-based language was verified by ProbeCoherence

* CLI is now more verbose

* More tests, yay !

* bump 1.0.0

* README update
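
The alias entry in the list above (IBM855 listed as CP855) can be illustrated with Python's standard library alone. This is a sketch under the assumption that the standard `encodings.aliases` table is a reasonable source of alias names; it does not show how the library itself gathers aliases, and `list_aliases` is a hypothetical helper name.

```python
from encodings.aliases import aliases
from typing import List


def list_aliases(name: str) -> List[str]:
    """Return the known stdlib aliases of the codec that `name` refers to."""
    # The alias table uses lowercase keys with underscores.
    key = name.lower().replace("-", "_").replace(" ", "_")
    # If `name` is itself an alias, resolve it to its canonical codec name.
    target = aliases.get(key, key)
    return sorted(alias for alias, canonical in aliases.items() if canonical == target)


if __name__ == "__main__":
    print(list_aliases("CP855"))
```

Looking up either `IBM855` or `CP855` resolves to the same codec, so both calls return the same alias list.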
