forked from f34nk/tidy_ex

Elixir binding to the granddaddy of HTML tools

matiaslb/tidy_ex


TidyEx

TidyEx corrects and cleans up HTML content by fixing markup errors.

Elixir/Erlang bindings for htacg's tidy-html5

The granddaddy of HTML tools, with support for modern standards: http://www.html-tidy.org

The binding is implemented as a C-Node following the excellent example in Overbryd's package nodex. If you want to learn how to set up bindings to C/C++, you should definitely check it out.

  • nodex
    • distributed Elixir
    • safe binding with C-Nodes

C-Nodes are external OS processes that communicate with the Erlang VM through Erlang message passing. That way you can implement native code and call into it from Elixir in a safe, predictable way. The Erlang VM is unaffected if the external process crashes.
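The isolation described above can be sketched in plain Elixir. This is only an illustration, not TidyEx's implementation: the module `CNodeSketch`, its functions, and the message shapes are hypothetical, and an ordinary Elixir process stands in for the external C-Node. The caller sends a request as a message, monitors the worker, and gets an error tuple instead of crashing when the worker dies.

```elixir
defmodule CNodeSketch do
  # Hypothetical stand-in for an external C-Node: an unlinked Elixir
  # process that answers requests sent to it as messages.
  def start_worker do
    spawn(fn -> loop() end)
  end

  defp loop do
    receive do
      {:call, from, ref, {:echo, payload}} ->
        send(from, {ref, {:ok, payload}})
        loop()

      {:call, _from, _ref, :crash} ->
        # Simulate the native side dying in the middle of a request.
        exit(:simulated_crash)
    end
  end

  # Send a request and wait for the reply. The monitor turns a dead
  # worker into an error tuple instead of taking the caller down.
  def call(worker, request, timeout \\ 1_000) do
    ref = Process.monitor(worker)
    send(worker, {:call, self(), ref, request})

    receive do
      {^ref, reply} ->
        Process.demonitor(ref, [:flush])
        reply

      {:DOWN, ^ref, :process, _pid, reason} ->
        {:error, {:worker_down, reason}}
    after
      timeout ->
        Process.demonitor(ref, [:flush])
        {:error, :timeout}
    end
  end
end

worker = CNodeSketch.start_worker()

IO.inspect(CNodeSketch.call(worker, {:echo, "<div>Hello</div>"}))
# prints {:ok, "<div>Hello</div>"}

IO.inspect(CNodeSketch.call(worker, :crash))
# prints {:error, {:worker_down, :simulated_crash}}
```

With a real C-Node the worker would be a separate OS process reached over Erlang distribution, but the calling side keeps the same shape: send a message, wait for a reply, and treat a dead peer as an ordinary error value.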

Example

For more examples, see the tests.

test "can parse broken html" do
  result = TidyEx.parse("<div>Hello<span>World")
  assert result == "<div>Hello<span>World</span></div>"
end

test "can clean and repair broken html" do
  result = TidyEx.clean_and_repair("<div>Hello<span>World")
  assert result == "<div>Hello<span>World</span></div>"
end

test "can run diagnostics on invalid html" do
  result = TidyEx.run_diagnostics("<pp>Hello World</p>")
  assert result == "line 1 column 1 - Error: <pp> is not recognized!\nThis document has errors that must be fixed before\nusing HTML Tidy to generate a tidied up version."
end

Installation

Available on Hex.

def deps do
  [
    {:tidy_ex, "~> 0.1.0-dev"}
  ]
end

Target dependencies

cmake 3.x
erlang-dev
erlang-xmerl
erlang-parsetools

Compile and test

mix deps.get
mix compile
mix test

Cloning

git clone git@github.com:f34nk/tidy_ex.git
cd tidy_ex

All binding targets are added as submodules in the target/ folder.

git submodule update --init --recursive --remote
mix deps.get
mix compile
mix test
mix test.target

Cleanup

mix clean

Roadmap

See CHANGELOG.

  • Bindings
    • Call as C-Node
    • Call as dirty-nif
  • Tests
    • Call as C-Node
    • Call as dirty-nif
    • Target tests
    • Feature tests
    • Package test
  • Features
    • Set tidy-html5 options
    • Serialize any string with valid or broken html
    • Clean and repair
    • Run diagnostics
  • Documentation
  • Publish as hex package

Icon Credit

Broom by faisalovers from the Noun Project
