Use webrick's escape instead of encode_www_form_component
```
irb(main):001:0> require 'webrick'
=> true
irb(main):002:0> URI.parse(URI.encode_www_form_component("http://example.com/path?query=あああ"))
=> #<URI::Generic http%3A%2F%2Fexample.com%2Fpath%3Fquery%3D%E3%81%82%E3%81%82%E3%81%82>
irb(main):003:0> URI.parse(WEBrick::HTTPUtils.escape("http://example.com/path?query=あああ"))
=> #<URI::HTTP http://example.com/path?query=%E3%81%82%E3%81%82%E3%81%82>
```
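The difference shown in the session above can be reproduced with the standard library alone. This is a minimal sketch, not the code from this commit: `escape_non_ascii` is a hypothetical stand-in that percent-encodes only non-ASCII characters, whereas `WEBrick::HTTPUtils.escape` also escapes some ASCII characters (spaces, for example). The stand-in is used because `webrick` is no longer bundled with Ruby from 3.0 onward and may need to be installed as a gem.

```ruby
require 'uri'

url = 'http://example.com/path?query=あああ'

# Form-component encoding escapes reserved characters such as ":" and "/",
# so the result no longer has a recognizable scheme or host.
form_uri = URI.parse(URI.encode_www_form_component(url))

# Hypothetical helper (not part of this commit): percent-encode each byte of
# every non-ASCII character and leave the URL structure untouched.
def escape_non_ascii(str)
  str.gsub(/[^[:ascii:]]/) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

escaped_uri = URI.parse(escape_non_ascii(url))

p form_uri.host    # nil
p escaped_uri.host # "example.com"
```

With the host lost, a later `URI.join(site, uri)` or `to_absolute` call treats the whole encoded string as a relative path, which appears to be the behavior this commit works around.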
marocchino committed Jan 6, 2021
1 parent 04a3b05 commit 7b4464b
Showing 3 changed files with 6 additions and 2 deletions.
4 changes: 4 additions & 0 deletions .gitlab-ci.yml
@@ -33,6 +33,10 @@ test:2.7:
   extends: .tests
   image: 'ruby:2.7'
 
+test:3.0:
+  extends: .tests
+  image: 'ruby:3.0'
+
 test:jruby:
   extends: .tests
   image: 'jruby:9.2.12-jre'
2 changes: 1 addition & 1 deletion lib/validate_website/crawl.rb
@@ -46,7 +46,7 @@ def extract_imgs_from_page(page)
 
       page.doc.search('//img[@src]').reduce(Set[]) do |result, elem|
         u = elem.attributes['src'].content
-        result << page.to_absolute(URI.parse(URI.encode_www_form_component(u)))
+        result << page.to_absolute(URI.parse(WEBrick::HTTPUtils.escape(u)))
       end
     end
 
2 changes: 1 addition & 1 deletion lib/validate_website/static_link.rb
@@ -8,7 +8,7 @@
 # rubocop:disable Metrics/BlockLength
 StaticLink = Struct.new(:link, :site) do
   def link_uri
-    @link_uri = URI.parse(URI.encode_www_form_component(link))
+    @link_uri = URI.parse(WEBrick::HTTPUtils.escape(link))
     @link_uri = URI.join(site, @link_uri) if @link_uri.host.nil?
     @link_uri
   end
