# With some added inspiration from Purell: https://github.com/PuerkitoBio/purell
# and normalize_url: https://github.com/rwz/normalize_url
module Surt
I wrote it because we needed URL canonicalization tools and none of the existing Ruby ones I could find quite met our needs. Having a method that roughly matched the Internet Archive’s was also advantageous, and nobody had written a Ruby port of SURT.
Since we have generally been working to break more reusable, abstract pieces out of the web monitoring projects, this is probably a really good candidate for that on the Ruby side. It might be nice to extract it and publish it as a Ruby Gem. (Gem name: SURT, repo name: edgi-govdata-archiving/ruby-surt)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in seven days if no further activity occurs. If it should not be closed, please comment! Thank you for your contributions.
This project has a nearly complete Ruby port of the Internet Archive’s SURT Python package buried in the app/lib/ directory: web-monitoring-db/app/lib/surt.rb, lines 3 to 20 in 3bb7e8a.
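For readers unfamiliar with SURT (Sort-friendly URI Reordering Transform), here is a minimal sketch of the core idea: the host labels are reversed so that URLs from the same domain sort together. This is not the code in surt.rb; `surt_sketch` is a hypothetical name, and the sketch skips most of the canonicalization steps that the Internet Archive's package performs.

```ruby
require "uri"

# Hypothetical helper for illustration only; NOT the project's Surt module.
# It lowercases the host, reverses its labels, and appends the path and query.
def surt_sketch(url)
  uri = URI.parse(url)
  reversed_host = uri.host.to_s.downcase.split(".").reverse.join(",")
  path = uri.path.to_s.empty? ? "/" : uri.path
  key = "#{reversed_host})#{path}"
  key += "?#{uri.query}" if uri.query
  key
end

puts surt_sketch("https://Example.com/about?x=1")
# prints "com,example)/about?x=1"
```

A real port also has to handle things like default ports, `www` prefixes, and query normalization, which is part of why a dedicated gem could be useful.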