Cannot set Mechanize page via Metadata's page method #42

Open
derantell opened this issue Dec 28, 2014 · 4 comments
@derantell

When trying to use a pre-fetched Mechanize page as described in the wiki:

Wombat.crawl do
  m = Mechanize.new 
  mp = m.get 'http://www.google.com'
  page mp
end

I get an error with this stack trace:

crawler.rb:8:in `block in <main>': wrong number of arguments (1 for 0) (ArgumentError)
    from /Users/derantell/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/wombat-2.3.0/lib/wombat/crawler.rb:22:in `instance_eval'
    from /Users/derantell/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/wombat-2.3.0/lib/wombat/crawler.rb:22:in `crawl'
    from /Users/derantell/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/wombat-2.3.0/lib/wombat.rb:13:in `crawl'
    from crawler.rb:4:in `<main>'

Calling @metadata_dup.page mp directly, or renaming Metadata#page to something else, works. My guess is therefore that the attr_accessor :page which Crawler includes from Parser is found first, so method_missing is never invoked.
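To illustrate that guess, here is a minimal sketch of the suspected mechanism. It does not use Wombat at all: FakeMetadata, FakeParser and FakeCrawler are hypothetical stand-ins, and the assumption is only that the real crawler forwards unknown DSL calls through method_missing while also carrying a page accessor.

```ruby
# Hypothetical stand-ins for illustration; these are not Wombat's real classes.
class FakeMetadata
  attr_reader :calls

  def initialize
    @calls = []
  end

  # DSL-style setters that record what they were given.
  def page(value)
    @calls << [:page, value]
  end

  def title(value)
    @calls << [:title, value]
  end
end

module FakeParser
  # Zero-argument reader plus writer, mirroring the accessor named in the report.
  attr_accessor :page
end

class FakeCrawler
  include FakeParser
  attr_reader :metadata

  def initialize
    @metadata = FakeMetadata.new
  end

  # Evaluate the DSL block with the crawler as self.
  def crawl(&block)
    instance_eval(&block)
  end

  private

  # Forward calls the crawler does not define itself to the metadata object.
  def method_missing(name, *args, &block)
    return super unless @metadata.respond_to?(name)
    @metadata.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @metadata.respond_to?(name, include_private) || super
  end
end

crawler = FakeCrawler.new

# `title` is not defined on the crawler, so it reaches method_missing.
crawler.crawl { title "forwarded" }

# `page` IS defined on the crawler (the reader takes no arguments),
# so Ruby calls it directly and never consults method_missing.
error = begin
  crawler.crawl { page :prefetched_page }
  nil
rescue ArgumentError => e
  e
end

puts crawler.metadata.calls.inspect
puts error.class
```

In this sketch the title call is forwarded, while the page call raises ArgumentError because the included reader wins method lookup. That matches the shape of the stack trace above, though it does not prove the real code path is identical.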

Versions used: ruby 2.1.2, mechanize 2.7.3 and wombat 2.3.0

@acidghost

@felipecsl I'm having the same problem here... @derantell did you ever resolve this?

@shashwatsingh

Same issue, with the same gem versions derantell listed, but on Linux 3.16 x86_64.

Did anyone else have this issue? I tried the following, which works, but if someone has a better way, that would be great:

module Wombat
  module DSL
    class Metadata
      alias_method :mech_page_ref, :page
    end
  end
end

and then used it as follows:

data = Wombat.crawl do
  mech_page_ref mech
end
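A rough sketch of why the alias helps, using hypothetical stand-ins (StubMetadata, StubCrawler) rather than Wombat's own classes: the alias gives the metadata setter a name the crawler does not define, so the call falls through to method_missing instead of hitting the colliding accessor. It assumes the crawler forwards unknown calls to its metadata object.

```ruby
# Hypothetical stand-ins for illustration; these are not Wombat's real classes.
class StubMetadata
  attr_reader :stored_page

  def page(value)
    @stored_page = value
  end

  # Second name for the same setter, one the crawler does not define.
  alias_method :mech_page_ref, :page
end

class StubCrawler
  attr_accessor :page   # the colliding zero-argument reader
  attr_reader :metadata

  def initialize
    @metadata = StubMetadata.new
  end

  def crawl(&block)
    instance_eval(&block)
  end

  private

  # Forward calls the crawler does not define itself to the metadata object.
  def method_missing(name, *args, &block)
    return super unless @metadata.respond_to?(name)
    @metadata.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @metadata.respond_to?(name, include_private) || super
  end
end

crawler = StubCrawler.new
crawler.crawl { mech_page_ref :prefetched_page }

puts crawler.metadata.stored_page.inspect
puts crawler.page.inspect
```

Here the aliased call ends up on the metadata object, and the crawler's own page accessor stays untouched.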

@cyu
Contributor

cyu commented Oct 18, 2015

I just ran into this issue as well. The issue is that Wombat::Processing::Parser defines a page accessor (https://github.com/felipecsl/wombat/blob/master/lib/wombat/processing/parser.rb#L24). Here's my workaround to the issue:

# cannot use Wombat#crawl
class Crawler
  include Wombat::Crawler
  og_image xpath: '//html/head/meta[@property = "og:image"]/@content'
  twitter_image_src xpath: '//html/head/meta[@name = "twitter:image:src"]/@content'
end

crawler = Crawler.new
page = crawler.mechanize.get(uri.to_s)
crawler.metadata[:page] = page
result = crawler.crawl

@smileart

Workarounds are great and all that, but is there any chance to get it fixed in the project itself? ☹️
