Validate requests from Web crawlers: impersonating or not?
simplicitybg/legitbot
Legitbot

Ruby gem to check that an IP belongs to a bot, typically a search engine. This helps protect a website from crawlers that impersonate search engines.
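This README does not describe how the check is performed. One widely used way to verify a crawler's IP is forward-confirmed reverse DNS: resolve the IP to a hostname, check that the hostname falls under a domain the crawler's operator publishes, then resolve that hostname back and confirm it yields the same IP. The sketch below illustrates that general technique with Ruby's standard Resolv library. The helper name `forward_confirmed?` and its parameters are hypothetical and are not part of Legitbot's API; whether the gem works this way internally is an assumption.

```ruby
require 'resolv'

# Hypothetical helper, NOT part of Legitbot's API.
# Returns true only if `ip` reverse-resolves to a host under one of
# `allowed_domains` AND that host resolves forward to the same `ip`.
# `resolver` is injectable so the logic can be exercised without network access.
def forward_confirmed?(ip, allowed_domains, resolver: Resolv)
  # Reverse lookup: IP -> hostname (raises Resolv::ResolvError if none).
  host = resolver.getname(ip).to_s.downcase.chomp('.')

  # The hostname must be the domain itself or a subdomain of it.
  in_domain = allowed_domains.any? { |d| host == d || host.end_with?(".#{d}") }
  return false unless in_domain

  # Forward lookup: hostname -> IPs, which must include the original IP.
  resolver.getaddresses(host).map(&:to_s).include?(ip)
rescue Resolv::ResolvError
  false
end
```

For instance, `forward_confirmed?(request_ip, ['googlebot.com', 'google.com'])` would perform live DNS lookups against the domains Google documents for Googlebot. The comparison is a plain string match, so IPv6 addresses would need normalizing first.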

Usage

Suppose you have a Web request and you'd like to make sure it's not from a fake search engine:

bot = Legitbot.bot(user_agent, ip)

bot is nil if no bot signature is found in the User-Agent. Otherwise, it is an object with the following methods:

bot.detected_as # => :google
bot.valid? # => true
bot.fake? # => false
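Because the result can be nil, callers have three cases to handle: no bot signature, a verified bot, and an impersonator. A minimal sketch of that branching follows. The helper `classify` and the symbols it returns are hypothetical; it relies only on the `fake?` method shown above and accepts any object that responds to it.

```ruby
# Hypothetical helper, not part of Legitbot: maps the result of
# Legitbot.bot(user_agent, ip) to one of three outcomes.
def classify(bot)
  return :not_a_bot if bot.nil? # no bot signature in the User-Agent

  bot.fake? ? :fake_bot : :verified_bot
end
```

With the gem loaded, this would be called as `classify(Legitbot.bot(user_agent, ip))`.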

Sometimes you already know what search engine to expect. For example, you might be using rack-attack:

Rack::Attack.blocklist("fake Googlebot") do |req|
  req.user_agent =~ %r(Googlebot) && Legitbot::Google.fake?(req.ip)
end

Or, if you don't want impersonating crawlers stealing your content, or evaluating it in preparation for a spam run, block every fake search engine at once:

Rack::Attack.blocklist 'fake search engines' do |request|
  Legitbot.bot(request.user_agent, request.ip)&.fake?
end

Supported

License

Apache 2.0
