System Hardening Phase 1: Catch errors at top level #639

johnnyshields · 2023-11-16T10:45:25Z

In PR #577, @grzuy asked me to break it into a series of smaller PRs.

My plan is as follows:

This PR:
- Move error catching to top-level of Rack::Attack. This is mandatory to do, because we need to have Rack::Attack be aware of errors in order to implement advanced error handling/fault-tolerance mechanisms in subsequent PRs.
- Remove the rescuing blocks from Dalli/RedisProxy classes.
- Add "Rack::Attack.allowed_errors" config which defaults to Dalli::DalliError and Redis::BaseError. This config is useful if you are implementing a custom store class.

Future PRs (will raise the next one after this one is merged)

PR 2:
- Add Rack::Attack.error_handler which takes a Proc for custom error handling.
- Refactor Rack::Attack a bit so the error_handler can select whether to throttle or block.
PR 3: Add Rack::Attack.calling? so Rails Cache error hander can raise its error for Rack::Attack.
PR 4: Add Rack::Attack.failure_cooldown config. This temporarily disables Rack::Attack after an error occurs (including allowed errors), to prevent cache connection latency.

Note: This PR will affect the small number of users are directly using the Dalli/RedisProxy classes (or subclassing them, etc.) since the methods of these classes now raise Dalli::DalliError /Redis::BaseError rather than returning nil. It should probably be released as a major version for this reason.

…nd instead catches errors at the top level.

johnnyshields · 2023-11-16T10:49:38Z

@grzuy ready for your review.

grzuy

@johnnyshields Thanks for breaking it up and opening smaller PR.

Left some comments after first pass.

Lets please update PR title to something that explains what this does for end users. Would that be "Allows users to specify list of ignored errors" or something similar?

grzuy · 2023-11-18T13:11:10Z

README.md

+
+```ruby
+# in initializers/rack_attack.rb
+Rack::Attack.allowed_errors += [MyErrorClass, 'MyOtherErrorClass']


Not strong, thoughts on naming ignored_errors instead allowed_errors? 🤔

@grzuy the reason for "allowed" here is that when these errors occur, RackAttack "allows" the request. This naming will be made more obvious in the next PR, which allows you to add error handlers which throttle or block based on certain errors.

grzuy · 2023-11-18T13:12:30Z

rack-attack.gemspec

@@ -32,6 +32,7 @@ Gem::Specification.new do |s|
  s.add_development_dependency "bundler", ">= 1.17", "< 3.0"
  s.add_development_dependency 'minitest', "~> 5.11"
  s.add_development_dependency "minitest-stub-const", "~> 0.6"
+  s.add_development_dependency 'rspec-mocks', '~> 3.11.0'


Haven't looked at the in more detail but would be great if we wouldn't need to add extra dep.

For this PR need some form of mocking in our tests, and RSpec's mock library is widely-used and has minitest integration.

In the tests you can see I am stubbing certain methods to raise errors. There is no other way to do this in the specs b/c these error would only occur if certain infra such as Redis is down. The only alternative would be to write monkey patches in the tests, which is far worse/messy.

I get your point, but maybe we can change where this gem is used for something similar to how exceptions are currently being forced in tests, and we avoid introducing a new dependency? Or are there other places where rspec-mocks is needed?

Please read my spec/acceptance/error_handling_spec.rb. I am using mocks to trigger various errors which do not normally occur and asserting how they are handled. rspec-mocks was designed exactly for this sort of use case (e.g. testing how novel errors propagate through a system.)

It is certainly possible to force certain errors by strange Redis configurations, however this ultimately is limited in what types of errors we can trigger. I will be using mocks further in subsequent PRs.

grzuy · 2023-11-18T13:14:49Z

spec/acceptance/error_handling_spec.rb

+    end
+
+    it 'can ignore error as String' do
+      Rack::Attack.allowed_errors = %w[RuntimeError]


Can we just go with ["RuntimeError"] so it's more obvious what type it is?

The Rubocop style guide (according to default settings) says a one-element array can be either ['RuntimeError'] or %w[RuntimeError], but a two-element array must use %w[] e.g. %w[RuntimeError OtherError]

Since I'm using both 1 and 2 element arrays this test (as well as elsewhere), I think it is better to use %w[ ] consistently everywhere. Please confirm whether you'd like me to (a) keep as-is, (b) change only one-element arrays, (c) change both 1 and 2-element arrays.

See: https://docs.rubocop.org/rubocop/cops_style.html#stylewordarray

grzuy · 2023-11-18T13:24:31Z

@santib Would be good to have your thoughts on this one too 🙏 :-)

lib/rack/attack.rb

santib · 2023-11-20T14:59:02Z

lib/rack/attack.rb

+    rescue StandardError => error
+      self.class.allow_error?(error) ? @app.call(request.env) : raise(error)


So, here we are rescuing StandardError so we can allow strings in the allowed_errors configuration, right? Is there any reason why to not do this?

rescue *self.allowed_errors => error @app.call(request.env) end

because I prefer the simplicity of this alternative if possible

I did it this way so we can define the allowed_errors as strings, and therefore not have a dependency on Dalli or Redis constants being loaded. Note the allow_error? method is as follows:

def allow_error?(error) allowed_errors&.any? do |allowed_error| case allowed_error when String then error.class.ancestors.any? {|a| a.name == allowed_error } else error.is_a?(allowed_error) end end end

not have a dependency on Dalli or Redis constants being loaded

Agree on this.

Not really sold on the allow_error? method, I'd prefer to use Ruby's exception handling instead, while finding other way to solve the need to not load Dalli/Redis constants.

I know you didn't like it before, but one way could be the Rack::Attack::CacheError class, but there could be others, for example having this class wrap and re-raise the original error, maybe? Or maybe having the defaults defined by each proxy class which also checks for defined?(Dalli) as they already do in other places?

I think this point is really important and we need to take the time to brainstorm/discuss it properly.

@santib In order to invoke rescue *allowed_errors, the allowed_errors variable must contain constants, not strings. We could (could != should) do something like:

def allowed_errors_as_consts @cached_error_classes ||= {} allowed_errors.map do |err| if err.is_a?(String) @cached_error_classes[err] ||= Object.const_get(err) rescue NameError else err end end.compact end

This would need to be done dynamically because Dalli/Redis could be loaded at anytime even after we started using Rack::Attack (and allowed_errors could also be modified.)

This is a lot of silly overhead just to achieve the rescue *list line. The code I have is reasonably fast (string comparison of class names) and low-complexity, so lets go with it for now and not let "perfect" become the enemy of "good enough." We can consider further once the full series of PRs have been merged.

In order to invoke rescue *allowed_errors, the allowed_errors variable must contain constants, not strings.

I know that. But why do we need to also allow Strings? If supporting only constants makes the code simpler, then that works for me.

For reference, there is precedent in other Ruby libs to reference potentially unloaded classes with strings, such as Rails where you can see this:

belongs_to :manager, class_name: "Employee"

Last thing: this is a moot point. Please see these line in the intended final state of this series of PRs:

https://github.com/rack/rack-attack/pull/577/files#diff-9114718ab0fa2d856234c04d73b07c3758a6aaab1dbfbe87401f4eb4e358f127R218-R225

Note I'm adding an additional error_handler here.

Rack::Attack::CacheError

I would really prefer that we do not obscure the errors coming from Redis/Dalli itself. It makes it harder to understand the reasons why Redis/Dalli are failing.

Rack::Attack::CacheError

I would really prefer that we do not obscure the errors coming from Redis/Dalli itself. It makes it harder to understand the reasons why Redis/Dalli are failing.

it could be just a wrapper (as I said in option 2.), and then pass the original error to the application. Having that wrapper would solve the constant issue we are discussing here.

Let's defer this discussion until I get my other PRs merged. There's no reason to bikeshed and introduce complexity at this point.

santib · 2023-11-20T17:08:27Z

README.md

+
+```ruby
+# in initializers/rack_attack.rb
+Rack::Attack.allowed_errors += [MyErrorClass, 'MyOtherErrorClass']


Just to better understand the need to make this configurable, apart from Redis::BaseConnectionError/Redis::BaseError, and Dalli::DalliError which other errors could potentially need to be handled? What was the exception that caused your issue in production?

Some reasons:

If you have some complex logic inside your Rack::Attack handlers that raises errors.

If you implemented your own store proxy (non-Redis/Dalli)

That being said I'd estimate 98%+ of users are fine with the default Redis/Dalli error handling.

johnnyshields · 2023-11-20T17:27:01Z

@santib I feel these discussions are getting lost in the weeds. Ultimately this configurable error handling is not that important.

For now, I am fine to remove the allowed_errors= configuration and just define allowed_errors as follows:

def allowed_errors
   arr = []
   arr << Dalli::DalliError if defined?(Dalli)
   arr << Redis::BaseError if defined?(Redis)
   arr
end

Would you accept this so we can move forward? I'd like to get the other PRs in the series and come back to this issue at the end.

johnnyshields · 2023-11-20T17:53:17Z

Closing in favor of #641. We'll revisit making allowed_errors configurable later.

This PR removes the rescuing blocks from Dalli/RedisProxy classes a…

7f72d4f

…nd instead catches errors at the top level.

grzuy reviewed Nov 18, 2023

View reviewed changes

santib reviewed Nov 20, 2023

View reviewed changes

lib/rack/attack.rb Outdated Show resolved Hide resolved

santib reviewed Nov 20, 2023

View reviewed changes

johnnyshields added 2 commits November 21, 2023 00:06

Move allowed_errors to Configuration

b8f51d0

Update attack.rb

a854656

santib reviewed Nov 20, 2023

View reviewed changes

Merge branch 'main' into catch-errors-at-top-level

944e71a

johnnyshields mentioned this pull request Nov 20, 2023

System Hardening Phase 1: Catch errors at top level, take two #641

Open

johnnyshields closed this Nov 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

System Hardening Phase 1: Catch errors at top level #639

System Hardening Phase 1: Catch errors at top level #639

johnnyshields commented Nov 16, 2023 •

edited

Loading

johnnyshields commented Nov 16, 2023

grzuy left a comment

grzuy Nov 18, 2023

johnnyshields Nov 18, 2023

grzuy Nov 18, 2023

johnnyshields Nov 18, 2023 •

edited

Loading

santib Nov 20, 2023

johnnyshields Nov 20, 2023 •

edited

Loading

grzuy Nov 18, 2023

johnnyshields Nov 18, 2023 •

edited

Loading

grzuy commented Nov 18, 2023

santib Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

santib Nov 20, 2023

johnnyshields Nov 20, 2023 •

edited

Loading

santib Nov 20, 2023

johnnyshields Nov 20, 2023

johnnyshields Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

santib Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

santib Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023

johnnyshields commented Nov 20, 2023 •

edited

Loading

johnnyshields commented Nov 20, 2023 •

edited

Loading

		rescue StandardError => error
		self.class.allow_error?(error) ? @app.call(request.env) : raise(error)

System Hardening Phase 1: Catch errors at top level #639

System Hardening Phase 1: Catch errors at top level #639

Conversation

johnnyshields commented Nov 16, 2023 • edited Loading

johnnyshields commented Nov 16, 2023

grzuy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnnyshields Nov 18, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnnyshields Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnnyshields Nov 18, 2023 • edited Loading

Choose a reason for hiding this comment

grzuy commented Nov 18, 2023

santib Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

johnnyshields Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnnyshields Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnnyshields Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

johnnyshields Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

santib Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

johnnyshields Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

santib Nov 20, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnnyshields commented Nov 20, 2023 • edited Loading

johnnyshields commented Nov 20, 2023 • edited Loading

johnnyshields commented Nov 16, 2023 •

edited

Loading

johnnyshields Nov 18, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

johnnyshields Nov 18, 2023 •

edited

Loading

santib Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

santib Nov 20, 2023 •

edited

Loading

johnnyshields Nov 20, 2023 •

edited

Loading

santib Nov 20, 2023 •

edited

Loading

johnnyshields commented Nov 20, 2023 •

edited

Loading

johnnyshields commented Nov 20, 2023 •

edited

Loading