Releases: oban-bg/oban

v2.13.2

19 Aug 15:15

Bug Fixes

  • [Oban] Fix insert/3 and insert_all/3 when using options.

    Multiple default arguments caused a conflict for function calls with options but without an Oban instance name, e.g. Oban.insert(changeset, timeout: 500)

  • [Reindexer] Fix the unused index repair query and correctly report errors.

    Reindexing and deindexing would fail silently because the results weren't checked, and no exceptions were raised.

v2.13.1

10 Aug 00:41

Bug Fixes

  • [Oban] Expand insert/insert_all typespecs for multi arity

    This fixes dialyzer issues from the introduction of opts to Oban.insert and Oban.insert_all functions.

  • [Reindexer] Allow specifying timeouts for all queries

    In some cases, applying REINDEX INDEX CONCURRENTLY to the oban_jobs_args_index and oban_jobs_meta_index indexes takes longer than the default timeout (15 seconds). This new option allows clients to specify a value other than the default.
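
As a rough sketch, a longer timeout might be configured as shown below. This assumes the option is named timeout and is given in milliseconds; :my_app is a placeholder application name, and the Oban.Plugins.Reindexer docs remain the reference for the exact options.

```elixir
# config/config.exs (illustrative fragment; :my_app is a placeholder)
import Config

config :my_app, Oban,
  plugins: [
    # Assumed option name: allow each reindex query up to 60 seconds
    # instead of the 15 second default mentioned above.
    {Oban.Plugins.Reindexer, timeout: :timer.seconds(60)}
  ]
```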

v2.13.0

21 Jul 16:38

Cancel Directly from Job Execution

Discard was initially intended to mean "a job exhausted all retries." Later, it was added as a return type for perform/1, and it came to mean either "stop retrying" or "exhausted retries" ambiguously, with no clear way to differentiate. Even later, we introduced cancel with a cancelled state as a way to stop jobs at runtime.

To repair this dichotomy, we're introducing a new {:cancel, reason} return type that transitions jobs to the cancelled state:

case do_some_work(job) do
  {:ok, _result} = ok ->
    ok

  {:error, :invalid} ->
-   {:discard, :invalid}
+   {:cancel, :invalid}

  {:error, _reason} = error ->
    error
end

With this change we're also deprecating the use of discard from perform/1 entirely! The meaning of each action/state is now:

  • cancel: this job was purposefully stopped from retrying, either by a return value or by the cancel command triggered by a human

  • discard: this job has exhausted all retries and was transitioned by the system

You're encouraged to replace usage of :discard with :cancel throughout your application's workers, but :discard is only soft-deprecated and undocumented now.

Public Engine Behaviour

Engines are responsible for all non-plugin database interaction, from inserting through executing jobs. They're also the intermediate layer that makes Pro's SmartEngine possible.

Along with documenting the Engine this also flattens its name for parity with other "extension" modules. For the sake of consistency with notifiers and peers, the Basic and Inline engines are now Oban.Engines.Basic and Oban.Engines.Inline, respectively.

v2.13.0 — 2022-07-22

Enhancements

  • [Telemetry] Add encode option to make JSON encoding optional for attach_default_logger/1.

    Now it's possible to use the default logger in applications that prefer structured logging or use a standard JSON log formatter.

  • [Oban] Accept a DateTime for the :with_scheduled option when draining.

    When a DateTime is provided, drains all jobs scheduled up to and including that point in time.

  • [Oban] Accept extra options for insert/2,4 and insert_all/2,4.

    These are typically Ecto's standard "Shared Options" such as log and timeout. Other engines, such as Pro's SmartEngine, may support additional options.

  • [Repo] Add aggregate/4 wrapper to facilitate aggregates from plugins or other extensions that use Oban.Repo.
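
A brief sketch of the encode and :with_scheduled enhancements above, assuming the default Oban instance is running. The :default queue name is a placeholder, and the one hour window is arbitrary.

```elixir
# Skip JSON encoding in the default logger, e.g. when a JSON log
# formatter elsewhere in the application already handles encoding.
Oban.Telemetry.attach_default_logger(encode: false)

# Drain a queue, including jobs scheduled up to one hour from now.
# The :default queue name is a placeholder.
one_hour_from_now = DateTime.add(DateTime.utc_now(), 3600, :second)

Oban.drain_queue(queue: :default, with_scheduled: one_hour_from_now)
```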

Bug Fixes

  • [Oban] Prevent empty maps from matching non-empty maps during uniqueness checks.

  • [Oban] Handle discarded and exhausted states for inline testing mode.

    Previously, returning a :discard tuple or exhausting attempts would cause an error.

  • [Peer] Default leader? check to false on peer timeout.

    Timeouts should be rare, as they're symptoms of application/database overload. If leadership can't be established it's safe to assume an instance isn't leader and log a warning.

  • [Peer] Use node-specific lock requester id for Global peers.

    Occasionally a peer module may hang while establishing leadership. In this case the peer isn't yet a leader, and we can fall back to false.

  • [Config] Validate options only after applying normalizations.

  • [Migrations] Allow any viable prefix in migrations.

  • [Reindexer] Drop invalid Oban indexes before reindexing again.

    Table contention that occurs during concurrent reindexing may leave indexes in an invalid and unusable state. Those indexes aren't used by Postgres and they take up disk space. Now the Reindexer will drop any invalid indexes before attempting to reindex.

  • [Reindexer] Only concurrently rebuild args and meta GIN indexes.

    The new indexes option can be used to override which indexes are rebuilt, in place of the defaults.

    The other two standard indexes (primary key and compound fields) are BTREE based and not as subject to bloat.

  • [Testing] Fix testing mode for perform_job and alt engines, e.g. Inline

    A couple of changes enabled this compound fix:

    1. Removing the engine override within config and exposing a centralized engine lookup instead.
    2. Controlling post-execution db interaction with a new ack option for the Executor module.
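
For the Reindexer indexes option mentioned in the list above, a configuration might look roughly like this. It assumes the option takes a list of index names as strings; :my_app is a placeholder, and the index name comes from the v2.13.1 notes.

```elixir
# config/config.exs (illustrative fragment; :my_app is a placeholder)
import Config

config :my_app, Oban,
  plugins: [
    # Assumed shape: rebuild only the args GIN index rather than
    # the default args and meta indexes.
    {Oban.Plugins.Reindexer, indexes: ["oban_jobs_args_index"]}
  ]
```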

Deprecations

  • [Oban] Soft replace discard with cancel return value (#730) [Parker Selbert]

v2.12.1

26 May 12:22

Bug Fixes

  • [BasicEngine] Never fetch jobs that have reached max attempts

    This adds a safeguard to the fetch_jobs function to prevent ever hitting the attempt <= max_attempts check constraint. Hitting the constraint causes the query to fail, which crashes the producer and starts an infinite loop of crashes. The previous commit should prevent this situation from occurring at the "staging" level, but to be absolutely safe this change prevents it at the "fetching" level too.

    There is a very minor performance hit from this change because the query can no longer run as an index only scan. For systems with a modest number of available jobs the performance impact is indistinguishable.

  • [Plugins] Prevent unexpectedly modifying jobs selected by subqueries

    Most applications don't run at a serializable isolation level. That allows subqueries to run within a transaction without having the conditions rechecked—only predicates on UPDATE or DELETE are re-checked, not on subqueries. That allows a race condition where rows may be updated without another evaluation.

  • [Repo] Set query_opts in Repo.transaction options to prevent logging begin and commit events in development loggers.

  • [BasicEngine] Remove the ORDER BY clause from unique queries

    The previous ORDER BY id DESC significantly hurt unique query performance when there were a lot of potential jobs to check. The ordering was originally added to make test cases predictable and isn't important for the actual behavior of the unique check.

v2.12.0

21 Apr 22:20

Oban v2.12 was dedicated to enriching the testing experience and expanding config, plugin, and queue validation across all environments.

Testing Modes

Testing modes bring a new, vastly improved way to configure Oban for testing. The new testing option makes it explicit that Oban should operate in a restricted mode for the given environment.

Behind the scenes, the new testing modes rely on layers of validation within Oban's Config module. Now production configuration is validated automatically during test runs. Even though queues and plugins aren't started in the test environment, their configuration is still validated.

To switch, stop overriding plugins and queues and enable a testing mode in your test.exs config:

config :my_app, Oban, testing: :manual

Testing in :manual mode is identical to testing in older versions of Oban: jobs won't run automatically so you can use helpers like assert_enqueued and execute them manually with Oban.drain_queue/2.
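
A minimal sketch of a :manual mode test follows. All MyApp.* modules, the sign_up function, and the :mailers queue are hypothetical, and database sandbox setup is assumed to happen elsewhere (for example in a shared case template).

```elixir
defmodule MyApp.SignupTest do
  use ExUnit.Case, async: true
  # Brings in helpers such as assert_enqueued; MyApp.Repo is a placeholder.
  use Oban.Testing, repo: MyApp.Repo

  test "signing up enqueues a welcome email job that runs on drain" do
    # Hypothetical function that inserts a MyApp.WelcomeWorker job
    {:ok, _user} = MyApp.Accounts.sign_up(%{email: "user@example.com"})

    # In :manual mode the job sits in the database until it is drained
    assert_enqueued worker: MyApp.WelcomeWorker, args: %{email: "user@example.com"}

    # Execute the queued job synchronously within the test process
    assert %{success: 1, failure: 0} = Oban.drain_queue(queue: :mailers)
  end
end
```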

An alternate :inline mode allows Oban to bypass all database interaction and run jobs immediately in the process that enqueued them.

config :my_app, Oban, testing: :inline

Finally, new testing guides cover test setup, unit testing workers, integration testing queues, and testing dynamic configuration.

Global Peer Module

Oban v2.11 introduced centralized leadership via Postgres tables. However, Postgres based leadership isn't always a good fit. For example, an ephemeral leadership mechanism is preferred for integration testing.

In that case, you can make use of the new :global powered peer module for leadership:

config :my_app, Oban,
  peer: Oban.Peers.Global,
  ...

2.12.0 — 2022-04-21

Enhancements

  • [Oban] Replace queue, plugin, and peer test configuration with a single :testing option. Now configuring Oban for testing only requires one change, setting the test mode to either :inline or :manual.

    • :inline—jobs execute immediately within the calling process and without touching the database. This mode is simple and may not be suitable for apps with complex jobs.
    • :manual—jobs are inserted into the database where they can be verified and executed when desired. This mode is more advanced and trades simplicity for flexibility.
  • [Testing] Add with_testing_mode/2 to temporarily change testing modes within the context of a function.

    Once the application starts in a particular testing mode it can't be changed. That's inconvenient if you're running in :inline mode and don't want a particular job to execute inline.

  • [Config] Add validate/1 to aid in testing dynamic Oban configuration.

  • [Config] Validate full plugin and queue options on init, without the need to start plugins or queues.

  • [Peers.Global] Add an alternate :global powered peer module.

  • [Plugin] A new Oban.Plugin behaviour formalizes starting and validating plugins. The behaviour is implemented by all plugins and is the foundation of enhanced config validation.

  • [Plugin] Emit [:oban, :plugin, :init] event on init from every plugin.
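
The with_testing_mode/2 and validate/1 additions above could be exercised along these lines. This is a sketch that assumes an app running in :inline mode, a hypothetical MyApp.ReportWorker, and that validate/1 returns :ok for valid options.

```elixir
defmodule MyApp.ReportTest do
  use ExUnit.Case, async: true
  use Oban.Testing, repo: MyApp.Repo

  test "a job can be enqueued without executing inline" do
    # Temporarily switch to :manual mode for the duration of the function
    Oban.Testing.with_testing_mode(:manual, fn ->
      {:ok, _job} = Oban.insert(MyApp.ReportWorker.new(%{id: 123}))

      assert_enqueued worker: MyApp.ReportWorker, args: %{id: 123}
    end)
  end

  test "dynamically built configuration is valid" do
    # Options that might be assembled at runtime; values are placeholders
    opts = [repo: MyApp.Repo, queues: [default: 10]]

    assert :ok = Oban.Config.validate(opts)
  end
end
```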

Bug Fixes

  • [Executor] Skip timeout check with an unknown worker

    When the worker can't be resolved we don't need to check the timeout. Doing so prevents returning a helpful "unknown worker" message and instead causes a function error for nil.timeout/1.

  • [Testing] Include log and prefix in generated conf for perform_job.

    The opts, and subsequent conf, built for perform_job didn't include the prefix or log options. That prevented functions that depend on a job's conf within perform/1 from running with the correct options.

  • [Drainer] Retain the currently configured engine while draining a queue.

  • [Watchman] Skip pausing queues when shutdown is immediate. This prevents queues from interacting with the database during short test runs.

v2.11.0

13 Feb 16:09

Oban v2.11 Upgrade Guide

⚠️📓 Oban v2.11 requires a v11 migration, Elixir v1.11+ and Postgres v10.0+

Oban v2.11 focused on reducing database load, bolstering telemetry-powered introspection, and improving the production experience for all users. To that end, we've extracted functionality from Oban Pro and switched to a new global coordination model.

Leadership

Coordination between nodes running Oban is crucial to how many plugins operate. Staging jobs once a second from multiple nodes is wasteful, as is pruning, rescuing, or scheduling cron jobs. Prior Oban versions used transactional advisory locks to prevent plugins from running concurrently, but there were some issues:

  • Plugins don't know if they'll take the advisory lock, so they still need to run a query periodically.

  • Nodes don't usually start simultaneously, and time drifts between machines. There's no guarantee that the top of the minute for one node is the same as another's—chances are, they don't match.

Oban 2.11 introduces a table-based leadership mechanism that guarantees only one node in a cluster, where "cluster" means a bunch of nodes connected to the same Postgres database, will run plugins. Leadership is transparent and designed for resiliency with minimum chatter between nodes.

See the Upgrade Guide for instructions on how to create the peers table and get started with leadership. If you're curious about the implementation details or want to use leadership in your application, take a look at the docs for Oban.Peer.
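
As a small illustration of using leadership from application code, something like the following may work. It assumes the default Oban instance name and a leader?/0 check as described in the Oban.Peer docs; MyApp.Cleanup is a hypothetical module.

```elixir
# Run node-unique work only on the instance that currently holds leadership.
# MyApp.Cleanup is hypothetical; the default Oban instance name is assumed.
if Oban.Peer.leader?() do
  MyApp.Cleanup.run()
end
```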

Alternative PG (Process Groups) Notifier

Oban relies heavily on PubSub, and until now it only provided a Postgres adapter. Postgres is amazing, and has a highly performant PubSub option, but it doesn't work in every environment (we're looking at you, PG Bouncer).

Fortunately, many Elixir applications run in a cluster connected by distributed Erlang. That means Process Groups, aka PG, is available for many applications.

So, we pulled Oban Pro's PG notifier into Oban to make it available for everyone! If your app runs in a proper cluster, you can switch over to the PG notifier:

config :my_app, Oban,
  notifier: Oban.Notifiers.PG,
  ...

Now there are two notifiers to choose from, each with their own strengths and weaknesses:

  • Oban.Notifiers.Postgres - Pros: doesn't require distributed Erlang, publishes insert events to trigger queues; Cons: doesn't work with PG Bouncer in transaction mode, doesn't work in tests because of the sandbox.

  • Oban.Notifiers.PG - Pros: works with PG Bouncer in transaction mode, works in tests; Cons: requires distributed Erlang, doesn't publish insert events.

Basic Lifeline Plugin

When a queue's producer crashes or a node shuts down before a job finishes executing, the job may be left in an executing state. The worst part is that these jobs—which we call "orphans"—are completely invisible until you go searching through the jobs table.

Oban Pro has always had a "Lifeline" plugin for just this occasion, and now we've brought a basic Lifeline plugin to Oban.

To automatically rescue orphaned jobs that are still executing, include the Oban.Plugins.Lifeline in your configuration:

config :my_app, Oban,
  plugins: [Oban.Plugins.Lifeline],
  ...

Now the plugin will search and rescue orphans after they've lingered for 60 minutes.
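
If 60 minutes is too long for your workload, the interval can likely be tuned. The sketch below assumes a rescue_after option given in milliseconds; :my_app is a placeholder, and the 30 minute value is arbitrary.

```elixir
# config/config.exs (illustrative fragment; :my_app is a placeholder)
import Config

config :my_app, Oban,
  plugins: [
    # Assumed option: rescue orphans after 30 minutes rather than 60
    {Oban.Plugins.Lifeline, rescue_after: :timer.minutes(30)}
  ]
```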

🌟 Note: The Lifeline plugin may transition jobs that are genuinely executing and cause duplicate execution. For more accurate rescuing or to rescue jobs that have exhausted retry attempts see the DynamicLifeline plugin in Oban Pro.

Reindexer Plugin

Over time various Oban indexes (heck, any indexes) may grow without VACUUM cleaning them up properly. When this happens, rebuilding the indexes will release bloat and free up space in your Postgres instance.

The new Reindexer plugin makes index maintenance painless and automatic by periodically rebuilding all of your Oban indexes concurrently, without any locks.

By default, reindexing happens once a day at midnight UTC, but it's configurable with a standard cron expression (and timezone).

config :my_app, Oban,
  plugins: [Oban.Plugins.Reindexer],
  ...

See Oban.Plugins.Reindexer for complete options and implementation details.
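
For instance, a weekly rebuild might be configured as below. This is a sketch that assumes the cron expression is passed through a schedule option; :my_app is a placeholder.

```elixir
# config/config.exs (illustrative fragment; :my_app is a placeholder)
import Config

config :my_app, Oban,
  plugins: [
    # Assumed option: rebuild indexes weekly instead of nightly
    {Oban.Plugins.Reindexer, schedule: "@weekly"}
  ]
```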

Improved Telemetry and Logging

The default telemetry-backed logger includes more job fields and metadata about execution, most notably the execution state and formatted error reports when jobs fail.

Here's an example of the default output for a successful job:

{
  "args":{"action":"OK","ref":1},
  "attempt":1,
  "duration":4327295,
  "event":"job:stop",
  "id":123,
  "max_attempts":20,
  "meta":{},
  "queue":"alpha",
  "queue_time":3127905,
  "source":"oban",
  "state":"success",
  "tags":[],
  "worker":"Oban.Integration.Worker"
}

Now, here's a sample where the job has encountered an error:

{
  "attempt": 1,
  "duration": 5432,
  "error": "** (Oban.PerformError) Oban.Integration.Worker failed with {:error, \"ERROR\"}",
  "event": "job:exception",
  "state": "failure",
  "worker": "Oban.Integration.Worker"
}

2.11.0 — 2022-02-13

Enhancements

  • [Migration] Change the order of fields in the base index used for the primary Oban queries.

    The new order is much faster for frequent queries such as scheduled job staging. Check the v2.11 upgrade guide for instructions on swapping the index in existing applications.

  • [Worker] Avoid spawning a separate task for workers that use timeouts.

  • [Engine] Add insert_job, insert_all_jobs, retry_job, and retry_all_jobs as required callbacks for all engines.

  • [Oban] Raise more informative error messages for missing or malformed plugins.

    Now missing plugins have a different error from invalid plugins or invalid options.

  • [Telemetry] Normalize telemetry metadata for all engine operations:

    • Include changeset for insert
    • Include changesets for insert_all
    • Include job for complete_job, discard_job, etc
  • [Repo] Include [oban_conf: conf] in telemetry_options for all Repo operations.

    With this change it's possible to differentiate between database calls made by Oban versus the rest of your application.
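
One way to make use of the oban_conf telemetry option is sketched below. It assumes a repo named MyApp.Repo with Ecto's default telemetry prefix, and that Ecto exposes the telemetry options under the :options metadata key; MyApp.QueryLogger is a hypothetical module.

```elixir
defmodule MyApp.QueryLogger do
  require Logger

  # Attach once, e.g. from Application.start/2. The event name assumes a
  # repo named MyApp.Repo using the default telemetry prefix.
  def attach do
    :telemetry.attach(
      "my-app-query-logger",
      [:my_app, :repo, :query],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event(_event, measurements, metadata, _config) do
    source = if oban_query?(metadata), do: "oban", else: "app"
    total = Map.get(measurements, :total_time, 0)
    millis = System.convert_time_unit(total, :native, :millisecond)

    Logger.debug("[#{source}] query took #{millis}ms")
  end

  # Queries issued through Oban carry [oban_conf: conf] in their options
  defp oban_query?(metadata) do
    metadata
    |> Map.get(:options)
    |> List.wrap()
    |> Keyword.has_key?(:oban_conf)
  end
end
```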

Bug Fixes

  • [Telemetry] Emit discard rather than error events when a job exhausts all retries.

    Previously discard_job was only called for manual discards, i.e., when a job returned :discard or {:discard, reason}. Discarding for exhausted attempts was done within error_job in error cases.

  • [Cron] Respect the current timezone for @reboot jobs. Previously, @reboot expressions were evaluated on boot without the timezone applied. In that case the expression may not match the calculated time and jobs wouldn't trigger.

  • [Cron] Delay CRON evaluation until the next minute after initialization. Now all cron scheduling occurs reliably at the top of the minute.

  • [Drainer] Introduce discard accumulator for draining results. Now exhausted jobs along with manual discards count as a discard rather than a failure or success.

  • [Oban] Expand changeset wrapper within multi function.

    Previously, Oban.insert_all could handle a list of changesets, a wrapper map with a :changesets key, or a function. However, the function had to return a list of changesets rather than a changeset wrapper. This was unexpected and made some multis awkward.

  • [Testing] Preserve attempted_at/scheduled_at in perform_job/3 rather than overwriting them with the current time.

  • [Oban] Include false as a viable queue or plugin option in typespecs

Deprecations

  • [Telemetry] Hard deprecate Telemetry.span/3, previously it was soft-deprecated.

Removals

  • [Telemetry] Remove circuit breaker event documentation because :circuit events aren't emitted anymore.

v2.10.1

09 Nov 22:17

The previous release, v2.10.0, was immediately retired in favor of this version.

Removed

  • [Oban.Telemetry] Remove the customizable prefix for telemetry events in favor of workarounds such as keep/drop in Telemetry Metrics.

v2.10.0

09 Nov 21:22

Added

  • [Oban.Telemetry] Add customizable prefix for all telemetry events.

    For example, a telemetry prefix of [:my_app, :oban] would span job start telemetry events as [:my_app, :oban, :job, :start]. The default is [:oban], which matches the existing functionality.

Fixed

  • [Oban.Plugins.Stager] Use the notifier to broadcast inserted and available jobs rather than inlining them into a Postgres query.

    With this change the notifier is entirely swappable and there isn't any reason to use the Repeater plugin in production.

  • [Oban.Plugins.Cron] Validate job options on init.

    Providing invalid job options in the crontab, e.g. priority: 5 or unique: [], wasn't caught until runtime. At that point each insert attempt would fail, crashing the plugin.

  • [Oban.Queue.Producer] Prevent crashing on exception formatting when a job exits without a stacktrace, most notably with {:EXIT, pid}.

  • [Oban.Testing] Return invalid results from perform_job, rather than always returning nil.

  • [Oban] Validate that a queue exists when controlling or checking locally, e.g. calls to Oban.check_queue or Oban.scale_queue.

  • [Oban.Telemetry] Use module capture for telemetry logging to prevent warnings.

v2.9.2

28 Sep 01:25
  • [Oban] Loosen telemetry requirement to allow either 0.4 or 1.0 without forcing apps to use an override.

v2.9.1

28 Sep 01:25

Fixed

  • [Oban] Correctly handle prefix in cancel_job and cancel_all_jobs.

  • [Oban] Safely guard against empty changeset lists passed to insert_all/2,4.