
Load government data in an asynchronous task #1522

Merged
30 commits merged on Dec 17, 2019

Commits on Dec 13, 2019

  1. Create a module for BulkData

    BulkData is the namespace we will use for periodically downloading large
    amounts of data for use in Content Publisher. Examples of candidates for
    this are topics, contacts and organisations: data entities where we need
    to download all of the Publishing API's data rather than a small sample
    on demand.

    Previously we loaded this data on demand, which risks periodic timeouts
    for the application. Because the data can be loaded in numerous places,
    it also makes performance issues difficult to pin down, as the loading
    may be a contributing factor to any other task we think is slow.

    This module contains two common exceptions that we anticipate using.
    LocalDataUnavailableError is for situations where we expect data to be
    available in our cache but it is not present. RemoteDataUnavailableError
    is for when an attempt to load remote data fails.
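
    A minimal sketch of how this module might look, based on the description
    above (treating both exceptions as plain RuntimeError subclasses is an
    assumption):

    ```
    module BulkData
      # Raised when we expect data to be present in our cache but it is not.
      class LocalDataUnavailableError < RuntimeError; end

      # Raised when an attempt to load data from a remote source fails.
      class RemoteDataUnavailableError < RuntimeError; end
    end
    ```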
    kevindew committed Dec 13, 2019 (62bab09)

Commits on Dec 15, 2019

  1. BulkData cache singleton class

    This class is intended to be the entry point for writing any BulkData.
    It uses the Rails Redis cache as an underlying store.
    
    Using Redis was a design choice for the following reasons:
    
    - Redis offers caching across multiple GOV.UK boxes, so one write will
    cover all machines
    - Redis is fast to read and write from
    
    And using the Rails cache implementation offered the following
    advantages:
    
    - Can be configured with [middleware][1] to avoid the need to hit Redis
      multiple times in a request
    - Redis namespacing
    - Automatic serialisation/deserialisation
    
    An important concern is that we need to ignore Rails' guidance to use an
    LRU Redis cache, as we don't want keys to be automatically evicted from
    Redis. Instead we just want them to expire naturally.

    The methods on this class allow adding an item to the cache, removing an
    item, and removing all items.

    On top of this there are time-based methods intended to determine when an
    item was added to the cache, so we can check how old something is when
    deciding whether to refresh it. Redis doesn't offer this by default, so
    it's simplest to store the write time as a separate entry. The Rails
    cache entry does hold some information about creation time, but I have no
    idea how to access it, and it's probably nicer not to pull out the whole
    cache entry just to check it.
    
    [1]: https://github.com/rails/rails/blob/9926fd86626690746c7ee78174c4d3577bdfa58d/activesupport/lib/active_support/cache/redis_cache_store.rb#L44
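
    As a rough sketch of the adapter (the method names, the 24 hour expiry
    and the separate "<key>:created" entry used to record the write time are
    assumptions based on the description above):

    ```
    require "singleton"

    module BulkData
      class Cache
        include Singleton

        class << self
          delegate :write, :read, :delete, :clear, :written_at, to: :instance
        end

        def write(key, value)
          store.write(key, value, expires_in: 24.hours)
          # A separate entry records when the key was written, since Redis
          # doesn't expose this by default.
          store.write("#{key}:created", Time.zone.now, expires_in: 24.hours)
        end

        def read(key)
          store.read(key)
        end

        def written_at(key)
          store.read("#{key}:created")
        end

        def delete(key)
          store.delete(key)
          store.delete("#{key}:created")
        end

        def clear
          store.clear
        end

        def store
          @store ||= ActiveSupport::Cache::RedisCacheStore.new(namespace: "bulk-data")
        end
      end
    end
    ```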
    kevindew committed Dec 15, 2019 (a764576)
  2. Adapt Government model for loading from Publishing API

    This sets up the model to work with data from the Publishing API. It no
    longer has class methods to look up the model, as these are going to
    live on a different class.

    The input we expect from the Publishing API is a hash of base_path,
    content_id, details, locale and title. We can look up the started_on and
    ended_on dates and the current flag from the details hash.
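
    A sketch of the reworked model under those expectations (the helper
    methods and the string-keyed access to details are assumptions for
    illustration):

    ```
    class Government
      include InitializeWithHash

      attr_reader :base_path, :content_id, :details, :locale, :title

      def started_on
        Date.parse(details["started_on"])
      end

      def ended_on
        Date.parse(details["ended_on"]) if details["ended_on"]
      end

      def current?
        details["current"] == true
      end

      # Expects a Date; checks whether this government was in office then.
      def covers?(date)
        return false if date < started_on
        return false if ended_on && date > ended_on

        true
      end
    end
    ```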
    kevindew committed Dec 15, 2019 (659c391)
  3. Provide a to_h method on InitializeWithHash

    This allows instances of classes that mix this module in to be serialised
    to a hash. It has been added so that those instances can be serialised
    into fake cache data.
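
    A hedged sketch of the addition, assuming the module assigns each hash
    key to an instance variable on initialisation (the real implementation
    may differ):

    ```
    module InitializeWithHash
      def initialize(params = {})
        params.each { |key, value| instance_variable_set("@#{key}", value) }
      end

      # Serialise the instance back to a hash of its attributes.
      def to_h
        instance_variables.each_with_object({}) do |name, hash|
          hash[name.to_s.delete_prefix("@").to_sym] = instance_variable_get(name)
        end
      end
    end
    ```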
    kevindew committed Dec 15, 2019 (ec71269)

Commits on Dec 16, 2019

  1. GovernmentRepository class for BulkData

    The purpose of this class is to mediate communication between Publishing
    API and our cache adapter for governments.
    
    It writes entries to the cache using a cache key that is defined as a
    constant in this class. The key is named "government-v1" to indicate that
    this is version 1 of the data we are caching. If we make any backwards
    incompatible changes to the data we expect (such as adding an extra
    mandatory field) then we will need to increment this number. This ensures
    that when the code change is deployed the old cache entry will no longer
    be used and will expire naturally.

    It provides a number of methods for finding a particular government and a
    method to populate the cache.
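
    A simplified sketch of the repository; the finder method names, the
    Publishing API query and the error wrapping are assumptions drawn from
    this and the later commit descriptions:

    ```
    class BulkData::GovernmentRepository
      CACHE_KEY = "government-v1"

      def find(content_id)
        all.find { |government| government.content_id == content_id }
      end

      def current
        all.find(&:current?)
      end

      def for_date(date)
        all.find { |government| date && government.covers?(date) }
      end

      def populate_cache(older_than: nil)
        written_at = BulkData::Cache.written_at(CACHE_KEY)
        return if older_than && written_at && written_at > older_than

        # The exact gds-api-adapters call and parameters are illustrative.
        results = GdsApi.publishing_api.get_editions(
          document_types: %w[government],
          fields: %w[base_path content_id details locale title],
          states: %w[published],
          per_page: 1000,
        )["results"]

        BulkData::Cache.write(CACHE_KEY, results)
      rescue GdsApi::BaseError => error
        raise BulkData::RemoteDataUnavailableError, error.message
      end

      private

      def all
        data = BulkData::Cache.read(CACHE_KEY)
        raise BulkData::LocalDataUnavailableError unless data

        data.map { |hash| Government.new(hash) }
      end
    end
    ```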
    kevindew committed Dec 16, 2019 (77a50c2)
  2. Set up spec helpers for bulk data

    This clears the BulkData::Cache after every spec and adds helpers to
    populate the cache before tests.

    Default governments are set up for the most common case, where we just
    need to know about the current and previous governments.

    It also provides helper methods for the current and past governments, as
    these are likely to be needed quite frequently.
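
    A hedged sketch of the kind of helpers this could add (the helper names
    and the factory traits are illustrative, not the exact code):

    ```
    module GovernmentHelpers
      def populate_government_bulk_data(*governments)
        BulkData::Cache.write(
          BulkData::GovernmentRepository::CACHE_KEY,
          governments.map(&:to_h),
        )
      end

      def populate_default_government_bulk_data
        populate_government_bulk_data(past_government, current_government)
      end

      def current_government
        @current_government ||= build(:government)
      end

      def past_government
        @past_government ||= build(:government, :past)
      end
    end

    RSpec.configure do |config|
      config.include GovernmentHelpers
      config.after { BulkData::Cache.clear }
    end
    ```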
    kevindew committed Dec 16, 2019 (2e55195)
  3. Transition edition to use BulkData::GovernmentRepository

    This creates instances of the GovernmentRepository class whenever we
    need to load governments in the context of an edition. The factory has
    been updated to reflect that governments can't be determined without an
    external dependency and thus governments are always injected.
    
    This does raise some potential performance concerns about the number of
    objects initialised if we are frequently looking up government
    information. Mostly though I expect these look-ups not to occur multiple
    times in the same request, so I don't think it'll cause any actual
    performance issues.
    kevindew committed Dec 16, 2019 (10bd447)
  4. 5a5fc1a
  5. 5aff300
  6. Minor refactoring of document request spec

    This removes unnecessary instance variables and reduces long lines; it
    also removes an unnecessary test expectation.
    kevindew committed Dec 16, 2019 (3bac691)
  7. Fix incorrect I18n call

    By using I18n.t we wouldn't raise errors when a translation was missing.
    We don't actually need to call I18n directly, as Rails has a t view
    helper which roughly corresponds to I18n.t!
    kevindew committed Dec 16, 2019 (49240a9)
  8. dbe6b81
  9. Switch government name to title

    This field is referred to as a title in the Publishing API
    kevindew committed Dec 16, 2019 (59e57e4)
  10. Stub governments for tests

    This sets up some default governments in advance of tests so that the
    government data is populated. These are set up before all feature and
    request tests.
    kevindew committed Dec 16, 2019 (7ede62a)
  11. Re-populate governments when working with future dates

    By jumping more than 24 hours into the future the cache entries for
    governments expire, thus the government data needs to be re-stubbed.
    kevindew committed Dec 16, 2019 (5cc6f84)
  12. Adjust request/feature tests for change in government trait

    As we no longer have past_government and current_government traits, this
    injects these objects in instead.
    kevindew committed Dec 16, 2019 (06d58ae)
  13. Helper method to run ActiveJobs exclusively

    This cuts back on the boilerplate needed to run a worker job on a single
    box and deny other boxes that try to run it at the same time. This is a
    situation that occurs with the Sidekiq scheduler, where every box with a
    Sidekiq process tries to run the job at the same time and the first one
    to start it wins.

    We counter this with a PostgreSQL advisory lock, which prevents any other
    client from taking the same lock while it is held (they gracefully exit).
    Ideally we want to take this lock inside a transaction so that if
    anything goes wrong the lock is automatically released, whereas if we
    take it outside a transaction and something crashes we may need to
    release the lock manually.
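
    A minimal sketch of such a helper, assuming a hypothetical
    run_exclusively method on the base job class and an arbitrary lock id:

    ```
    class ApplicationJob < ActiveJob::Base
      # Any fixed bigint works; every job contends for the same lock.
      EXCLUSIVE_LOCK_ID = 1

      def run_exclusively
        ApplicationRecord.transaction do
          # pg_try_advisory_xact_lock returns true only for the first session
          # to claim the lock, and the lock is released automatically when
          # the transaction ends.
          acquired = ApplicationRecord.connection.select_value(
            "SELECT pg_try_advisory_xact_lock(#{EXCLUSIVE_LOCK_ID})",
          )

          yield if acquired
        end
      end
    end
    ```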
    kevindew committed Dec 16, 2019 (edcf545)
  14. Default to fake sidekiq testing

    It was more common for us to use Sidekiq::Testing.fake! than to rely on
    Sidekiq::Testing.inline!, so switching the default saves us code. This is
    also the preferred approach since it keeps us more honest: we have to
    explicitly tell Sidekiq to perform actions rather than expecting them to
    just happen synchronously.
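
    For illustration, using Sidekiq's generic example worker rather than one
    of our own jobs, fake mode queues jobs in memory and a test drains them
    explicitly:

    ```
    # With fake mode as the default, perform_async only enqueues the job.
    HardWorker.perform_async
    expect(HardWorker.jobs.size).to eq(1)

    # When a test actually needs the side effects, run the queued jobs.
    HardWorker.drain
    ```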
    kevindew committed Dec 16, 2019 (c59ee81)
  15. 070b673
  16. Add document live trait that sets first_published_at

    The first_published_at value should always be set on documents that have
    a live edition.
    
    I didn't know a nice way to specify an optional trait via the association
    method, so I have ended up with this array monstrosity. If you know a
    better technique, feel free to change it.
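
    A hedged guess at what that looks like: FactoryBot accepts an array of
    factory name plus traits for an association (which factory hosts the
    association is an assumption):

    ```
    FactoryBot.define do
      factory :document do
        trait :live do
          first_published_at { Time.zone.now }
        end
      end

      factory :edition do
        trait :live do
          # The "array monstrosity": factory name plus trait as an array.
          association :document, factory: %i[document live]
        end
      end
    end
    ```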
    kevindew committed Dec 16, 2019 (a7fd0d4)
  17. 4f42b1b
  18. Minor refactoring of ResyncService specs

    This moves the it block outside of a context to sit before the contexts,
    as per the more common convention. It also removes the government and
    political tests from the draft and live contexts, since they aren't
    effects of those particular contexts.

    Finally it makes the draft and live tests more consistent with the other
    edition state tests by creating an edition, rather than a document, with
    FactoryBot.
    kevindew committed Dec 16, 2019 (3d3245c)
  19. Add job to populate bulk data asynchronously

    This sets up a job that runs every 15 minutes to update the bulk data.
    If accessing the data fails, the job is retried.

    The schedule is set with an interval of 15 minutes and a delay of 0s,
    which makes the job start immediately whenever the worker process is
    restarted. As this job runs on a schedule it will execute on all machines
    at the same time; to counter this, the job is run exclusively using the
    method on the parent class.
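
    A sketch of such a job, reusing the exclusive-run helper from the earlier
    commit; the job name, the retry policy and the 5 minute threshold are
    assumptions based on the descriptions in this PR:

    ```
    class PopulateBulkDataJob < ApplicationJob
      # Retry when the remote data can't be accessed rather than failing.
      retry_on BulkData::RemoteDataUnavailableError, wait: :exponentially_longer

      def perform
        run_exclusively do
          # Skip the refresh if the cached data is less than 5 minutes old.
          BulkData::GovernmentRepository.new.populate_cache(older_than: 5.minutes.ago)
        end
      end
    end
    ```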
    kevindew committed Dec 16, 2019 (69f4964)
  20. Asynchronously load bulk data if it is not available

    This will schedule a background job to load the data if, for some reason,
    it's not available in the current environment and a service has tried to
    use it.

    There is a risk that this could cause a spike in identical queued jobs if
    a bunch of users hit this at the same time. That shouldn't present much
    of a problem, as the job only executes one at a time and won't do
    anything if the data was updated in the last 5 minutes.
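
    A hedged sketch of the fallback, shown here as a revision to the
    repository's cache read (where this logic actually lives is an
    assumption):

    ```
    def all
      data = BulkData::Cache.read(CACHE_KEY)

      unless data
        # Queue a refresh and still raise so the caller can handle the miss.
        PopulateBulkDataJob.perform_later
        raise BulkData::LocalDataUnavailableError
      end

      data.map { |hash| Government.new(hash) }
    end
    ```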
    kevindew committed Dec 16, 2019 (dfb1eb2)
  21. Add an error view for BulkData::LocalDataUnavailableError

    If the application is ever in a state where local data is unavailable, it
    will return a 503 response and a simple error message. As I anticipate
    that developers may often be the people who see this message (at least
    while Sidekiq isn't run by default with the app) I thought it was useful
    to have explicit instructions in a development environment.

    It was pretty hard to think of the best way to test this, so I went with
    this rather contrived example, since documents_show is one of the main
    routes. I expect that when the BulkData technique is used as part of
    index page filters we can replace this test with a less contrived one.
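
    Illustratively, the error handling could look something like this (the
    controller and view path are assumptions):

    ```
    class ApplicationController < ActionController::Base
      rescue_from BulkData::LocalDataUnavailableError do
        # Render the new error view with a 503 Service Unavailable status.
        render "errors/local_data_unavailable", status: :service_unavailable
      end
    end
    ```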
    kevindew committed Dec 16, 2019 (b177dc2)
  22. Remove hardcoded government config

    This is no longer needed given we are now using the Publishing API
    kevindew committed Dec 16, 2019 (d2ad3e7)
  23. Use specific government in history mode spec

    This spec relies on a government existing on the date we specify, which
    caused this test to fail. By creating the government in the test we
    resolve this complication.
    kevindew committed Dec 16, 2019 (be7cb1f)
  24. Use local cache Rack middleware with BulkData::Cache

    This sets up a local cache that wraps around the Redis cache layer of
    BulkData::Cache for the duration of a request.

    In order for this to work in a development environment we need the
    BulkData::Cache class not to be reloaded automatically. I achieved this
    with direct require calls. I first tried using the Rails
    autoload_once_paths option but didn't have much luck.

    As an example, before the explicit require calls:
    
    ```
    irb(main):002:0> BulkData::Cache.object_id
    => 70342543814920
    irb(main):003:0> reload!
    Reloading...
    => true
    irb(main):004:0> BulkData::Cache.object_id
    => 70342595490220
    ```
    
    and after:
    
    ```
    irb(main):002:0> BulkData::Cache.object_id
    => 70249028613320
    irb(main):004:0> reload!
    Reloading...
    => true
    irb(main):005:0> BulkData::Cache.object_id
    => 70249028613320
    ```
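
    A sketch of the wiring, assuming BulkData::Cache exposes its underlying
    RedisCacheStore; the require path and application module name are
    guesses:

    ```
    # config/application.rb
    require_relative "../app/models/bulk_data/cache"

    module ContentPublisher
      class Application < Rails::Application
        # Strategy::LocalCache#middleware memoises cache reads in memory for
        # the duration of each request, so repeated look-ups within a request
        # don't hit Redis again.
        config.middleware.use BulkData::Cache.instance.store.middleware
      end
    end
    ```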
    kevindew committed Dec 16, 2019 (8399dd5)
  25. Configure Redis for resiliency and reporting

    By default the RedisCacheStore takes a fault-tolerant approach: it
    returns nil if it can't access Redis and writes to a logger that we
    haven't configured [1].

    This isn't an ideal configuration for our use case, as it's a critical
    problem if the cache isn't available. To try to resolve this we allow 3
    reconnect attempts if we don't manage to connect to Redis or have lost
    our connection.

    In the event that we have an error reading from the cache, this will
    report the error to Sentry, which will hopefully make problems easier to
    diagnose.
    
    [1]: https://github.com/rails/rails/blob/09a2979f75c51afb797dd60261a8930f84144af8/activesupport/lib/active_support/cache/redis_cache_store.rb#L472-L485
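
    A sketch of the configuration described here; the use of GovukError for
    the Sentry reporting and the extra context are assumptions:

    ```
    ActiveSupport::Cache::RedisCacheStore.new(
      namespace: "bulk-data",
      # Retry the connection a few times before giving up.
      reconnect_attempts: 3,
      # Report cache errors to Sentry instead of silently logging them.
      error_handler: lambda do |method:, returning:, exception:|
        GovukError.notify(exception, extra: { method: method, returning: returning })
      end,
    )
    ```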
    kevindew committed Dec 16, 2019 (fa27e4c)
  26. Don't report BulkData::RemoteDataUnavailableError on server errors

    There's not a whole lot we can do when a dependent service is erroring,
    so let's not report it and wait to see if it causes problems in the app.

    This resolves an issue where we started getting server errors overnight
    in integration at the point of the data sync.
    kevindew committed Dec 16, 2019 (d395aa5)