This gem was created to remove the one-to-one coupling between an index and an ActiveModel class by implementing an 'indexing strategies' pattern, which allows much more flexibility in the design of your Elasticsearch indices.
- A model can contain one or more 'strategies' (maybe you want to analyze/search data in different ways?)
- A strategy can involve one or more Elasticsearch indices (maybe you want to index data by year?)
- A single index can even contain data from multiple models!
Best of all, this has little impact on your models and keeps a clean mental model, because all the business logic lives inside your strategy classes.
The creation of this gem was inspired by Elasticsearch's own elasticsearch-model gem.
Add this line to your application's Gemfile:
gem 'elasticsearch_model_repositories'
And then execute:
bundle
Or install it yourself as:
gem install elasticsearch_model_repositories
Let's create our first indexing strategy by extending ElasticsearchRepositories::BaseStrategy:
class Simple < ElasticsearchRepositories::BaseStrategy
# The index_name for a specific record (used for update/destroy)
def target_index_name(record)
search_index_name
end
# The name of the index when searching data
def search_index_name
host_class._base_index_name # in this case, host_class is Person
end
# The name of the index to use when creating a new record (create)
def current_index_name
search_index_name
end
end
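To see which of the three names is consulted for each operation, here is a minimal plain-Ruby sketch. FakeBaseStrategy and FakePerson are hypothetical stand-ins written only for this illustration (they are not part of the gem), and the index name 'development_people_' is an assumed value.

```ruby
# Hypothetical stand-in for ElasticsearchRepositories::BaseStrategy.
# It only stores the host class so the sketch runs without the gem.
class FakeBaseStrategy
  attr_reader :host_class

  def initialize(host_class)
    @host_class = host_class
  end
end

# Hypothetical model stub exposing only the naming hook the strategy reads.
class FakePerson
  def self._base_index_name
    'development_people_' # assumed value for illustration
  end
end

# Same three methods as the Simple strategy above.
class SimpleSketch < FakeBaseStrategy
  def target_index_name(_record)
    search_index_name
  end

  def search_index_name
    host_class._base_index_name
  end

  def current_index_name
    search_index_name
  end
end

strategy = SimpleSketch.new(FakePerson)

# Which name each operation would use under this strategy.
SIMPLE_INDEX_NAMES = {
  create: strategy.current_index_name,
  update: strategy.target_index_name(:any_record),
  search: strategy.search_index_name
}.freeze
```

With a single-index strategy like this one, all three operations resolve to the same index; the methods only diverge once a strategy spreads data over several indices.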
Here is a list of methods that you may want to consider overriding on your custom strategy:
- target_index_name
- search_index_name
- current_index_name
- reload_indices_iterator
- index_without_id
- as_indexed_json
- index_record_to_es
- reindexing_includes_proc
Now let's register this strategy with a model. Your model must include the following module, which provides the methods for registering strategies, indexing, reindexing, and some other utilities.
include ElasticsearchRepositories::Model
class Person
include ElasticsearchRepositories::Model
after_commit -> (record) {
call_indexing_methods('create', record)
}, on: :create
after_commit -> (record) {
call_indexing_methods('update', record)
}, on: :update
after_commit -> (record) {
call_indexing_methods('delete', record)
}, on: :destroy
# You can register as many strategies as you want to a model.
# By adding code into the block, you can override methods for this specific
# strategy instance
register_strategy Simple do
set_mappings(dynamic: 'true') do
# your mappings for this class
end
def as_indexed_json(record)
# customize serialization for this class
super.merge(record.as_json)
end
end
# Define this method in case you want to customize index naming
def self._base_index_name
"#{Rails.env}_#{self.name.underscore.dasherize.pluralize}_"
end
private
# This method is called by the gem and needs to be implemented
# Here is where you actually index to ES, you may call a Sidekiq worker
# that asynchronously indexes the record.
def index_document(strategy, action, **options)
SomeIndexerWorker.perform_later(
self,
strategy.name,
action,
options
)
end
def call_indexing_methods(event_name, record)
self.class.indexing_strategies.each do |strategy|
strategy.index_record_to_es(event_name, record)
end
end
end
And there we have it: we have registered our Simple strategy with our model and used ActiveRecord's hooks to call the indexing methods.
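The fan-out performed by call_indexing_methods can be sketched in plain Ruby. RecordingStrategy and PersonSketch below are hypothetical stand-ins (not gem classes); the strategy stub only records the calls it receives instead of talking to Elasticsearch, and call_indexing_methods is left public here so it can be called directly.

```ruby
# Hypothetical strategy stub: records each call instead of indexing anything.
class RecordingStrategy
  attr_reader :name, :calls

  def initialize(name)
    @name = name
    @calls = []
  end

  def index_record_to_es(event_name, record)
    @calls << [event_name, record]
  end
end

# Hypothetical model stub with two registered strategies.
class PersonSketch
  STRATEGIES = [
    RecordingStrategy.new('Simple'),
    RecordingStrategy.new('Yearly')
  ].freeze

  def self.indexing_strategies
    STRATEGIES
  end

  # Same body as in the model above: every registered strategy
  # receives the lifecycle event and the record.
  def call_indexing_methods(event_name, record)
    self.class.indexing_strategies.each do |strategy|
      strategy.index_record_to_es(event_name, record)
    end
  end
end
```

Calling PersonSketch.new.call_indexing_methods('create', record) leaves one ['create', record] entry in each strategy's calls list, which mirrors how one after_commit hook reaches every registered strategy.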
To keep our code DRY, let's create a concern that can be added to the models we want to index into Elasticsearch.
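module Searchable
extend ActiveSupport::Concern
class_methods do
include ElasticsearchRepositories::Model::ClassMethods
# Inside class_methods, define the method without `self.` so that it
# becomes a class method of the including model
def _base_index_name
"#{Rails.env}_#{self.name.underscore.dasherize.pluralize}_"
end
end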
included do
include ElasticsearchRepositories::Model::InstanceMethods
after_commit -> (record) {
call_indexing_methods('create', record)
}, on: :create
after_commit -> (record) {
call_indexing_methods('update', record)
}, on: :update
after_commit -> (record) {
call_indexing_methods('delete', record)
}, on: :destroy
private
def call_indexing_methods(event_name, record)
self.class.indexing_strategies.each do |strategy|
strategy.index_record_to_es(event_name, record)
end
end
def index_document(strategy, action, **options)
SomeIndexerWorker.perform_later(
self,
strategy.name,
action,
options
)
end
end
end
We can then add this concern to our model:
class Person
include Searchable
register_strategy Simple do
set_mappings(dynamic: 'true') do
# your mappings for this class
end
def as_indexed_json(record)
# customize serialization for this class
super.merge(record.as_json)
end
end
end
Now that we have our model correctly set up, let's explore some of the methods we have available.
# get the instance of the strategy for the class
simple_indexing_strategy = Person.indexing_strategies.first
#or
simple_indexing_strategy = Person.default_indexing_strategy
##########
#indexing#
##########
person = Person.create(name: 'Yoda') # this will create the record on ES with all registered strategies
person.update(name: 'John') # this will update the record on ES with all registered strategies
person.destroy # this will delete the record from ES with all registered strategies
Person.first.index_with_all_strategies # will index (create/update) the document with all registered strategies
simple_indexing_strategy.index_record_to_es('update', Person.first) # to only this specific strategy
# to serialize the record with a specific strategy
simple_indexing_strategy.as_indexed_json(Person.first) # returns the serialized json
simple_indexing_strategy.search({query: {match_all: {}}})
#reload all registered indices
Person.reload_indices(force: true)
Now let's create an indexing strategy that uses yearly indices:
class Yearly < ElasticsearchRepositories::BaseStrategy
#############
#index naming
#############
# Returns an index name for dated indices
def _build_dated_index_name(date)
host_class._base_index_name + "#{date.year}"
end
# When a search is executed, replace everything after the first digit on the index_name
# with * so all the data is reachable.
def search_index_name
_build_dated_index_name(Time.now).sub(/\d.*/, '*')
end
# Returns the index name of a particular db record
def target_index_name(record)
_build_dated_index_name(record.created_at)
end
def current_index_name
_build_dated_index_name(Time.now)
end
##########
#importing
##########
# Since we have special business logic of how to index the data (divided into yearly indices) we need to override how the data gets reindexed.
def reload_indices_iterator(start_time = nil, end_time = nil)
start_time = (start_time || host_class.minimum('created_at'))&.beginning_of_year
end_time = (end_time || host_class.maximum('created_at'))&.end_of_year
if start_time && end_time
number_of_years = (end_time.year)-(start_time.year) + 1
(0...number_of_years).each do |year_offset|
_start = start_time + year_offset.years
_end = _start + 1.year
# this is the important part, you need to yield the following arguments:
#
# 1) AR relation object with query to fetch records to import
# 2) reindexing options hash
yield(
host_class.where('created_at >= ? and created_at < ?', _start, _end),
{
index: _build_dated_index_name(_start),
settings: settings.to_hash,
mappings: mappings.to_hash,
verify_count_query: { query: { bool:{ filter: [ range: { created_at: { gte: _start, lt: _end } } ] } } }
}
)
end
end
end
end
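The naming and windowing logic of the Yearly strategy can be tried out without Rails or the gem. The sketch below uses only Ruby's standard library; the helper names (dated_index_name, search_pattern, yearly_windows) and the base name 'production_people_' are assumptions made for this illustration.

```ruby
require 'date'

# Builds the per-year index name, e.g. "production_people_2024".
def dated_index_name(base, date)
  "#{base}#{date.year}"
end

# Replaces everything from the first digit onward with "*",
# giving a wildcard pattern that matches every yearly index.
def search_pattern(base, date)
  dated_index_name(base, date).sub(/\d.*/, '*')
end

# Returns one [start, end) pair per calendar year between the two dates,
# similar in spirit to the windows yielded by reload_indices_iterator.
def yearly_windows(first_date, last_date)
  (first_date.year..last_date.year).map do |year|
    [Date.new(year, 1, 1), Date.new(year + 1, 1, 1)]
  end
end
```

For example, search_pattern('production_people_', Date.new(2024, 5, 1)) gives "production_people_*". One thing to watch: because the substitution starts at the first digit, a base name that itself contains a digit (such as 'production_v2_people_') would be cut short to "production_v*", so this pattern assumes digit-free base names.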
Now let's add this Yearly strategy to our model:
class Person
include Searchable
register_strategy Simple do
set_mappings(dynamic: 'true') do
# your mappings for this class
end
def as_indexed_json(record)
# customize serialization for this class
super.merge(record.as_json)
end
end
register_strategy Yearly do
set_mappings(dynamic: 'true') do
# your mappings for this class
end
def as_indexed_json(record)
# customize serialization for this class
super.merge(record.as_json)
end
end
end
location_response = ElasticsearchRepositories.search(
{query: {match_all: {}}}, #elasticsearch query
[Person].map(&:default_indexing_strategy) #array of strategies to search
)
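How the gem resolves the indices for a multi-strategy search is not shown here. One plausible approach, sketched below as an assumption, relies on the fact that Elasticsearch accepts a comma-separated list of index names (wildcards included) as a search target. StrategyStub and combined_search_target are hypothetical names used only for this illustration.

```ruby
# Hypothetical stand-in for a registered strategy: it only exposes
# the search_index_name that a real strategy would compute.
StrategyStub = Struct.new(:search_index_name)

# Joins the search index names of several strategies into a single
# comma-separated target, dropping duplicates.
def combined_search_target(strategies)
  strategies.map(&:search_index_name).uniq.join(',')
end
```

Given one strategy searching 'production_people_*' and another searching 'production_locations_', the helper returns 'production_people_*,production_locations_'.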
You can register callbacks, powered by ActiveSupport::Callbacks, in an initializer:
ElasticsearchRepositories::SearchRequest.set_callback :execute, :before do |search_request|
# do something
end
Instead of overriding methods with def target_index_name ... end, you can return a hash from the configure class method. This simply makes it clear which methods are being overridden and which are custom.
class Simple < ElasticsearchRepositories::BaseStrategy
configure {
target_index_name: ->(record) {
build_dated_index_name(record.created_at)
},
search_index_name: ->(record) {
build_dated_index_name(Time.now).sub(/\d.*/, '*')
},
current_index_name: ->(record) {
build_dated_index_name(Time.now)
}
}
def custom_method
end
end
class YourModel
register_strategy Searchable::Strategies::Simple do
set_mappings({dynamic: 'false'}) do
...
end
configure_instance {
as_indexed_json: ->(record) {
super(record)
},
...
}
end
end
After checking out the repo, run bin/setup to install dependencies. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/elasticsearch_model_repositories.