This will be a short entry, but it's something I haven't seen concisely covered elsewhere. I often use Elasticsearch for search and analytics in my Rails apps, but testing with it is tricky. The problem is that Elasticsearch, by default, only refreshes an index automatically about once per second, meaning a document sent to Elasticsearch for indexing might not be immediately searchable. This is particularly an issue in cases like the following:
# REALLY BAD
# Using FactoryGirl, RSpec, and elasticsearch-model (https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model)
user = create(:user) # Create user with FactoryGirl
user.__elasticsearch__.index_document # Can take up to 1 second to be searchable due to automatic index refreshing
expect(User.search.results.map(&:_id)).to include(user.id.to_s) # Race condition because of 1-second gap
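To make the race concrete, here is a toy in-memory sketch of the behavior described above. ToyIndex and its methods are hypothetical names invented for illustration, not part of Elasticsearch or elasticsearch-model; the only thing it models is that an indexed document stays invisible to search until a refresh happens.

```ruby
# Toy model only: ToyIndex is a hypothetical stand-in, not an Elasticsearch API.
class ToyIndex
  def initialize
    @pending = []     # documents that were indexed but are not yet visible to search
    @searchable = []  # documents that search can currently see
  end

  # Accepts a document id without making it searchable yet.
  def index_document(id)
    @pending << id
  end

  # Makes everything indexed so far visible to search.
  def refresh!
    @searchable.concat(@pending)
    @pending.clear
  end

  # Returns a copy of the ids that search can currently see.
  def search_ids
    @searchable.dup
  end
end

index = ToyIndex.new
index.index_document("1")
before_refresh = index.search_ids # the new document is not visible yet
index.refresh!
after_refresh = index.search_ids  # the new document is visible now
```

In the real test, the refresh is triggered by a timer rather than by an explicit call, so the assertion passes only when it happens to run after the timer fires.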
It's normally not an issue if a document is unavailable in search for a second or two, but in a test suite it causes tests to fail sporadically. The naive fix (which I used before I fully understood the problem) is the following:
# BAD
user = create(:user)
user.__elasticsearch__.index_document
sleep(1.2) # Sleep slightly longer than a second to ensure that Elasticsearch has time to refresh index
expect(User.search.results.map(&:_id)).to include(user.id.to_s)
The problem with this is that it drastically slows down any test that relies on search, and it can still result in failed tests when Elasticsearch takes a while to refresh the index. Really, you should never need to call sleep(), but my initial hunt for a better solution came up empty. After more digging, I came across a much better approach using the refresh_index! method:
# BETTER
user = create(:user)
user.__elasticsearch__.index_document
User.__elasticsearch__.refresh_index! # Manually refresh the index instead of waiting
expect(User.search.results.map(&:_id)).to include(user.id.to_s)
This makes testing with Elasticsearch faster and more reliable. I initially expected the refresh_index! method to be asynchronous as well, but I have yet to have a test fail because of it. No need to ever rely on sleep() again!
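If many specs need the index-then-refresh pair, it can be wrapped in a small helper. This is only a sketch: SearchSpecHelper and index_and_refresh are hypothetical names of my own choosing, and the helper assumes every record's class includes Elasticsearch::Model so that __elasticsearch__ is available on both the instance and the class.

```ruby
# Hypothetical spec helper; these names are not part of elasticsearch-model.
module SearchSpecHelper
  # Indexes each record, then refreshes the index of each distinct model class
  # once, so the records are searchable when the method returns.
  def index_and_refresh(*records)
    records.each { |record| record.__elasticsearch__.index_document }
    records.map(&:class).uniq.each { |klass| klass.__elasticsearch__.refresh_index! }
    records
  end
end
```

With the module included in the example groups (for instance via config.include SearchSpecHelper in RSpec.configure), a spec would call index_and_refresh(user) before asserting on search results. Refreshing once per class rather than once per record keeps the number of refresh calls down when a test creates several documents.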