Apache Solr Distributed Facets

Apache Solr distributed faceting feature has been introduced back in 2008 with the first versions of Solr (1.3 according to this jira[1]) .
Until now, I always assumed it just worked, without diving too much into the details.
Nowadays distributed search and faceting are extremely popular, you can find them pretty much everywhere (in the legacy or SolrCloud form alike).
N.B. Although the mechanics are pretty much the same, Json faceting revisits this approach with some change, so we will now focus on legacy field faceting.

I think it’s time to get a better understanding of how it works:

Multiple Shard Requests

When dealing with distributed search and distributed aggregation calculations, you are going to see multiple requests going back and forth across the shards.
They have different focus and are meant to retrieve the different bits of information necessary to build the final response.
We are going to explore the different rounds of requests, focusing just for the faceting purpose.
N.B. Some of these requests are also carrying results for the distributed search calculation, this is used to minimise the network traffic.

For the sake of this blog let’s simulate a simple sharded index, white space tokenization on field1 and facet.field=field1

Shard 1 Shard 2
Doc0
{  “id”:”1”,
“field1”:”a b”
}
Doc3
{  “id”:”4”,
“field1”:”b c”
}
Doc1
{  “id”:”2”,
“field1”:”a”
}
Doc4
{  “id”:”5”,
“field1”:”b c”
}
Doc2
{  “id”:”3”,
“field1”:”b c”
}
Doc53
{  “id”:”6”,
“field1”:”c”
}

Global Facets : b(4), c(4), a(2)

Shard 1 Local Facets : a(2), b(2), c(1)

Shard 2 Local Facets : c(3), b(2)

Collection of Candidate Facet Field Values

The first round of requests is sent to each shard to identify the candidate top K global facet values.
To achieve this target each shard will be requested to respond with its local top K+J facet values and counts.
The reason we actually ask for more facets from each shard is to have a better term coverage, to avoid losing relevant facet values and to minimise the refinement requests.
How many more we request from each shard is regulated by the “overrequest” facet parameter, a factor that gives more accurate facets at the cost of additional computations[2].
Let’s assume we configure a facet.limit=2&facet.overrequest.count=0&facet.overrequest.ratio=1 to explain when refinement happens and how it works.

Shard 1 Returned Facets : a(2), b(2)

Shard 2 Returned Facets : c(3), b(2)

Global Merge of Collected Counts

The facet value counts collected from each shard are merged and the most occurring global top K is calculated.
These facet field values are the first candidates to be the final ones.
In addition to that, other candidates are extracted from the terms below the top K, based on the shards that didn’t return those values statistics.
At this point we have a candidate set of values and we are ready to refine their counts where necessary, asking back this information to the shards that didn’t include that in the first round.
This happens including the following specific facet parameter to the following refinement requests:

{!terms=$<field>__terms}<field>&<field>__terms=<values>
e.g.
{!terms=$field1__terms}field1&field1__terms=term1,term2

N.B. This request is specifically asking a Solr instance to return back the facet counts just for the terms specified[3]

Top 2 candidates = b(4), c(3)
Additional candidates = a(2)

The reason that a(2) is added to the potential candidates is because Shard 2 didn’t answer with a count for a, the potential missing count of 1 could bring a to the top K. So it is worth a verification.

Shard 1 didn’t return any value for the candidate c facet.
So the following request is built and sent to it:
facet.field={!terms=$field1__terms}field1&field1__terms=c

Shard 2 didn’t return any value for the candidate a facet.
So the following request is built and sent to it:
facet.field={!terms=$field1__terms}field1&field1__terms=a

Final Counts Refinement

The refinements counts returned by each shard can be used to finalise the global candidate facet values counts and to identify the final top K to be returned by the distributed request.
We are finally done!

Shard 1 Refinements Facets : c(1)

Shard 2 Refinements Facets : a(0)

Top K candidates updatedb(4), c(4), a(2)

GIven a facet.limit=2 the final global facets with correct results returned is :
b(4), c(4)

 

[1] https://issues.apache.org/jira/browse/SOLR-303

[2] https://lucene.apache.org/solr/guide/6_6/faceting.html#Faceting-Over-RequestParameters

[3] https://lucene.apache.org/solr/guide/7_5/faceting.html#limiting-facet-with-certain-terms

SolrCloud exceptions with Apache Zookeeper

At the time we speak ( Solr 7.3.1 ) SolrCloud is a reliable and stable distributed architecture for Apache Solr.
But it is not perfect and failures happen.

Apache Zookeeper[1] is the system responsible of managing the communications across the SolrCloud cluster.
It contains the shared collections configurations and it has the view of the cluster status.
It is part of the brain of the cluster, a keeper that maintains the cluster healthy and functional.

It is able to answer questions such as :

• Who is the leader for this shard and collection?
• Is this node down ?
• Is this node recovering ?

The Solr nodes communicate with Zookeeper to understand who to contact when running SolrCloud operations.

This lightening blog post will present some practical tips to follow when your client application encounters some classic exceptions dealing with SolrCloud and Apache Zookeeper.
Special thanks to the Apache Solr user mailing list contributors and the Apache Solr community, this post is an aggregation of recommendations from there and from official code and documentation.

org.apache.solr.common.SolrException: Could not load collection from ZK: <collection name>

If you landed here with just that Exception I assume there is a missing :
“ Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /collections/<collection name>/state.json “ ?

Solr’s zkClientTimeout is used to set ZooKeeper’s sessionTimeout, and that’s what is exceeded when a session expires.
When this kind of exception happens, it means something has gone VERY wrong in the Solr-Zookeeper communication, 30 seconds ( the current default[2] ) is a REALLY long time when applications are trying to communicate.

Recommendation : take care of the different time outs around, don’t keep them too small !
i.e. for zkClientTimeout assign a value >= 30 seconds.
maxSessionTimeout (Zookeeper)
New in 3.3.0: the maximum session timeout in milliseconds that the server will allow the client to negotiate. Defaults to 20 times the tickTime.

zkClientTimeout (Solr)
Controls your client timeout.

Once checked the time outs, let’s explore some possible root causes.
A session expiry can be caused by:
1. Garbage collection on Solr node/Zookeeper – extreme GC pauses can happen with the heap being too small or VERY large
2. Slow IO on disk.
3. Network latency

Recommendations

  1. set up a JVM profiler to monitor closely your Solr and Zookeeper nodes, take particular attention to the garbage collection cycles and the memory usage in general : you don’t want Zk to swap too much! ( GCViewer[3] could be a nice tool for this)
  2. Verify that the Zookeeper node has a fast writing access to the disk : Zookeeper needs fast writes and ideally a separate disk allocated.
  3. Monitor your network and make sure the the solr nodes can talk effectively to the Zookeeper nodes

In case the suggestions are not solving your problem, you may be experiencing a Solr bug.
One of them is[4] which unfortunately has not been fixed yet.

org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this

From the official JavaDoc :

org/apache/solr/client/solrj/impl/LBHttpSolrClient.java:369
“Tries to query a live server from the list provided in Req. Servers in the dead pool are skipped.
* If a request fails due to an IOException, the server is moved to the dead pool for a certain period of
* time, or until a test request on that server succeeds.
*
* Servers are queried in the exact order given (except servers currently in the dead pool are skipped).
* If no live servers from the provided list remain to be tried, a number of previously skipped dead servers will be tried.
* Req.getNumDeadServersToTry() controls how many dead servers will be tried.
*
* If no live servers are found a SolrServerException is thrown.”

What was the status of the cluster at the moment the exception happened ?
Was any Solr server UP and running according to Zookeeper knowledge ?

The recommendation is to check the clusterstate.json when the exception happens.
From the Solr admin UI you can open Cloud->Tree and verify which nodes are up and running.

It could be very much related a node failure ( that could be related to any possible reason including GC)
I’ve seen situations where it was caused by a specific query, the real exception got hidden by a “No live SolrServers…” client exception.
Solr logs should help to identify the inner Solr problem and JVM monitoring could discard any memory/gc problem.
Some people saw this with wildcard queries (when every shard reported a “too many expansions…”
type error, but the exception in the client response was “No live SolrServers…”.

org.apache.solr.common.SolrException: Could not find a healthy node to handle the request

Pretty much same considerations as the “No Live Solr Server”.
This happens when the load balancer SolrJ side is unable to retrieve an alive node, from the cluster ( based on Zookeeper state).
This happens before the previous exception, so the request doesn’t even reach the LoadBalancinghttpSolrClient.

[1] https://zookeeper.apache.org
[2] SOLR-5565
[3] https://github.com/chewiebug/GCViewer
[4] SOLR-8868

SolrCloud Leader Election Failing

At the time we speak ( Solr 7.3.0 ) SolrCloud is a reliable and stable distributed architecture for Apache Solr.
But it is not perfect and failures happen.
This lightening blog post will present some practical tips to follow when a specific shard of a collection is down with no leader and the situation is stuck.
The following problem has been experienced with the following Solr versions :

  • 4.10.2
  • 5.4.0

Steps to solve the problem may involve manual interaction with the Zookeeper Ensemble[1].
The following steps are extracted from an interesting thread of the Solr User mailing list[2] and practical experience on the field.
In particular, thanks to Jeff Wartes for the suggestions, that proved useful for me in a couple of occasions.

Problem

  • All nodes for a Shard in a Collection are up and running
  • There is no leader for the shard
  • All the nodes are in a “Recovering” / “Recovery Failed” state
  • Search is down and the situation persist after many minutes (> 5)

Solution

A possible explanation for this problem to occur is when the node-local version of the Zookeeper clusterstate has diverged from the centralized Zookeeper cluster state.
One possible cause for the leader election to break is a Zookeeper failure : for example you lose >=50% of the ensemble nodes or the connectivity among the ensemble nodes for a certain period of time ( this is the scenario I experimented directly)
This failure, even if resolved later, can bring a corruption to the Zookeeper file system.
Some of the SolrCloud collections may remain in a not consistent status.

It may be necessary to manually delete corrupted files from Zookeeper :
Let’s start from :

collections/<collection>/leader_elect/shard<x>/election
An healthy SolrCloud cluster presents as many core_nodeX as the total replicas for the shard.
You don’t want duplicates or missing nodes here.
If you’re having trouble getting a sane election, you can try deleting the lowest-numbered entries (as well as any lower-numbered duplicates) and try to foce the election again. Possibly followed by restarting the node with that lowest-numbered entry.

collections/<collection>/leader/shard<x>
Make sure that this folder exists and has the expected replica as a leader.

collections/<collection>/leader_initiated_recovery
This folder can be informative too, this represents replicas that the *leader* thinks are out of sync, usually due to a failed update request.

After having completed the verification above, there a couple of Collection API endpoints that may be useful :

Force Leader Election
/admin/collections?action=FORCELEADER&collection=<collectionName>&shard=<shardName>

Force Leader Rebalance
/admin/collections?action=REBALANCELEADERS&collection=collectionName

N.B. rebalancing all the leader will affect all the shards

 

[1] Apache Zookeeper Solr Cli

[2] Solr Mailing List Thread

[3] Solr Collection API

 

Distributed Search Tips for Apache Solr

Distributed search is the foundation for Apache Solr Scalability :

It’s possible to distributed search across different Apache Solr nodes of the same collection ( both in a  legacy[1] or SolrCloud[2] architecture), but it is also possible to distribute search across different collections in a SolrCloud cluster.
Aggregating results from different collections may be useful when you put in place different systems ( that were meant to be separate ) and you later realize that aggregating the results may be an additional useful use case.
This blog will focus on some tricky situations that can happen when running distributed search ( for configuration or details you can refer to the Solr wiki ).

IDF

Inverse Document Frequency affects the score.
This means that a document coming from a big collection can obtain a boost from IDF, in comparison to a similar document from a smaller collection.
This is because the maxDoc count is taken into account as corpus size, so even if a term has the same document frequency, IDF will be strongly affected by the collection size.
Distributed IDF[3] partially solved the problem :

When distributing the search across different shards of the same collection, it works quite well.
But using the ExactStatCache and alternating single collection distribution and multi collection distribution in the same SolrCloud cluster will create some caching conflict.

Specifically if we first execute the inter collection query, the global stats cached will be the inter collection global stats,  so if we then execute a single collection distributed search, the preview global stats will remain cached ( viceversa applies).

Debug Scoring

Real score and debug score is not aligned with the distributed IDF, this means that the debug query will not show the correct distributed IDF and correct scoring calculus for distributed searches[4].

Relevancy tuning

Lucene/Solr score is not probabilistic or normalised.
For the same collection we can have completely different score scales just with different queries.
The situation becomes more complicated when we tune our relevancy adding multiplicative or additive boosts.
Different collections may imply completely different boosting logic that could cause the score of a collection to be on a completely different scale in comparison to another.
We need to be extra careful when tuning relevancy for searches across different collections and try to configure the distributed request handler in the most compatible way as possible.

Request handler

It is important to carefully specify the request handler to be used when using distributed search.
The request will hit one collection in one node and then when distributing the same request handler will be called on the other collections across the other nodes.
If necessary it is possible to configure the aggregator request handler and local request handlers ( this may be useful if we want to use a different scoring formulas per collection, using local parameters) :

Aggregator Request Handler

It is executed on the first node receiving the request.
It will distribute the request and then aggregate the results.
It must describe parameters that are in common across all the collections involved in the search.
It is the one specified in the main request.
e.g.
http://localhost:8983/solr/collection1/select?

Local Request Handler

It is specified passing the parameter : shards.qt=
It is execute on each node that receive the distributed query AFTER the first one.
This can be used to use specific fields or parameter on a per collection basis.
A local request handler may use fields and search components that are local to the collection interested.

e.g.
http://localhost:8983/solr/collection1/select?q=*:*&collection=collection1,collection2&shards.qt=localSelect

N.B. the use of local request handler may be useful in case you want to define local query parser rules, such as local edismax configuration to affect the score.

Unique Key

The unique key field must be the same across the different collections.
Furthermore the value should be unique across the different collections to guarantee a proper behaviour.
If we don’t comply with this rule, Solr will fail in aggregating the results and raise an exception.

[1] https://lucene.apache.org/solr/guide/6_6/distributed-search-with-index-sharding.html
[2] https://lucene.apache.org/solr/guide/6_6/solrcloud.html
[3] https://lucene.apache.org/solr/guide/6_6/distributed-requests.html#DistributedRequests-ConfiguringstatsCache_DistributedIDF_
[4] https://issues.apache.org/jira/browse/SOLR-7759