Inverse Document Frequency (IDF) affects the score.
This means that a document coming from a big collection can obtain a boost from IDF, in comparison to a similar document from a smaller collection.
This is because the maxDoc count is taken into account as corpus size, so even if a term has the same document frequency, IDF will be strongly affected by the collection size.
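To make the effect concrete, here is a minimal sketch that assumes a classic TF-IDF style formula, 1 + ln((maxDoc + 1) / (docFreq + 1)). The similarity actually configured in your Solr version may use a different formula (BM25, for example), and the collection sizes and document frequency below are made-up numbers, but the trend is the same: with an identical document frequency, the bigger collection yields the higher IDF.

```python
import math


def classic_idf(doc_freq: int, max_doc: int) -> float:
    """Classic TF-IDF style IDF (assumed formula): 1 + ln((max_doc + 1) / (doc_freq + 1))."""
    return 1.0 + math.log((max_doc + 1) / (doc_freq + 1))


# Same term, same document frequency, two hypothetical collections.
doc_freq = 50
small_idf = classic_idf(doc_freq, max_doc=1_000)
big_idf = classic_idf(doc_freq, max_doc=1_000_000)

print(f"small collection IDF: {small_idf:.3f}")
print(f"big collection IDF:   {big_idf:.3f}")
```

The only input that changes between the two calls is the corpus size, so any difference in score between otherwise similar documents comes from the collection they live in.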
Distributed IDF [3] partially solved the problem:
When distributing the search across different shards of the same collection, it works quite well.
But using the ExactStatsCache and alternating single-collection and multi-collection distributed searches in the same SolrCloud cluster will create caching conflicts.
Specifically, if we first execute the inter-collection query, the cached global stats will be the inter-collection ones, so if we then execute a single-collection distributed search, the previous global stats will remain cached (and vice versa).
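The conflict can be illustrated with a toy model. This is not Solr's actual implementation: `ToyGlobalStatsCache`, the shard dictionaries, and all the numbers are hypothetical. The sketch only assumes that the cached global stats are looked up by term, without taking into account which collections took part in the request.

```python
class ToyGlobalStatsCache:
    """Toy cache keyed by term only; it ignores which collections were queried."""

    def __init__(self):
        self._stats = {}

    def get_or_compute(self, term, shards):
        # Global stats are computed once per term and then reused as-is.
        if term not in self._stats:
            doc_freq = sum(shard["doc_freq"].get(term, 0) for shard in shards)
            max_doc = sum(shard["max_doc"] for shard in shards)
            self._stats[term] = (doc_freq, max_doc)
        return self._stats[term]


# Hypothetical shards: collection A has two small shards, collection B one big shard.
a1 = {"max_doc": 1_000, "doc_freq": {"solr": 10}}
a2 = {"max_doc": 1_000, "doc_freq": {"solr": 10}}
b1 = {"max_doc": 100_000, "doc_freq": {"solr": 10}}

cache = ToyGlobalStatsCache()

# 1) Inter-collection query across A and B populates the cache.
inter_stats = cache.get_or_compute("solr", [a1, a2, b1])

# 2) Single-collection distributed query on A alone hits the same cache entry.
single_stats = cache.get_or_compute("solr", [a1, a2])

# What the single-collection query should have used.
expected_single_stats = (20, 2_000)

print("inter-collection stats:", inter_stats)
print("single-collection stats served from cache:", single_stats)
print("single-collection stats expected:", expected_single_stats)
```

Running the two queries in the opposite order would cache the single-collection stats first, and the inter-collection query would then be scored with those instead.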