Normalization

ranx provides several normalization strategies for result lists, to be used in conjunction with the fusion methods. Normalization transforms the scores of a result list into new values so that they are comparable with those of other normalized result lists, which is required for many of the provided fusion methods to work correctly. The normalization strategy applied before fusion is set through the norm parameter of the functions fuse and optimize_fusion (defaults to min-max).

Normalization Strategies	Alias
Min-Max Norm	min-max
Min-Max Inverted Norm	min-max-inverted
Max Norm	max
Sum Norm	sum
ZMUV Norm	zmuv
Rank Norm	rank
Borda Norm	borda

Min-Max Norm

Min-Max Norm scales the scores (s) of a result list between 0 and 1, scaling to 0 the minimum score (\(s_{min}\)) and 1 the maximum score (\(s_{max}\)).

\[ \operatorname{MinMaxNorm(s)}=\frac{s - s_{min}}{s_{max} - s_{min}} \]
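The formula above can be sketched in plain Python as follows. This is an illustrative sketch, not ranx's implementation: the function name `min_max_norm`, the dictionary layout (document id to score), and the handling of a list whose scores are all equal are assumptions made for the example.

```python
def min_max_norm(scores: dict[str, float]) -> dict[str, float]:
    """Scale scores so the minimum maps to 0 and the maximum maps to 1."""
    s_min = min(scores.values())
    s_max = max(scores.values())
    span = s_max - s_min
    if span == 0:
        # Assumption of this sketch: when all scores are equal the formula
        # is undefined, so every score is mapped to 0.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - s_min) / span for doc_id, s in scores.items()}


run = {"d1": 2.0, "d2": 4.0, "d3": 6.0}
print(min_max_norm(run))  # {'d1': 0.0, 'd2': 0.5, 'd3': 1.0}
```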

Min-Max Inverted Norm

Min-Max Inverted Norm scales the scores (s) of a result list between 0 and 1, scaling to 1 the minimum score (\(s_{min}\)) and 0 the maximum score (\(s_{max}\)). It is handy when distance metrics are used to compute relevance scores, i.e. when lower scores indicate higher relevance.

\[ \operatorname{MinMaxInvertedNorm(s)}=\frac{s_{max} - s}{s_{max} - s_{min}} \]
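As a hedged illustration of the distance use case, the sketch below applies the formula to a list of distances, so that the closest document ends up with the highest normalized score. The name `min_max_inverted_norm` and the treatment of constant scores are assumptions of the example, not part of ranx's API.

```python
def min_max_inverted_norm(scores: dict[str, float]) -> dict[str, float]:
    """Scale scores so the minimum maps to 1 and the maximum maps to 0."""
    s_min = min(scores.values())
    s_max = max(scores.values())
    span = s_max - s_min
    if span == 0:
        # Assumption of this sketch: constant scores are all mapped to 0.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s_max - s) / span for doc_id, s in scores.items()}


# Distances: lower means more relevant.
distances = {"d1": 1.0, "d2": 2.0, "d3": 5.0}
print(min_max_inverted_norm(distances))  # {'d1': 1.0, 'd2': 0.75, 'd3': 0.0}
```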

Max Norm

Max Norm scales the scores (s) of a result list so that the maximum score (\(s_{max}\)) is scaled to 1.

\[ \operatorname{MaxNorm(s)}=\frac{s}{s_{max}} \]
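A minimal sketch of this division in Python is shown below. The function name `max_norm` is hypothetical, and raising an error when the maximum score is 0 is a choice made for this example only; how ranx treats that case is not covered here.

```python
def max_norm(scores: dict[str, float]) -> dict[str, float]:
    """Divide every score by the maximum score, so the maximum maps to 1."""
    s_max = max(scores.values())
    if s_max == 0:
        # Assumption of this sketch: refuse to divide by zero.
        raise ValueError("maximum score is 0, Max Norm is undefined")
    return {doc_id: s / s_max for doc_id, s in scores.items()}


run = {"d1": 1.0, "d2": 2.0, "d3": 4.0}
print(max_norm(run))  # {'d1': 0.25, 'd2': 0.5, 'd3': 1.0}
```

Unlike Min-Max Norm, the minimum score is not forced to 0 here: it keeps its ratio to the maximum.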

Sum Norm

Sum Norm shifts the scores so that the minimum score (\(s_{min}\)) becomes 0 and rescales them so that they sum to 1. It is computed as follows:

\[ \operatorname{SumNorm(s)}=\frac{s - s_{min}}{\sum_{s'}{(s' - s_{min})}} \]
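The two steps (shift by the minimum, then divide by the sum of the shifted scores) can be sketched as below. The name `sum_norm` and the fallback for a list of identical scores are assumptions of this example rather than documented ranx behavior.

```python
def sum_norm(scores: dict[str, float]) -> dict[str, float]:
    """Shift scores so the minimum is 0, then rescale so they sum to 1."""
    s_min = min(scores.values())
    shifted = {doc_id: s - s_min for doc_id, s in scores.items()}
    total = sum(shifted.values())
    if total == 0:
        # Assumption of this sketch: all scores equal, so map them all to 0.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: s / total for doc_id, s in shifted.items()}


run = {"d1": 1.0, "d2": 2.0, "d3": 4.0}
print(sum_norm(run))  # {'d1': 0.0, 'd2': 0.25, 'd3': 0.75}
```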

ZMUV Norm

ZMUV Norm (zero-mean, unit-variance) scales the scores so that their mean (\(s_{mean}\)) becomes 0 and their variance becomes 1, by subtracting the mean and dividing by the standard deviation (\(s_{std}\)).

\[ \operatorname{ZMUVNorm(s)}=\frac{s - s_{mean}}{s_{std}} \]
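The sketch below standardizes a result list with Python's `statistics` module. It assumes the population standard deviation (`statistics.pstdev`); whether ranx uses the population or the sample estimate is not stated here, so treat that choice, the name `zmuv_norm`, and the constant-scores fallback as assumptions.

```python
import math
import statistics


def zmuv_norm(scores: dict[str, float]) -> dict[str, float]:
    """Standardize scores to zero mean and unit variance."""
    values = list(scores.values())
    s_mean = statistics.fmean(values)
    # Assumption of this sketch: population standard deviation.
    s_std = statistics.pstdev(values)
    if s_std == 0:
        # Assumption of this sketch: constant scores are all mapped to 0.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - s_mean) / s_std for doc_id, s in scores.items()}


normalized = zmuv_norm({"d1": 1.0, "d2": 2.0, "d3": 3.0})
# The normalized scores have mean 0 and variance 1, up to rounding error.
print(math.isclose(statistics.fmean(normalized.values()), 0.0, abs_tol=1e-9))
print(math.isclose(statistics.pvariance(normalized.values()), 1.0, abs_tol=1e-9))
```

Note that, unlike the other strategies on this page, the normalized scores are not confined to the interval between 0 and 1 and can be negative.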

Rank Norm

Rank Norm transforms the scores according to the position in the ranking of the results they are associated with. In this case, the normalized scores are uniformly distributed. The top-ranked result gets a score of 1, while the bottom-ranked result gets a score of \(\frac{1}{|r|}\), where \(|r|\) is the number of results in the ranked list and \(r_i\) is the 1-based rank position of the \(i\)-th result.

\[ \operatorname{RankNorm(s_i)}=1-\frac{r_i - 1}{|r|} \]
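A small sketch of the rank-based transformation follows. The function name `rank_norm` is hypothetical, and ties are broken by input order here (Python's sort is stable), which is an assumption of the example rather than a statement about ranx.

```python
def rank_norm(scores: dict[str, float]) -> dict[str, float]:
    """Replace each score with 1 - (rank - 1) / |r|, using 1-based ranks."""
    # Highest score first; ties keep their input order (assumption).
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    return {
        doc_id: 1 - (rank - 1) / n
        for rank, doc_id in enumerate(ranked, start=1)
    }


run = {"d1": 0.3, "d2": 9.1, "d3": 4.2, "d4": 1.7}
print(rank_norm(run))  # {'d2': 1.0, 'd3': 0.75, 'd4': 0.5, 'd1': 0.25}
```

With four results, the bottom-ranked one gets 1/4, matching the \(\frac{1}{|r|}\) value given above.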

Borda Norm

Borda Norm transforms the scores in a manner similar to how BordaFuse assigns points to the results before fusing multiple runs. Borda Norm is defined as follows:

\[ \operatorname{BordaNorm(s_i)}= \begin{cases} 1 - \frac{r_i - 1}{|candidates|} & \mathit{if}\ d \in r \\ \frac{1}{2} - \frac{|r|-1}{2 \cdot |candidates|} & \mathit{otherwise} \end{cases} \]
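The two cases can be illustrated with the sketch below. It assumes that `candidates` is the set of all documents retrieved for the query by any of the runs being fused, and that documents missing from the current list all receive the second-case value. The name `borda_norm`, its signature, and the choice to return a score for every candidate are illustrative assumptions, not ranx's API.

```python
def borda_norm(scores: dict[str, float], candidates: set[str]) -> dict[str, float]:
    """Assign Borda-style points to ranked and unranked candidates."""
    # Highest score first; rank positions are 1-based.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_candidates = len(candidates)
    normalized = {
        doc_id: 1 - (rank - 1) / n_candidates
        for rank, doc_id in enumerate(ranked, start=1)
    }
    # Documents in the candidate set but absent from this list share one value.
    missing_score = 0.5 - (len(ranked) - 1) / (2 * n_candidates)
    for doc_id in sorted(candidates - set(scores)):
        normalized[doc_id] = missing_score
    return normalized


run = {"d1": 9.0, "d2": 7.0}
candidates = {"d1", "d2", "d3", "d4"}  # assumed union over all runs
print(borda_norm(run, candidates))
# {'d1': 1.0, 'd2': 0.75, 'd3': 0.375, 'd4': 0.375}
```

In this example the value for missing documents, 0.375, is the average of the points left unassigned (0.5 and 0.25), which is one way to read the second case of the formula.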

Please refer to Renda et al. for further details.