Evaluate
evaluate(qrels, run, metrics, return_mean=True, return_std=False, threads=0, save_results_in_run=True, make_comparable=False)
Compute the performance scores for the provided qrels and run for all the specified metrics.
Usage examples:
from ranx import evaluate
Compute score for a single metric
evaluate(qrels, run, "ndcg@5")
0.7861
Compute scores for multiple metrics at once
evaluate(qrels, run, ["map@5", "mrr"])
{"map@5": 0.6416, "mrr": 0.75}
Computed metric scores are saved in the Run object
run.mean_scores
{"ndcg@5": 0.7861, "map@5": 0.6416, "mrr": 0.75}
Access scores for each query
dict(run.scores)
{
    "ndcg@5": {"q_1": 0.9430, "q_2": 0.6292},
    "map@5": {"q_1": 0.8333, "q_2": 0.4500},
    "mrr": {"q_1": 1.0000, "q_2": 0.5000},
}
Args:
    qrels (Union[Qrels, Dict[str, Dict[str, Number]], nb.typed.typedlist.List, np.ndarray]): Qrels.
    run (Union[Run, Dict[str, Dict[str, Number]], nb.typed.typedlist.List, np.ndarray]): Run.
    metrics (Union[List[str], str]): Metric or list of metrics to compute.
    return_mean (bool, optional): Whether to return the metric scores averaged over the query set or the scores for individual queries. Defaults to True.
    threads (int, optional): Number of threads to use; zero means all the available threads. Defaults to 0.
    save_results_in_run (bool, optional): Whether to save the metric scores for each query in the input run. Defaults to True.
    make_comparable (bool, optional): Whether to add empty results for queries missing from the run and remove queries that do not appear in qrels. Defaults to False.
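The effect of make_comparable can be pictured with plain dictionaries. The sketch below is an illustration only, not ranx's implementation: make_comparable_sketch is a hypothetical helper, and the query and document ids are made up for the example.

```python
from typing import Dict

Results = Dict[str, Dict[str, float]]


def make_comparable_sketch(qrels: Results, run: Results) -> Results:
    """Hypothetical helper (not ranx code): align a run with the qrels."""
    # Iterating over the qrels drops run queries that the qrels do not contain;
    # run.get(..., {}) gives an empty result to queries missing from the run.
    return {q_id: dict(run.get(q_id, {})) for q_id in qrels}


# Made-up data: q_2 is missing from the run, q_3 is not in the qrels.
qrels = {"q_1": {"d_1": 1}, "q_2": {"d_2": 1}}
run = {"q_1": {"d_1": 0.9}, "q_3": {"d_5": 0.4}}

aligned = make_comparable_sketch(qrels, run)
print(aligned)  # {'q_1': {'d_1': 0.9}, 'q_2': {}}
```

After alignment, both inputs cover the same query set, so averaged scores are computed over the same queries.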
Returns:
Type | Description |
---|---|
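Union[Dict[str, float], float] | Results. As in the usage examples above, a single float when metrics is a string, or a dict mapping each metric name to its score when metrics is a list. |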
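The mean scores in the usage examples are the averages of the per-query scores. A minimal sketch of that relationship, assuming plain dicts of per-query scores copied from the example above; summarize is a hypothetical helper, not part of ranx, and only mimics the float-versus-dict return shape.

```python
from statistics import mean
from typing import Dict, List, Union

# Per-query scores copied from the usage example above.
per_query = {
    "ndcg@5": {"q_1": 0.9430, "q_2": 0.6292},
    "mrr": {"q_1": 1.0, "q_2": 0.5},
}


def summarize(
    scores: Dict[str, Dict[str, float]], metrics: Union[str, List[str]]
) -> Union[float, Dict[str, float]]:
    """Hypothetical helper (not ranx code): average per-query scores."""
    if isinstance(metrics, str):
        # A single metric name yields a bare float.
        return mean(scores[metrics].values())
    # A list of metric names yields a dict keyed by metric name.
    return {m: mean(scores[m].values()) for m in metrics}


single = summarize(per_query, "ndcg@5")
several = summarize(per_query, ["ndcg@5", "mrr"])
print(round(single, 4))  # 0.7861
print(several["mrr"])    # 0.75
```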
Source code in ranx/meta/evaluate.py