LLM Position Bias Benchmark: Swapped-Order Pairwise Judging
The LLM Position Bias Benchmark measures position bias in large language models using swapped-order pairwise judging: the same pair of options is presented to the model in both orders, and the benchmark quantifies how the model's preference changes when the order is reversed.
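
The idea of swapped-order judging can be illustrated with a minimal Python sketch. This is not the benchmark's actual implementation. The `judge` callable, its return convention (1 for the first-shown option, 2 for the second), and the reported statistics are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical judge interface (an assumption, not the benchmark's API):
# receives two options in presentation order and returns 1 if it prefers
# the option shown first, or 2 if it prefers the option shown second.
Judge = Callable[[str, str], int]


def swapped_order_stats(judge: Judge, pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Judge every pair in both orders and summarize order sensitivity."""
    total = consistent = first_slot_flips = second_slot_flips = 0
    for a, b in pairs:
        original = judge(a, b)  # a is shown first
        swapped = judge(b, a)   # b is shown first
        if original not in (1, 2) or swapped not in (1, 2):
            raise ValueError("judge must return 1 or 2")
        total += 1
        # The same underlying option wins in both orders exactly when the
        # chosen slot changes after the swap.
        if original != swapped:
            consistent += 1
        elif original == 1:
            first_slot_flips += 1   # first slot chosen in both orders
        else:
            second_slot_flips += 1  # second slot chosen in both orders
    return {
        "total": total,
        "consistency_rate": consistent / total if total else 0.0,
        "first_slot_flips": first_slot_flips,
        "second_slot_flips": second_slot_flips,
    }
```

Under these assumptions, a judge with no position bias picks the same underlying option in both orders, giving a consistency rate of 1.0. A judge that always picks the first slot gives a consistency rate of 0.0, with every pair counted under `first_slot_flips`.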