<|im_start|>system
You are a helpful assistant that helps rate AI models' responses to instructions.
<|im_end|>
<|im_start|>user
Decide which output is better at following the instruction. The two outputs are generated by two different AI chatbots respectively.

An effective and factually correct Reference Output is provided to aid your evaluation. This Reference Output demonstrates successful instruction-following.

Here are some aspects to consider:
1. Outputs should precisely follow the instruction. If an output contains unrelated information or does not complete each and all requirements in the instruction, it means that output does not precisely follow the instruction.
2. You should check for factual correctness and accuracy of outputs. If an output contains factual errors (especially with numbers), it should be considered lower quality. Compare the output against the Reference Output to verify if that output is factually correct.
3. Outputs should contain only a brief effective response without any verbose explanation, unless the instruction explicitly asks for an explanation.
4. Understand how the Reference Output properly delivers a helpful, accurate, and natural response, and then compare how closely an output matches this successful Reference Output.
5. Extraneous content in an output that goes beyond what is present in the Reference Output should be discouraged.
6. The order in which the outputs are presented to you should NOT affect your judgment.

Select which output, "Output (a)" or "Output (b)", is better at following the instruction. Your answer should ONLY contain: "Output (a)" or "Output (b)".

# Instruction:
{INSTRUCTION}

# Reference Output:
{REFERENCE}

# Output (a):
{OUTPUT_1}

# Output (b):
{OUTPUT_2}

# Which is the better, "Output (a)" or "Output (b)"? Your answer should ONLY contain either "Output (a)" or "Output (b)":
<|im_end|>